RunDiffBind performs a high-level differential binding analysis with DiffBind. Together with ProcessDBRs, it forms the crux of the ChIP-seq portion of this package.
RunDiffBind(outpath, samplesheet, txdb, dba = NULL, level = c("Treatment", "Condition", "Tissue", "Factor"), se = NULL, fdr.thresh = 0.05, fc.thresh = 2, block = NULL, heatmap.colors = NULL, heatmap.preset = NULL, reverse = FALSE, n.consensus = 2, breaks = c(seq(-3, -1.0001, length = 250), seq(-1, -0.1, length = 250), seq(-0.0999, 0.0999, length = 1), seq(0.1, 1, length = 250), seq(1.0001, 3, length = 250)), plot.enrich = TRUE, enrich.libs = c("GO_Molecular_Function_2018", "GO_Cellular_Component_2018", "GO_Biological_Process_2018", "KEGG_2019_Human", "Reactome_2016", "BioCarta_2016", "Panther_2016"), promoters = c(-2000, 2000), method = c("DESeq2", "edgeR"), scale.full = TRUE, flank.anno = TRUE, flank.dist = 5000)
outpath | Path to directory to be used for output. Additional directories will be generated within this folder. |
---|---|
samplesheet | Path to samplesheet containing sample metadata. |
txdb |
|
dba | DBA object as returned by |
level | String defining variable of interest from |
se | Path to file containing consensus super enhancers (SEs), which will be used to annotate whether individual peaks fall within an SE. |
fdr.thresh | Number or numeric vector indicating the false discovery rate (FDR) cutoff(s) to be used for determining "significant" differential binding. If multiple are given, multiple tables/plots will be generated using all combinations of |
fc.thresh | Number or numeric vector indicating the log2 fold-change cutoff(s) to be used for determining "significant" differential binding. If multiple are given, multiple tables/plots will be generated using all combinations of |
block | String or character vector defining the column(s) in |
heatmap.colors | Character vector containing custom colors to use for heatmaps in hex (e.g. |
heatmap.preset | String indicating which of the color presets to use in heatmaps. Available presets (low to high) are:
|
reverse | Boolean indicating whether to flip heatmap color scheme (high color will become low, etc). |
n.consensus | Number of samples in which peaks must overlap for the peaks to be merged and included in the consensus peak set. |
breaks | Vector of sequences to be used as breaks for signal heatmaps. |
plot.enrich | Boolean indicating whether enrichment analyses for DBRs should be run and plotted for each comparison. |
enrich.libs | Vector of valid Available libraries can be viewed with
|
promoters | Numeric vector of length two giving how many base pairs upstream and downstream of the TSS should be used to define gene promoters. |
method | String indicating the method to be used for the differential binding analysis. Can be "DESeq2" or "edgeR". |
scale.full | Boolean indicating whether the full library size (total number of reads) for each sample is used for scaling normalization. If |
flank.anno | Boolean indicating whether flanking gene information for each peak should be retrieved. Useful for broad peaks and super enhancers. |
flank.dist | Integer giving the distance from the edges of each peak within which to search for flanking genes. Ignored if |
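
To make the default `breaks` argument easier to read, the sketch below rebuilds it in base R and inspects it. The break values come straight from the usage line; the palette is purely illustrative, and the idea that a heatmap needs one color per interval (`length(breaks) - 1`) is an assumption about the plotting backend, not something this page states.

```r
# Rebuild the default `breaks` vector shown in the usage line.
brks <- c(seq(-3, -1.0001, length.out = 250),
          seq(-1, -0.1, length.out = 250),
          seq(-0.0999, 0.0999, length.out = 1),
          seq(0.1, 1, length.out = 250),
          seq(1.0001, 3, length.out = 250))

length(brks)         # 1001 break points in total
all(diff(brks) > 0)  # TRUE: the breaks are strictly increasing
range(brks)          # signal is binned between -3 and 3

# Assumption: the heatmap wants one color per interval between breaks.
# The blue/white/red ramp is a hypothetical palette for illustration only.
n.colors <- length(brks) - 1
pal <- grDevices::colorRampPalette(c("blue", "white", "red"))(n.colors)
```

A custom `breaks` vector built the same way should stay strictly increasing; widening or narrowing the middle segment changes how much of the color range is spent on values near zero.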
A DBA object from dba.analyze.
The default parameters should be an adequate starting place for most users, but lazy folks can provide multiple thresholds to fdr.thresh and/or fc.thresh if they aren't sure how stringent or lenient they need to be with their data.
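
As a rough illustration of what supplying several thresholds implies, the base R sketch below crosses two `fdr.thresh` values with two `fc.thresh` values and counts how many regions of a toy results table pass each pair. The toy table, the column names `Fold` and `FDR`, and the exact comparison rule (`abs(Fold) >= fc` and `FDR <= fdr`) are assumptions made for illustration; the function's internal filtering may differ.

```r
# Hypothetical threshold vectors, as they might be passed to RunDiffBind.
fdr.thresh <- c(0.05, 0.01)
fc.thresh <- c(1, 2)

# Every pairing of FDR and fold-change cutoffs (2 x 2 = 4 rows).
combos <- expand.grid(fdr.thresh = fdr.thresh, fc.thresh = fc.thresh)

# Toy differential binding results (made-up values, not real output).
res <- data.frame(Fold = c(2.5, -1.2, 0.4, -3.1),
                  FDR = c(0.001, 0.03, 0.2, 0.04))

# Assumed rule: a region is "significant" when |log2 fold change| meets the
# fold-change cutoff and its FDR is at or below the FDR cutoff.
combos$n.sig <- mapply(function(fdr, fc) {
  sum(abs(res$Fold) >= fc & res$FDR <= fdr)
}, combos$fdr.thresh, combos$fc.thresh)

combos
```

Comparing the counts across rows is a quick way to see how sharply the number of reported regions drops as the cutoffs tighten.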
Providing the resulting DBA object as input to this function (via dba) can be useful when running it multiple times with different levels, blocks, or thresholds, as it skips BAM loading, which is by far the most time-intensive part.
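
The reuse pattern described above might look like the following sketch. It only defines a wrapper, so nothing is executed until it is called in a session where this package is loaded and real inputs exist. The wrapper name `rerun_levels` and the subdirectory names are hypothetical; the argument names are taken from the usage line.

```r
# Hypothetical helper: run once to load BAMs, then reuse the DBA object.
# Requires the package providing RunDiffBind plus real sample files to call.
rerun_levels <- function(outpath, samplesheet, txdb) {
  # First run: reads are counted from the BAM files (the slow step).
  first <- RunDiffBind(outpath = file.path(outpath, "by_treatment"),
                       samplesheet = samplesheet,
                       txdb = txdb,
                       level = "Treatment")

  # Second run: pass the returned DBA object back in via `dba` so that
  # BAM loading is skipped, and compare on a different variable.
  second <- RunDiffBind(outpath = file.path(outpath, "by_condition"),
                        samplesheet = samplesheet,
                        txdb = txdb,
                        dba = first,
                        level = "Condition")

  invisible(list(treatment = first, condition = second))
}
```

Writing each run to its own output directory keeps the generated folders from the two analyses separate.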
It's generally best to provide an empty directory as the output path, as several directories will be generated.
dba, dba.count, dba.contrast, dba.analyze, dba.report for more about ChIP-seq differential binding analysis.
ProcessDBRs, for analyzing and visualizing the results.