RunDESeq2 performs a high-level differential gene expression analysis with DESeq2. Together with ProcessDEGs, it forms the crux of the RNA-seq portion of this package.

RunDESeq2(outpath, quants.path, samplesheet, tx2gene, level,
  padj.thresh = 0.05, fc.thresh = 2, plot.annos = NULL,
  plot.box = TRUE, plot.enrich = TRUE,
  enrich.libs = c("GO_Molecular_Function_2018",
  "GO_Cellular_Component_2018", "GO_Biological_Process_2018",
  "KEGG_2019_Human", "Reactome_2016", "BioCarta_2016", "Panther_2016"),
  top.n = 100, block = NULL, count.filt = 10)

Arguments

outpath

Path to directory to be used for output. Additional directories will be generated within this folder.

quants.path

Path to directory containing a directory for each sample with a salmon-generated quant.sf file inside.
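
As a rough illustration of the expected layout, the sketch below builds a throwaway directory tree and collects the quant.sf files in it. The sample names and the use of list.files are assumptions made for illustration; they are not taken from the package's internals.

```r
# Hypothetical layout: one sub-directory per sample, each holding a quant.sf.
quants.path <- file.path(tempdir(), "quants_example")
samples <- c("sampleA", "sampleB")  # made-up sample names
for (s in samples) {
  dir.create(file.path(quants.path, s), recursive = TRUE, showWarnings = FALSE)
  file.create(file.path(quants.path, s, "quant.sf"))
}

# Locate every quant.sf and name each path after its sample directory.
quant.files <- list.files(quants.path, pattern = "^quant\\.sf$",
                          recursive = TRUE, full.names = TRUE)
names(quant.files) <- basename(dirname(quant.files))
```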

samplesheet

Path to samplesheet containing sample metadata.

tx2gene

Path to file with transcript IDs in first column and gene identifiers in second. Used for gene-level summarization.
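
To make the two-column mapping concrete, here is a minimal sketch with invented transcript and gene identifiers. The column names and the tapply-based summing are assumptions that only mimic what gene-level summarization does conceptually; the function itself may summarize differently.

```r
# Made-up mapping: transcript IDs in the first column, gene identifiers in the second.
tx2gene <- data.frame(
  tx_id = c("tx1", "tx2", "tx3"),
  gene_id = c("geneA", "geneA", "geneB"),
  stringsAsFactors = FALSE
)

# Toy transcript-level counts, summed per gene using the mapping.
tx.counts <- c(tx1 = 5, tx2 = 7, tx3 = 3)
gene.counts <- tapply(tx.counts[tx2gene$tx_id], tx2gene$gene_id, sum)
```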

level

String defining variable of interest.

padj.thresh

Number or numeric vector indicating the adjusted p-value cutoff(s) to be used for determining "significant" differential expression. If multiple are given, multiple tables/plots will be generated using all combinations of padj.thresh and fc.thresh.

fc.thresh

Number or numeric vector indicating the log2 fold-change cutoff(s) to be used for determining "significant" differential expression. If multiple are given, multiple tables/plots will be generated using all combinations of padj.thresh and fc.thresh.
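
The phrase "all combinations" can be pictured with expand.grid from base R. This is only a sketch of the idea with example threshold values; how the function enumerates the combinations internally is not documented here.

```r
# Example thresholds (illustrative values, not recommendations).
padj.thresh <- c(0.05, 0.01)
fc.thresh <- c(1, 2)

# One row per (padj.thresh, fc.thresh) pair: 2 x 2 = 4 combinations,
# each of which would yield its own set of tables/plots.
combos <- expand.grid(padj.thresh = padj.thresh, fc.thresh = fc.thresh)
```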

plot.annos

String or character vector defining the column(s) in samplesheet to use to annotate figures.

plot.box

Boolean indicating whether box plots for DEGs should be created for each comparison. If so, the top.n genes will be plotted. This step is quite time-consuming with many genes.

plot.enrich

Boolean indicating whether enrichment analyses for DEGs should be run and plotted for each comparison.

enrich.libs

Vector of valid enrichR libraries to test the genes against.

Available libraries can be viewed with listEnrichrDbs from the enrichR package.

top.n

Number of differentially expressed genes to create boxplots for, ranked by adj. p-value after applying padj.thresh and fc.thresh thresholds. If multiple thresholds are provided, the lowest fold-change and highest adj. p-value thresholds will be used.
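
The selection described above might look roughly like the following on a toy results table. The column names follow DESeq2 results output (padj, log2FoldChange); the use of the absolute fold change and of strict inequalities are assumptions, as are the gene names and values.

```r
# Toy results table with invented values.
res <- data.frame(
  gene = c("g1", "g2", "g3", "g4", "g5"),
  log2FoldChange = c(3.0, -2.5, 0.5, 1.2, -4.0),
  padj = c(0.001, 0.04, 0.0001, 0.2, 0.03),
  stringsAsFactors = FALSE
)
padj.thresh <- c(0.05, 0.01)
fc.thresh <- c(1, 2)
top.n <- 2

# With several thresholds, fall back to the most lenient pair:
# the highest adjusted p-value and the lowest fold-change cutoff.
padj.cut <- max(padj.thresh)
fc.cut <- min(fc.thresh)

# Keep genes passing both cutoffs, then take the top.n ranked by padj.
sig <- res[!is.na(res$padj) & res$padj < padj.cut &
             abs(res$log2FoldChange) > fc.cut, ]
top.genes <- head(sig[order(sig$padj), "gene"], top.n)
```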

block

String or character vector defining the column(s) in samplesheet to use to block for unwanted variance, e.g. batch or technical effects.
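
One common way to block for such variables in DESeq2 is to list them in the design formula ahead of the variable of interest. The sketch below builds such a formula with base R's reformulate; whether RunDESeq2 constructs its design exactly this way is an assumption, and the column names are hypothetical.

```r
# Hypothetical samplesheet columns.
level <- "condition"
block <- c("batch", "sex")

# Blocking terms first, variable of interest last.
design <- reformulate(c(block, level))

# With block = NULL the design reduces to the variable of interest alone.
design.noblock <- reformulate(c(NULL, level))
```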

count.filt

Number indicating read threshold. Genes with fewer counts than this number summed across all samples will be removed from the analysis.
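
A minimal sketch of this kind of filter on a toy count matrix, using rowSums from base R. The gene and sample names are invented, and the function's own implementation may differ in detail.

```r
# Toy count matrix: genes in rows, samples in columns.
counts <- matrix(c(0, 2, 1,
                   50, 60, 40,
                   3, 3, 3,
                   4, 3, 3),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("g1", "g2", "g3", "g4"),
                                 c("s1", "s2", "s3")))
count.filt <- 10

# Drop genes whose counts summed across all samples fall below count.filt.
keep <- rowSums(counts) >= count.filt
filtered <- counts[keep, , drop = FALSE]
```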

Value

Named list containing 'res.list', a list of DESeqResults objects for all comparisons generated by ProcessDEGs; 'dds', a DESeqDataSet object from running DESeq; 'rld', a RangedSummarizedExperiment object from running rlog; and 'vsd', a RangedSummarizedExperiment object from running vst.

Details

The default parameters should be an adequate starting place for most users. Users who aren't sure how stringent or lenient they need to be with their data can provide multiple thresholds to padj.thresh and/or fc.thresh.

It's generally best to provide an empty directory as the output path, as several directories will be generated.

See also

DESeq, for more about differential expression analysis. ProcessDEGs, for analyzing and visualizing the results.