A simulated pseudo-bulk RNA-seq dataset with 288 rows (6 × 8 × 2 × 3) covering six immune cell types, eight canonical marker genes, two conditions (Healthy / Disease), and three biological replicates per condition. Marker genes are strongly expressed in their canonical cell type. Disease replicates carry a simulated upregulation of about 1.2 log2FC (roughly 2.3-fold) for marker genes, so that Healthy versus Disease comparisons are visually apparent.

Usage

example_rnaseq

Format

A data frame with 288 rows and 7 columns:

cell_type

Immune cell type (factor: CD4 T, CD8 T, B Cell, NK Cell, Monocyte, pDC)

gene

Gene symbol (factor: CD3D, CD8A, MS4A1, NKG7, LYZ, LILRA4, CD14, GNLY)

condition

Experimental condition (factor: Healthy, Disease)

replicate

Biological replicate (factor: Rep1, Rep2, Rep3)

log2_cpm

Simulated log2 counts-per-million expression value

avg_expression

Mean log2_cpm across the three replicates for this cell_type \(\times\) gene \(\times\) condition combination; the same value is repeated on each replicate row

neg_log10_pval

Simulated \(-\log_{10}(p)\) value for differential expression summaries
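
The layout described above can be sketched in base R. This is a toy reconstruction under stated assumptions, not the package's generator script: the object name toy, the seed, and the normal distribution used for log2_cpm are placeholders chosen for illustration, and neg_log10_pval is left out. It shows how the full factorial design yields 288 rows and how avg_expression relates to log2_cpm.

```r
# Toy reconstruction of the documented layout; NOT the package's generator script.
set.seed(1)  # arbitrary seed, for reproducibility only

cell_types <- c("CD4 T", "CD8 T", "B Cell", "NK Cell", "Monocyte", "pDC")
genes      <- c("CD3D", "CD8A", "MS4A1", "NKG7", "LYZ", "LILRA4", "CD14", "GNLY")

# Full factorial design: 6 cell types x 8 genes x 2 conditions x 3 replicates
toy <- expand.grid(
  cell_type = factor(cell_types, levels = cell_types),
  gene      = factor(genes, levels = genes),
  condition = factor(c("Healthy", "Disease"), levels = c("Healthy", "Disease")),
  replicate = factor(c("Rep1", "Rep2", "Rep3")),
  KEEP.OUT.ATTRS = FALSE
)

# Placeholder expression values (assumed distribution, for illustration only)
toy$log2_cpm <- rnorm(nrow(toy), mean = 5, sd = 1)

# avg_expression: mean log2_cpm across replicates within each
# cell_type x gene x condition combination, repeated on every replicate row
toy$avg_expression <- ave(
  toy$log2_cpm, toy$cell_type, toy$gene, toy$condition, FUN = mean
)

nrow(toy)  # 288
```

Because ave() returns a vector as long as its input, the group mean is repeated on each of the three replicate rows, which matches the description of avg_expression above.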

Source

Simulated in data-raw/generate_example_data.R.

Details

The dataset is designed to simultaneously support three VizModules plot types:

  • DotPlot — summarised avg_expression and neg_log10_pval columns per cell type \(\times\) gene \(\times\) condition combination.

  • yPlot — per-replicate log2_cpm values grouped by cell_type and coloured by condition.

  • DensityPlot — per-replicate log2_cpm values grouped by condition and faceted by cell_type.
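
As a rough illustration of the two views these plot types draw on, the base R sketch below derives a summarised table and a per-replicate split from a small toy data frame. The toy data, the object names (toy, dot_input, by_cond, densities), and the use of stats::density() are assumptions made for this example; it does not call any VizModules function and does not reflect how the modules process data internally.

```r
# Toy data frame with the documented column names (values are placeholders).
set.seed(42)
toy <- expand.grid(
  cell_type = c("CD4 T", "Monocyte"),
  gene      = c("CD3D", "LYZ"),
  condition = c("Healthy", "Disease"),
  replicate = c("Rep1", "Rep2", "Rep3"),
  KEEP.OUT.ATTRS = FALSE
)
toy$log2_cpm <- rnorm(nrow(toy), mean = 5, sd = 1)
toy$avg_expression <- ave(toy$log2_cpm, toy$cell_type, toy$gene, toy$condition)

# Summarised view: one row per cell_type x gene x condition,
# dropping the per-replicate columns.
summary_cols <- c("cell_type", "gene", "condition", "avg_expression")
dot_input <- unique(toy[, summary_cols])

# Per-replicate view: split log2_cpm by condition within one cell type,
# then estimate a kernel density for each condition.
one_type  <- toy[toy$cell_type == "CD4 T", ]
by_cond   <- split(one_type$log2_cpm, one_type$condition)
densities <- lapply(by_cond, density)

nrow(dot_input)  # 8 combinations in this toy example
```

The summarised view collapses the three replicate rows of each combination into one, while the per-replicate view keeps every log2_cpm value so that distributions can be compared between conditions.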

Author

Jacob Martin