A simulated pseudo-bulk RNA-seq dataset with 288 rows covering six immune cell types, eight canonical marker genes, two conditions (Healthy and Disease), and three biological replicates per condition (\(6 \times 8 \times 2 \times 3 = 288\)). Each marker gene is strongly expressed in its canonical cell type. Disease replicates carry a simulated upregulation of roughly 1.2 log2 fold change for marker genes, so differences between conditions are clearly visible in plots.
Format
A data frame with 288 rows and 7 columns:
- cell_type
Immune cell type (factor: CD4 T, CD8 T, B Cell, NK Cell, Monocyte, pDC)
- gene
Gene symbol (factor: CD3D, CD8A, MS4A1, NKG7, LYZ, LILRA4, CD14, GNLY)
- condition
Experimental condition (factor: Healthy, Disease)
- replicate
Biological replicate (factor: Rep1, Rep2, Rep3)
- log2_cpm
Simulated log2 counts-per-million expression value
- avg_expression
Mean log2_cpm across replicates for this cell_type \(\times\) gene \(\times\) condition
- neg_log10_pval
Simulated \(-\log_{10}(p)\) value for differential expression summaries
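The row count and the avg_expression column both follow from the full factorial design. The following is a minimal base R sketch of that structure, not the code used to generate the packaged data: the data frame name toy and its random log2_cpm values are assumptions for illustration only.

```r
# Hypothetical stand-in for the packaged data; values are random,
# not the package's actual simulation.
set.seed(1)
toy <- expand.grid(
  replicate = c("Rep1", "Rep2", "Rep3"),
  condition = c("Healthy", "Disease"),
  gene      = c("CD3D", "CD8A", "MS4A1", "NKG7", "LYZ", "LILRA4", "CD14", "GNLY"),
  cell_type = c("CD4 T", "CD8 T", "B Cell", "NK Cell", "Monocyte", "pDC")
)
toy$log2_cpm <- rnorm(nrow(toy), mean = 5, sd = 1)

# avg_expression: mean log2_cpm across replicates within each
# cell_type x gene x condition combination, repeated on every replicate row
toy$avg_expression <- ave(
  toy$log2_cpm, toy$cell_type, toy$gene, toy$condition,
  FUN = mean
)

nrow(toy)  # one row per cell type, gene, condition, and replicate
```

Because avg_expression is stored on every replicate row, the three rows of a given combination share the same value.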
Details
The dataset is designed to support three VizModules plot types at once:
- DotPlot
Summarised avg_expression and pct_expressed columns per cell type \(\times\) gene \(\times\) condition combination.
- yPlot
Per-replicate log2_cpm values grouped by cell_type and coloured by condition.
- DensityPlot
Per-replicate log2_cpm values grouped by condition and faceted by cell_type.
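The difference between the summarised and per-replicate views can be sketched in base R. This is only an illustration under stated assumptions: toy is a hypothetical stand-in with random values, and no VizModules functions are called, since their signatures are not given here.

```r
# Hypothetical stand-in with the same design as the dataset (random values).
set.seed(1)
toy <- expand.grid(
  replicate = c("Rep1", "Rep2", "Rep3"),
  condition = c("Healthy", "Disease"),
  gene      = c("CD3D", "CD8A", "MS4A1", "NKG7", "LYZ", "LILRA4", "CD14", "GNLY"),
  cell_type = c("CD4 T", "CD8 T", "B Cell", "NK Cell", "Monocyte", "pDC")
)
toy$log2_cpm <- rnorm(nrow(toy), mean = 5, sd = 1)

# Summarised view: one row per cell_type x gene x condition combination
dot_input <- aggregate(log2_cpm ~ cell_type + gene + condition,
                       data = toy, FUN = mean)
names(dot_input)[names(dot_input) == "log2_cpm"] <- "avg_expression"

# Per-replicate view: densities of log2_cpm by condition within one cell type
mono <- toy[toy$cell_type == "Monocyte", ]
dens <- lapply(split(mono$log2_cpm, mono$condition), density)
```

The summarised view collapses the three replicates, while the per-replicate view keeps every row so that the spread within each condition remains visible.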