| QC | R Documentation |
This function performs comprehensive quality control analysis on high-throughput screening data to evaluate experimental design and data quality. It generates multiple diagnostic plots and calculates SSMD (Strictly Standardized Mean Difference) scores to assess the separation between positive and negative controls.
QC(countMat, negGene, posGene)
countMat |
A matrix of raw count data where rows represent genes/siRNAs and columns represent readouts/conditions. The matrix should have row names corresponding to gene/siRNA identifiers. |
negGene |
A data frame or matrix containing negative control gene/siRNA identifiers. The first column should contain gene/siRNA names that match the row names in countMat. |
posGene |
A data frame or matrix containing positive control gene/siRNA identifiers. The first column should contain gene/siRNA names that match the row names in countMat. |
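Because both control arguments are matched against the row names of countMat, it can help to confirm the identifiers line up before calling QC. The following is a minimal sketch in base R under stated assumptions: the toy inputs and the helper `missing_controls` are hypothetical and are not part of the package.

```r
# Toy inputs (hypothetical values, not the package's bundled data sets)
set.seed(1)
countMat <- matrix(
  rpois(40, lambda = 50), nrow = 8,
  dimnames = list(paste0("gene", 1:8), paste0("readout", 1:5))
)
negGene <- data.frame(gene = c("gene1", "gene2", "gene3"))
posGene <- data.frame(gene = c("gene6", "gene7", "geneX"))  # geneX is absent on purpose

# Hypothetical helper: control identifiers (first column) that have no
# matching row name in countMat. Works for a data frame or a matrix.
missing_controls <- function(countMat, ctrl) {
  ids <- as.character(ctrl[, 1])
  setdiff(ids, rownames(countMat))
}

missing_neg <- missing_controls(countMat, negGene)
missing_pos <- missing_controls(countMat, posGene)
```

Here `missing_neg` is empty and `missing_pos` contains "geneX", which flags an identifier to fix before running the analysis.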
The function performs the following quality control analyses:
Creates jitter plots to visualize score distributions across readouts
Performs t-SNE dimensionality reduction to assess global sample separation
Generates boxplots to compare score distributions between control groups
Calculates SSMD scores for each readout: $\mathrm{SSMD} = (\mu_{pos} - \mu_{neg}) / \sqrt{\sigma_{pos}^2 + \sigma_{neg}^2}$
Reports the percentage of readouts with $|\mathrm{SSMD}| \ge 2$ (considered high quality)
An absolute SSMD of at least 2 ($|\mathrm{SSMD}| \ge 2$) indicates good separation between positive and negative controls, suggesting a high-quality readout.
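The SSMD formula and the high-quality percentage can be illustrated with a small base R sketch. This is not the package's internal code: the helper `ssmd_per_readout` is hypothetical, and the use of the sample variance (`var`, with an n - 1 denominator) and of untransformed values are assumptions.

```r
# Hypothetical helper: SSMD for each column (readout) of a score matrix,
# given the row names of the negative and positive controls.
ssmd_per_readout <- function(mat, neg_ids, pos_ids) {
  pos <- mat[pos_ids, , drop = FALSE]
  neg <- mat[neg_ids, , drop = FALSE]
  mu_diff <- colMeans(pos) - colMeans(neg)
  # Assumption: sample variance (n - 1 denominator) for both control groups
  denom <- sqrt(apply(pos, 2, var) + apply(neg, 2, var))
  mu_diff / denom
}

# Toy scores: readout A separates the controls, readout B does not
mat <- rbind(
  p1 = c(A = 10, B = 5), p2 = c(A = 12, B = 6), p3 = c(A = 14, B = 7),
  n1 = c(A = 1,  B = 4), n2 = c(A = 2,  B = 6), n3 = c(A = 3,  B = 8)
)

ssmd <- ssmd_per_readout(mat, c("n1", "n2", "n3"), c("p1", "p2", "p3"))

# Percentage of readouts passing the |SSMD| >= 2 threshold
pct_high <- 100 * mean(abs(ssmd) >= 2)
```

For readout A the control means are 12 and 2 with variances 4 and 1, so SSMD = 10 / sqrt(5), about 4.47. Readout B has equal means, so its SSMD is 0, and one of the two readouts (50%) passes the threshold.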
A list containing four diagnostic plots:
score_qc |
A jitter plot showing the distribution of raw scores across all readouts for positive and negative controls |
tSNE_QC |
A t-SNE plot showing the global separation of positive and negative control samples in 2D space |
QC_box |
Side-by-side boxplots showing the distribution of scores for positive and negative controls across all readouts |
QC_SSMD |
A density plot showing the distribution of SSMD scores across readouts, with a threshold line at SSMD=2 and the percentage of high-quality readouts displayed |
Yajing Hao, Shuyang Zhang, Junhui Li, Guofeng Zhao, Xiang-Dong Fu
van der Maaten L, Hinton G: Visualizing data using t-SNE. Journal of Machine Learning Research 2008, 9:2579-2605.
Zhang XD: A pair of new statistical parameters for quality control in RNA interference high-throughput screening assays. Genomics 2007, 89:552-561.
data(countMat)
data(negGene)
data(posGene)
QC(countMat, negGene, posGene)