QC: Quality control of input datasets.

QC R Documentation

Description

Quality control (QC) is a step in evaluating the experimental design. For two-dimensional high-throughput data, a t-SNE plot is first used to assess whether the features are sufficient to separate positive and negative controls. An SSMD score (see Zhang 2007 in References) is then computed for each readout to estimate the percentage of high-quality readouts.
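
The per-readout SSMD idea can be illustrated with a minimal sketch. It uses the standard SSMD definition for two independent groups from Zhang (2007), the difference of group means divided by the square root of the summed variances. It is not necessarily how QC() computes the score internally. The helper `ssmd`, the matrix `toy`, and its row and column names are hypothetical and exist only for this illustration.

```r
# SSMD for two independent groups (Zhang 2007):
# (mean(pos) - mean(neg)) / sqrt(var(pos) + var(neg))
ssmd <- function(pos, neg) {
  (mean(pos) - mean(neg)) / sqrt(var(pos) + var(neg))
}

# Hypothetical toy matrix: rows are siRNAs, columns are readouts.
# matrix() fills column by column, so the first six values form readout R1.
toy <- matrix(
  c(2, 4, 6, 9, 10, 11,   # readout R1: controls clearly separated
    5, 6, 7, 5, 6, 7),    # readout R2: controls indistinguishable
  ncol = 2,
  dimnames = list(c("p1", "p2", "p3", "n1", "n2", "n3"), c("R1", "R2"))
)
pos_rows <- c("p1", "p2", "p3")  # assumed positive-control siRNAs
neg_rows <- c("n1", "n2", "n3")  # assumed negative-control siRNAs

# One SSMD value per readout (column)
ssmd_by_readout <- apply(toy, 2, function(x) ssmd(x[pos_rows], x[neg_rows]))
print(ssmd_by_readout)
```

In this toy case R1 has a large absolute SSMD (about -2.68) and R2 has an SSMD of 0, so only R1 separates the two control groups.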

Usage

QC(countMat, negGene, posGene)

Arguments

countMat

input data set: the siRNA/gene x readout matrix from HTS2 or large-scale RNAi screens.

negGene

negative control data set, the siRNAs/genes used as negative controls in screening.

posGene

positive control data set, the siRNAs/genes used as positive controls in screening.

Value

A list of plots named 'score_q', 'tSNE_QC', 'QC_box' and 'QC_SSMD'. 'tSNE_QC' is the global evaluation based on all readouts; it shows whether the positive and negative control samples are well separated when all current readouts are used. The other three plots evaluate the quality of the individual readouts.
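
A sketch of how the returned list might be inspected follows. It assumes ZetaSuite is installed and that the 'tSNE_QC' element is a plot object that can be printed; the variable names `qc` and `expected_names` are illustrative only. The call is guarded so the snippet does nothing when the package is not available.

```r
# Element names as documented in the Value section
expected_names <- c("score_q", "tSNE_QC", "QC_box", "QC_SSMD")

if (requireNamespace("ZetaSuite", quietly = TRUE)) {
  library(ZetaSuite)
  data(countMat)
  data(negGene)
  data(posGene)
  qc <- QC(countMat, negGene, posGene)
  # Check which of the documented names are present in the result
  print(expected_names %in% names(qc))
  # Display the global t-SNE evaluation (assumed to be printable)
  print(qc$tSNE_QC)
}
```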

Author(s)

Yajing Hao, Shuyang Zhang, Junhui Li, Guofeng Zhao, Xiang-Dong Fu

References

van der Maaten L, Hinton G: Visualizing data using t-SNE. Journal of Machine Learning Research 2008, 9:2579-2605.

Zhang XD: A pair of new statistical parameters for quality control in RNA interference high-throughput screening assays. Genomics 2007, 89:552-561.

Examples


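library(ZetaSuite)
# Load the example datasets shipped with the package
data(countMat)
data(negGene)
data(posGene)
# Run QC; returns a list of plots
QC(countMat, negGene, posGene)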


ZetaSuite documentation built on May 25, 2022, 9:05 a.m.