QCinfo: Extract QC information


QCinfo R Documentation

Extract QC information

Description

Extract information for data quality control: detection P values, number of beads, and averaged bisulfite conversion intensity. The function can also identify low quality samples and probes, as well as samples that are outliers in total intensity or beta value distribution.

Usage

QCinfo(rgSet, detPthre=0.000001, detPtype="negative", nbthre=3, samplethre=0.05,
       CpGthre=0.05, bisulthre=NULL, outlier=TRUE, distplot=TRUE)

Arguments

rgSet

An object of class rgDataSet or RGChannelSetExtended

detPthre

Detection P value threshold used to identify low quality data points

detPtype

Calculate detection P values based on negative internal control probes ("negative") or out-of-band probes ("oob")

nbthre

Bead number threshold used to identify low quality data points

samplethre

Threshold to identify samples with low data quality: the percentage of low quality methylation data points across probes for each sample

CpGthre

Threshold to identify probes with low data quality: the percentage of low quality methylation data points across samples for each probe

bisulthre

Threshold of bisulfite intensity for identification of low quality samples. By default, Mean - 3 x SD of the sample bisulfite control intensities is used as the threshold.

outlier

If TRUE, outlier samples in total intensity or beta value distribution will be identified and classified as bad samples.

distplot

TRUE or FALSE, whether to produce beta value distribution plots before and after QC.
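
The way the thresholds above combine can be illustrated with a small base R sketch on simulated data. This is not the package's implementation: the matrices, the sample and probe names, and the choice to compute per-probe percentages only over samples that passed the sample filter are all assumptions made for illustration, and the outlier step is omitted.

```r
# Simulated inputs (hypothetical): probes in rows, samples in columns.
n_probe <- 200; n_sample <- 20
ids <- list(paste0("cg", seq_len(n_probe)), paste0("S", seq_len(n_sample)))
detP  <- matrix(0,   n_probe, n_sample, dimnames = ids)
nbead <- matrix(10L, n_probe, n_sample, dimnames = ids)
bisul <- setNames(rep(c(9800, 10000, 10200), length.out = n_sample), ids[[2]])

# Introduce problems: S20 fails detection at 20% of probes, S19 has weak
# bisulfite control intensity, cg200 has too few beads in three samples.
detP[1:40, "S20"]   <- 0.01
bisul["S19"]        <- 2000
nbead["cg200", 1:3] <- 1L

# Thresholds, using the default values shown in Usage.
detPthre <- 0.000001; nbthre <- 3; samplethre <- 0.05; CpGthre <- 0.05
bisulthre <- mean(bisul) - 3 * sd(bisul)   # default rule for bisulthre = NULL

# A data point is low quality if it fails either criterion.
low_quality <- detP > detPthre | nbead < nbthre

# Per-sample fraction of low quality points, combined with the bisulfite check.
frac_sample <- colMeans(low_quality)
bad_flag    <- frac_sample > samplethre | bisul < bisulthre
bad_samples <- names(frac_sample)[bad_flag]

# Per-probe fraction, computed here over retained samples only (an assumption).
frac_probe <- rowMeans(low_quality[, !bad_flag, drop = FALSE])
bad_probes <- names(frac_probe)[frac_probe > CpGthre]

print(bad_samples)
print(bad_probes)
```

Note that the Mean - 3 x SD rule can only flag a sample when enough samples are present; with very few samples, a single low value inflates the SD so much that it cannot fall below the threshold.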

Value

detP: a matrix of detection P values

nbead: a matrix for number of beads

bisul: a vector of averaged intensities for bisulfite conversion controls per sample

badsample: a list of low quality or outlier samples

badCpG: a list of low quality CpGs

outlier_sample: a list of outlier samples in methylation beta value or total intensity distribution.

Figure "qc_sample.jpg": scatter plot of percent of low quality data per sample vs. average bisulfite conversion intensity.

Figure "qc_CpG.jpg": histogram of percent of low quality data per CpG.

Figure "freqpolygon_beta_beforeQC.jpg": distribution plot of input data. Samples colored in red are "bad" samples, as listed in badsample, including samples with low data quality and outliers in methylation beta value or total intensity.

Figure "freqpolygon_beta_afterQC.jpg": distribution plot of input data after filtering out "bad" samples.
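
As a minimal sketch of how the returned elements might be used downstream, the base R snippet below drops flagged samples and probes from a beta value matrix. The `beta` matrix and the `qc` list here are hypothetical stand-ins, not real QCinfo output, and the sketch assumes that badsample and badCpG hold names matching the column and row names of the matrix.

```r
# Hypothetical beta value matrix: probes in rows, samples in columns.
beta <- matrix(seq(0.05, 0.95, length.out = 12), nrow = 4,
               dimnames = list(paste0("cg", 1:4), paste0("S", 1:3)))

# Stand-in for the list returned by QCinfo(); only two elements are mocked.
qc <- list(badsample = "S3", badCpG = "cg2")

# Keep only rows and columns that were not flagged.
keep_probe  <- !(rownames(beta) %in% qc$badCpG)
keep_sample <- !(colnames(beta) %in% qc$badsample)
beta_clean  <- beta[keep_probe, keep_sample, drop = FALSE]

print(dim(beta_clean))
```

Using `drop = FALSE` keeps the result a matrix even if only one probe or one sample survives the filter.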

Author(s)

Zongli Xu

References

Zongli Xu, Liang Niu, Leping Li and Jack A. Taylor, ENmix: a novel background correction method for Illumina HumanMethylation450 BeadChip. Nucleic Acids Research 2015.

Examples



if (require(minfiData)) {
  library(ENmix)  # provides readidat() and QCinfo()

  # rgDataSet as input
  path <- file.path(find.package("minfiData"), "extdata")
  rgSet <- readidat(path = path, recursive = TRUE)
  qc <- QCinfo(rgSet)

  # RGChannelSetExtended as input (read with minfi)
  sheet <- read.metharray.sheet(path, pattern = "csv$")
  rgSet <- read.metharray.exp(targets = sheet, extended = TRUE)
  qc <- QCinfo(rgSet)
}

xuz1/ENmix documentation built on Nov. 24, 2024, 4:31 a.m.