Quality control procedure for depth of coverage

Description

Applies a quality control procedure to the depth of coverage matrix both sample-wise and exon-wise before normalization.

Usage

qc(Y, sampname, chr, ref, mapp, gc, cov_thresh, length_thresh, mapp_thresh,
   gc_thresh)

Arguments

Y

Original read depth matrix returned from getcoverage

sampname

Vector of sample names returned from getbambed

chr

Chromosome.

ref

IRanges object specifying exonic positions returned from getbambed

mapp

Vector of mappability for each exon returned from getmapp

gc

Vector of GC content for each exon returned from getgc

cov_thresh

Vector of length two specifying the lower and upper bounds on exonic median coverage for QC. c(20, 4000) recommended.

length_thresh

Vector of length two specifying the lower and upper bounds on exon length (in bp) for QC. c(20, 2000) recommended.

mapp_thresh

Scalar specifying the minimum exonic mappability for QC. 0.9 recommended.

gc_thresh

Vector of length two specifying the lower and upper bounds on exonic GC content for QC. c(20, 80) recommended.

Details

If multiple batches exist, it is suggested that analysis by CODEX be carried out batch by batch. CODEX filters out exons that have extreme coverage (median read depth across all samples less than 20 or greater than 4000), are extremely short (less than 20 bp), are extremely hard to map (mappability less than 0.9), or have extreme GC content (less than 20 or greater than 80). These thresholds are recommended values; they can be user-defined to suit different sequencing protocols.
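
The exon-wise filters described above can be sketched in base R. This is only an illustration on made-up data, not the CODEX implementation: the toy matrix, the exon_len vector, the pass_* names, and the choice of inclusive bounds are all assumptions made for this example.

```r
# Toy read depth matrix: 5 exons (rows) x 3 samples (columns).
# All values are hypothetical and chosen to trigger one filter each.
Y <- matrix(c(  5,   8,   6,   # exon 1: median coverage below 20
              100, 120,  90,   # exon 2: passes every filter
              150, 200, 170,   # exon 3: too short
              300, 280, 310,   # exon 4: low mappability
               60,  75,  80),  # exon 5: extreme GC content
            ncol = 3, byrow = TRUE)
exon_len <- c(150, 200, 10, 300, 120)   # exon lengths in bp (hypothetical)
mapp     <- c(1, 0.95, 1, 0.5, 1)       # mappability per exon
gc       <- c(45, 50, 55, 40, 90)       # GC content per exon

# Recommended thresholds from this help page
cov_thresh    <- c(20, 4000)
length_thresh <- c(20, 2000)
mapp_thresh   <- 0.9
gc_thresh     <- c(20, 80)

# Median read depth of each exon across all samples
med_cov <- apply(Y, 1, median)

# One logical vector per filter (inclusive bounds are an assumption)
pass_cov  <- med_cov  >= cov_thresh[1]    & med_cov  <= cov_thresh[2]
pass_len  <- exon_len >= length_thresh[1] & exon_len <= length_thresh[2]
pass_mapp <- mapp >= mapp_thresh
pass_gc   <- gc >= gc_thresh[1] & gc <= gc_thresh[2]

# An exon is kept only if it passes all four filters
keep <- pass_cov & pass_len & pass_mapp & pass_gc
Y_filtered <- Y[keep, , drop = FALSE]
```

In this toy example only the second exon passes all four filters, so Y_filtered has a single row. Keeping one logical vector per filter makes it easy to see which criterion removed a given exon.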

Value

Y_qc

Updated Y after QC

sampname_qc

Updated sampname after QC

gc_qc

Updated gc after QC

mapp_qc

Updated mapp after QC

ref_qc

Updated ref after QC

qcmat

Matrix specifying results of exon-wise QC procedures

Author(s)

Yuchao Jiang yuchaoj@wharton.upenn.edu

See Also

getbambed, getgc, getmapp

Examples

library(CODEX)

# Inputs from the demo objects shipped with the package
Y <- coverageObjDemo$Y
sampname <- bambedObjDemo$sampname
chr <- bambedObjDemo$chr
ref <- bambedObjDemo$ref
gc <- gcDemo
mapp <- mappDemo

# Recommended QC thresholds
cov_thresh <- c(20, 4000)
length_thresh <- c(20, 2000)
mapp_thresh <- 0.9
gc_thresh <- c(20, 80)

# Run QC and extract the updated objects
qcObj <- qc(Y, sampname, chr, ref, mapp, gc, cov_thresh, length_thresh,
            mapp_thresh, gc_thresh)
Y_qc <- qcObj$Y_qc
sampname_qc <- qcObj$sampname_qc
gc_qc <- qcObj$gc_qc
mapp_qc <- qcObj$mapp_qc
ref_qc <- qcObj$ref_qc
qcmat <- qcObj$qcmat