Description Usage Arguments Value Author(s) References See Also Examples
MetaQC implements our proposed quantitative quality control measures: (1) internal homogeneity of co-expression structure among studies (internal quality control; IQC); (2) external consistency of co-expression structure correlating with pathway database (external quality control; EQC); (3) accuracy of differentially expressed gene detection (accuracy quality control; AQCg) or pathway identification (AQCp); (4) consistency of differential expression ranking in genes (consistency quality control; CQCg) or pathways (CQCp). (See the reference for detailed explanation.) For each quality control index, the p-values from statistical hypothesis testing are minus log transformed and PCA biplots were applied to assist visualization and decision. Results generate systematic suggestions to exclude problematic studies in microarray meta-analysis and potentially can be extended to GWAS or other types of genomic meta-analysis. The identified problematic studies can be scrutinized to identify technical and biological causes (e.g. sample size, platform, tissue collection, preprocessing etc) of their bad quality or irreproducibility for final inclusion/exclusion decision.
1 2 3 4 |
DList |
Either a list of all data matrices (Case 1) or a list of lists (Case 2); The first case is simplified input data structure only for two classes comparison. Each data name should be set as the name of each list element. Each data should be a numeric matrix that has genes in the rows and samples in the columns. Row names should be official gene symbols and column names be class labels. For the full description of input data, you can use the second data format. Each data is represented as a list which should have x, y, and geneid (geneid can be replaced to row names of matrix x) elements, representing expression data, outcome or class labels, and gene ids, respectively. Additionally, in the survival analysis, censoring.status should be set. |
GList |
The location of a file which has sets of gene symbol lists such as gmt files. By default, the gmt file will be converted to list object and saved with the same name with ".rda". Alternatively, a list of gene sets is allowed; the name of each element of the list should be set as a unique pathway name, and each pathway should have a character vector of gene symbols. |
isParallel |
Whether to use multiple cores in parallel for fast computing. By default, it is false. |
nCores |
When isParallel is true, the number of cores can be set. By default, all cores in the machine are used in the unix-like machine, and 2 cores are used in windows. |
useCache |
Whether imported gmt file should be saved for the next use. By default, it is true. |
filterGenes |
Whether to use gene filtering (recommended). |
maxNApctAllowed |
Filtering out genes which have missing values more than specified ratio (Default .3). Applied if filterGenes is TRUE. |
cutRatioByMean |
Filtering out specified ratio of genes which have least expression value (Default .4). Applied if filterGenes is TRUE. |
cutRatioByVar |
Filtering out specified ratio of genes which have least sample wise expression variance (Default .4). Applied if filterGenes is TRUE. |
minNumGenes |
Mininum number of genes in a pathway. A pathway which has members smaller than the specified value will be removed. |
verbose |
Whether to print out logs. |
resp.type |
The type of response variable. Three options are: "Twoclass" (unpaired), "Multiclass", "Survival." By default, Twoclass is used |
A proto R object. Use RunQC function to run QC procedure. Use Plot function to plot PCA figure. Use Print function to view various information. See examples below.
Don Kang (donkang75@gmail.com) and George Tseng (ctseng@pitt.edu)
Dongwan D. Kang, Etienne Sibille, Naftali Kaminski, and George C. Tseng. (Nucleic Acids Res. 2012) MetaQC: Objective Quality Control and Inclusion/Exclusion Criteria for Genomic Meta-Analysis.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 | ## Not run:
requireAll(c("proto", "foreach"))
## Toy Example
data(brain) #already hugely filtered
#Two default gmt files are automatically downloaded,
#otherwise it is required to locate it correctly.
#Refer to http://www.broadinstitute.org/gsea/downloads.jsp
brainQC <- MetaQC(brain, "c2.cp.biocarta.v3.0.symbols.gmt",
filterGenes=FALSE, verbose=TRUE)
#B is recommended to be >= 1e4 in real application
runQC(brainQC, B=1e2, fileForCQCp="c2.all.v3.0.symbols.gmt")
brainQC
plot(brainQC)
## For parallel computation with only 2 cores
## R >= 2.14.0 in windows to use parallel computing
brainQC <- MetaQC(brain, "c2.cp.biocarta.v3.0.symbols.gmt",
filterGenes=FALSE, verbose=TRUE, isParallel=TRUE, nCores=2)
#B is recommended to be >= 1e4 in real application
runQC(brainQC, B=1e2, fileForCQCp="c2.all.v3.0.symbols.gmt")
plot(brainQC)
## For parallel computation with half cores
## In windows, only 3 cores are used if not specified explicitly
brainQC <- MetaQC(brain, "c2.cp.biocarta.v3.0.symbols.gmt",
filterGenes=FALSE, verbose=TRUE, isParallel=TRUE)
#B is recommended to be >= 1e4 in real application
runQC(brainQC, B=1e2, fileForCQCp="c2.all.v3.0.symbols.gmt")
plot(brainQC)
## Real Example which is used in the paper
#download the brainFull file
#from https://github.com/downloads/donkang75/MetaQC/brainFull.rda
load("brainFull.rda")
brainQC <- MetaQC(brainFull, "c2.cp.biocarta.v3.0.symbols.gmt", filterGenes=TRUE,
verbose=TRUE, isParallel=TRUE)
runQC(brainQC, B=1e4, fileForCQCp="c2.all.v3.0.symbols.gmt") #B was 1e5 in the paper
plot(brainQC)
## Survival Data Example
#download Breast data
#from https://github.com/downloads/donkang75/MetaQC/Breast.rda
load("Breast.rda")
breastQC <- MetaQC(Breast, "c2.cp.biocarta.v3.0.symbols.gmt", filterGenes=FALSE,
verbose=TRUE, isParallel=TRUE, resp.type="Survival")
runQC(breastQC, B=1e4, fileForCQCp="c2.all.v3.0.symbols.gmt")
breastQC
plot(breastQC)
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.