Implements gene-set analysis methods.

Description

This function implements the gene-set analysis methods. It returns a data frame with p-values and q-values for each of the methods selected.

Usage

cma.set.stat(cma.alter,
             cma.cov,
             cma.samp,
             GeneSets,
             ID2name = NULL,
             Scores,
             passenger.rates = t(data.frame(0.55*rep(1.0e-6,25))),
             BH = TRUE,
             gene.method = FALSE,
             perm.null.method = TRUE,
             perm.null.het.method = FALSE,
             pass.null.method = FALSE,
             pass.null.het.method = FALSE,
             score = "logLRT",
             verbose = TRUE)

Arguments

cma.alter

Data frame with somatic mutation information, broken down by gene, sample, screen, and mutation type. See GeneAlterBreast for an example.

cma.cov

Data frame with the total number of nucleotides "at risk" ("coverage"), broken down by gene, screen, and mutation type. See GeneCovBreast for an example.

cma.samp

Data frame with the number of samples analyzed, broken down by gene and screen. See GeneSampBreast for an example.

GeneSets

An object which annotates genes to gene-sets; it can either be a list with each component representing a set, or an object of the class AnnDbBimap.

ID2name

Vector mapping the gene identifiers used in the GeneSets object to the gene names used in the other objects; if they are the same, this parameter is not needed. See EntrezID2Name for an example.
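
As an illustration of the list form of GeneSets together with ID2name, the following is a minimal sketch. The set names (setA, setB) are made up, and the layout of the mapping vector (identifiers as names, gene names as values) is an assumption rather than something this page states; compare with EntrezID2Name before relying on it.

```r
# Hypothetical gene-sets keyed by Entrez identifiers (set names are made up)
gene_sets <- list(
  setA = c("7157", "5728", "1956"),
  setB = c("4763", "5925", "7157")
)

# Assumed layout: names are the identifiers, values are the gene names
id2name <- c("7157" = "TP53", "5728" = "PTEN", "1956" = "EGFR",
             "4763" = "NF1", "5925" = "RB1")

# Translate each set to gene names to check that every identifier is mapped
sets_by_name <- lapply(gene_sets, function(ids) unname(id2name[ids]))
unmapped <- sum(is.na(unlist(sets_by_name)))
```

If unmapped is greater than zero, some identifiers in the sets have no entry in the mapping vector and would presumably not match the gene names in the other objects.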

Scores

Data frame of gene scores, used by the gene-oriented method (gene.method = TRUE); the logLRT scores are used by default (see score). It can be the output of cma.scores. If gene.method is set to FALSE, this parameter is not needed.

passenger.rates

Data frame with 1 row and 25 columns, of passenger mutation rates per nucleotide, by type, or "context". Columns denote types and must be in the same order as the first 25 columns in the MutationsBrain objects.
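
A minimal sketch of building such an object explicitly, using the same constant rate as the default shown under Usage. The column names here are the automatic ones R assigns (V1 to V25) and are placeholders only; the required column order must still be matched by hand.

```r
# One row, 25 columns, every mutation type set to 0.55e-6
rates <- as.data.frame(t(rep(0.55e-6, 25)))

# Confirm the shape: 1 row by 25 columns
rates_dim <- dim(rates)
```

To use type-specific rates instead, the constant vector could be replaced by a numeric vector of length 25 in the required order.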

BH

If set to TRUE, uses the Benjamini-Hochberg method to get q-values; if set to FALSE, uses the Storey method from the qvalue package.
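
To show what the Benjamini-Hochberg adjustment does to a vector of p-values, here is a small sketch using p.adjust from base R. The p-values are made up, and whether cma.set.stat calls p.adjust internally is an assumption; the sketch only illustrates the adjustment itself.

```r
# Made-up p-values for four gene-sets
p <- c(0.01, 0.04, 0.03, 0.20)

# Benjamini-Hochberg adjustment from the stats package
q <- p.adjust(p, method = "BH")
```

Each adjusted value is at least as large as the corresponding p-value, and the adjusted values keep the ordering of the original p-values.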

gene.method

If set to TRUE, implements the gene-oriented method.

perm.null.method

If set to TRUE, implements the patient-oriented method with a permutation null and no heterogeneity.

perm.null.het.method

If set to TRUE, implements the patient-oriented method with a permutation null and heterogeneity.

pass.null.method

If set to TRUE, implements the patient-oriented method with a passenger null and no heterogeneity.

pass.null.het.method

If set to TRUE, implements the patient-oriented method with a passenger null and heterogeneity.

score

Specifies the gene score used in the gene-oriented method. Can be any of the scores returned by cma.scores.

verbose

If TRUE, prints intermediate messages.

Value

A data frame, with the rows representing set names and the columns representing the p-values and q-values corresponding to the different methods.

Author(s)

Simina M. Boca, Giovanni Parmigiani, Luigi Marchionni, Michael A. Newton.

References

Boca SM, Kinzler KW, Velculescu VE, Vogelstein B, Parmigiani G. Patient-oriented gene-set analysis for cancer mutation data. Genome Biology. DOI: 10.1186/gb-2010-11-11-r112

Parmigiani G, Lin J, Boca S, Sjoeblom T, Kinzler KW, Velculescu VE, Vogelstein B. Statistical methods for the analysis of cancer genome sequencing data. http://www.bepress.com/jhubiostat/paper126/

Benjamini Y and Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society B. DOI: 10.2307/2346101

Storey JD and Tibshirani R. Statistical significance for genome-wide experiments. Proceedings of the National Academy of Sciences. DOI: 10.1073/pnas.1530509100

Schaeffer EM, Marchionni L, Huang Z, Simons B, Blackman A, Yu W, Parmigiani G, Berman DM. Androgen-induced programs for prostate epithelial growth and invasion arise in embryogenesis and are reactivated in cancer. Oncogene. DOI: 10.1038/onc.2008.327

Thomas MA, Taub AE. Calculating binomial probabilities when the trial probabilities are unequal. Journal of Statistical Computation and Simulation. DOI: 10.1080/00949658208810534

Parsons DW, Jones S, Zhang X, Lin JCH, Leary RJ, Angenendt P, Mankoo P, Carter H, Siu I, et al. An Integrated Genomic Analysis of Human Glioblastoma Multiforme. Science. DOI: 10.1126/science.1164382

Wood LD, Parsons DW, Jones S, Lin J, Sjoeblom T, Leary RJ, Shen D, Boca SM, Barber T, Ptak J, et al. The Genomic Landscapes of Human Breast and Colorectal Cancer. Science. DOI: 10.1126/science.1145720

See Also

GeneCov, GeneSamp, GeneAlter, BackRates, cma.scores, cma.set.sim

Examples

library(KEGG.db)
data(ParsonsGBM08)
data(EntrezID2Name)

setIDs <- c("hsa00250", "hsa05213")
SetResults <- cma.set.stat(cma.alter = GeneAlterGBM,
                           cma.cov = GeneCovGBM,
                           cma.samp = GeneSampGBM,
                           GeneSets = KEGGPATHID2EXTID[setIDs],
                           ID2name = EntrezID2Name,
                           perm.null.method = TRUE,
                           pass.null.method = TRUE)

SetResults
