calculatePEnrichment: Compute P_enrichment
In saps: Significance Analysis of Prognostic Signatures


This function performs a pre-ranked gene set enrichment analysis (GSEA) to evaluate the degree to which a candidate gene set is overrepresented at the top or bottom extremes of a ranked list of concordance indices. This function is normally called by saps.
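To make the idea of a pre-ranked enrichment analysis concrete, the following is a minimal base-R sketch of an unweighted running-sum enrichment score with a gene-label permutation null. It is an illustration under simplifying assumptions, not the code that saps or `runGSA` executes: the helper `enrichment_score`, the toy scores, and the gene names are all hypothetical, and the weighting used by the GSEA statistic of Subramanian et al. is omitted.

```r
# Simplified, unweighted running-sum enrichment score (illustration only;
# not the implementation used by saps or piano::runGSA).
enrichment_score <- function(stats, gene_set) {
  ranked <- names(sort(stats, decreasing = TRUE))  # highest score first
  in_set <- ranked %in% gene_set
  n_hit  <- sum(in_set)
  n_miss <- length(ranked) - n_hit
  stopifnot(n_hit > 0, n_miss > 0)
  # Step up at each set member, down at each non-member; the walk ends near 0.
  steps   <- ifelse(in_set, 1 / n_hit, -1 / n_miss)
  running <- cumsum(steps)
  # The score is the running-sum value farthest from 0 (sign is kept).
  running[which.max(abs(running))]
}

# Hypothetical toy data: 20 genes with evenly spaced z-like scores.
stats <- setNames(seq(2, -1.8, length.out = 20), paste0("g", 1:20))
set_top    <- c("g1", "g2", "g3")     # the three highest-ranked genes
set_bottom <- c("g18", "g19", "g20")  # the three lowest-ranked genes

es_top    <- enrichment_score(stats, set_top)     # positive score
es_bottom <- enrichment_score(stats, set_bottom)  # negative score

# Permutation null: score random gene sets of the same size
# (the role played by gsea.perm in calculatePEnrichment).
set.seed(1)
n_perm  <- 1000
null_es <- replicate(
  n_perm,
  enrichment_score(stats, sample(names(stats), length(set_top)))
)
p_top <- (sum(abs(null_es) >= abs(es_top)) + 1) / (n_perm + 1)
```

The sign of the score indicates which extreme of the ranking the set occupies, and the permutation step estimates how often a random set of the same size scores at least as extremely.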

calculatePEnrichment(rankedGenes, candidateGeneSet, cpus, gsea.perm = 1000)

`rankedGenes`	An nx1 matrix of concordance indices for n genes. Generally this will be the z-score returned by `rankConcordance`. The row names should contain gene identifiers.
`candidateGeneSet`	A 1xp matrix of p gene identifiers. The row name should contain a name for the gene set.
`cpus`	The number of CPU cores to use. This value is passed to the `runGSA` function in the piano package. On multi-core machines, setting it to the number of available cores can substantially reduce computation time.
`gsea.perm`	The number of permutations to be used in the GSEA. This value is passed to `runGSA`.
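The shapes described above can be built in base R as follows. This is a sketch with made-up values: the gene names, scores, and set names are hypothetical, and using `parallel::detectCores()` to choose `cpus` is one possible convention rather than something saps requires.

```r
# Hypothetical scores for six genes; in practice these would come from
# rankConcordance. The result is an n x 1 matrix with gene identifiers
# as row names.
scores <- c(2.1, -0.4, 1.3, -1.8, 0.2, 0.9)
rankedGenes <- matrix(scores, ncol = 1,
                      dimnames = list(paste0("gene", 1:6), "z"))

# A 1 x p matrix: a single row, named after the gene set, holding
# p gene identifiers.
candidateGeneSet <- matrix(c("gene1", "gene3", "gene6"), nrow = 1,
                           dimnames = list("mySet", NULL))

# When selecting one set from a matrix of several, drop = FALSE keeps the
# 1 x p shape and the row name.
genesets <- rbind(mySet = c("gene1", "gene3", "gene6"),
                  other = c("gene2", "gene4", "gene5"))
one_set <- genesets["mySet", , drop = FALSE]

# One way to choose cpus: the core count reported by the 'parallel'
# package (logical cores by default), falling back to 1 if unknown.
cpus <- parallel::detectCores()
if (is.na(cpus)) cpus <- 1L
```

Without `drop = FALSE`, subsetting a single row returns a plain character vector and the gene set name is lost.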

The function returns a matrix with the following columns:

`P_enrichment`	the enrichment score
`direction`	either 1 or -1 depending on the direction of association
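As a sketch of how such a result might be read, the snippet below builds a hypothetical one-row matrix with the two documented columns and extracts each value by column name. The numbers and the row name `set1` are made up for illustration, and the "positive"/"negative" labels are a convenience introduced here, not part of the package.

```r
# Hypothetical return value for a gene set named "set1"; values are made up.
p_enrich <- matrix(c(0.002, 1), nrow = 1,
                   dimnames = list("set1", c("P_enrichment", "direction")))

# Extract each documented column by name.
p_val     <- p_enrich[, "P_enrichment"]
direction <- p_enrich[, "direction"]

# Translate the direction code (1 or -1) into a readable label.
label <- if (direction > 0) "positive" else "negative"
```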

Beck AH, Knoblauch NW, Hefti MM, Kaplan J, Schnitt SJ, et al. (2013) Significance Analysis of Prognostic Signatures. PLoS Comput Biol 9(1): e1002875. doi:10.1371/journal.pcbi.1002875

Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, et al. (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA 102: 15545-15550.

`saps`, `runGSA`

library(saps)

# 25 patients, none lost to follow-up
followup <- rep(1, 25)

# first 5 patients have good survival (in days)
time <- c(25, 27, 24, 21, 26, sample(1:3, 20, TRUE))*365

# create data for 100 genes, 25 patients
dat <- matrix(rnorm(25*100), nrow=25, ncol=100)
colnames(dat) <- as.character(1:100)

# create two random genesets of 5 genes each
set1 <- sample(colnames(dat), 5)
set2 <- sample(colnames(dat), 5)

genesets <- rbind(set1, set2)

# tweak data for first 5 patients for set1
dat[1:5, set1] <- dat[1:5, set1]+10

# rank all genes by concordance index
ci <- rankConcordance(dat, time, followup)[,"z"]

# set1 should achieve significance
p_enrich <- calculatePEnrichment(ci, genesets["set1",,drop=FALSE], cpus=1)
p_enrich

# set2 should not
p_enrich <- calculatePEnrichment(ci, genesets["set2",,drop=FALSE], cpus=1)
p_enrich