This is the main user interface to the saps package, and is usually the only function needed.
candidateGeneSets 
A matrix with at least one row, where each row represents
a gene set, and the column values are gene identifiers. The row names should contain
unique names for the gene sets. The column values may contain 
dataSet 
A matrix, where the column names are gene identifiers
(in the same format as the values in 
survivalTimes 
A vector of survival times. The length must equal the number of
rows (i.e. patients) in 
followup 
A vector of 0 or 1 values, indicating whether the patient was
lost to followup (0) or not (1). The length must equal the number of rows
(i.e. patients) in 
random.samples 
An integer that specifies how many random gene sets to sample when computing p_random. Defaults to 1000. 
cpus 
An integer that specifies the number of CPU cores to use when calculating p_enrich. The default is 1. If a value greater than 1 is given, the snowfall package must be installed or an error will occur. 
gsea.perm 
The number of permutations to be used when calculating
p_enrich. This is passed to the 
compute_qvalue 
A boolean indicating whether to include calculation
of the saps q_value. Setting this to 
qvalue.samples 
An integer that specifies how many random gene sets to sample when computing the saps q_value. Defaults to 1000. 
verbose 
A boolean indicating whether to display status messages during
computation. Defaults to 
saps provides a robust method for identifying biologically significant gene sets associated with patient survival. Three basic statistics are computed. First, patients are clustered into two survival groups based on differential expression of a candidate gene set. p_pure is calculated as the probability of no survival difference between the two groups.
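The p_pure step described above can be illustrated with a minimal sketch. It assumes k-means with two centers as the clustering step and a log-rank test from the survival package (survdiff is listed under See Also); the simulated data and all variable names are illustrative only, and this is not necessarily how saps implements the step internally.

```r
library(survival)

set.seed(1)
n <- 40
status <- rep(1, n)                               # 1 = event observed for every patient
time <- c(rexp(20, 1 / 2000), rexp(20, 1 / 400))  # two simulated survival regimes
expr <- matrix(rnorm(n * 5), nrow = n)            # expression of a 5-gene candidate set
expr[1:20, ] <- expr[1:20, ] + 3                  # first 20 patients over-express the set

# Assumption: split patients into two groups with k-means on the gene set
cluster <- kmeans(expr, centers = 2, nstart = 5)$cluster

# Log-rank test for a survival difference between the two groups
fit <- survdiff(Surv(time, status) ~ cluster)
p_pure <- pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
p_pure
```

A small p_pure here means the two expression-defined groups differ in survival; by itself it says nothing about whether a random gene set would do equally well.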
Next, the same procedure is applied to randomly generated gene sets, and p_random is calculated as the proportion achieving a p_pure as significant as the candidate gene set. Finally, a preranked Gene Set Enrichment Analysis (GSEA) is performed by ranking all genes by concordance index, and p_enrich is computed to indicate the degree to which the candidate gene set is enriched for genes with univariate prognostic significance.
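The p_random computation can be sketched in the same spirit. The helper p_pure_for below is hypothetical (it is not part of saps), and the sketch assumes that random gene sets are drawn with the same size as the candidate set and that clustering uses k-means with two centers.

```r
library(survival)

# Hypothetical helper: log-rank p-value after two-group k-means on a gene set
p_pure_for <- function(genes, dat, time, status) {
  cluster <- kmeans(dat[, genes, drop = FALSE], centers = 2, nstart = 5)$cluster
  fit <- survdiff(Surv(time, status) ~ cluster)
  pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
}

set.seed(1)
n <- 30
dat <- matrix(rnorm(n * 50), nrow = n,
              dimnames = list(NULL, as.character(1:50)))  # 30 patients, 50 genes
time <- rexp(n, 1 / 500)
status <- rep(1, n)

candidate <- sample(colnames(dat), 5)
p_pure <- p_pure_for(candidate, dat, time, status)

# p_pure for 100 random gene sets of the same size as the candidate
random_p_pures <- replicate(
  100, p_pure_for(sample(colnames(dat), 5), dat, time, status)
)

# Proportion of random sets at least as significant as the candidate
p_random <- mean(random_p_pures <= p_pure)
p_random
```

In this sketch the number of replicates plays the role of the random.samples argument: more samples give a finer-grained estimate of p_random at the cost of run time.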
A saps_score is calculated to summarize the three statistics, and optionally a saps_qvalue is computed to estimate the significance of the saps_score by calculating the saps_score for random gene sets.
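This page does not state how the three statistics are combined. As a sketch only, the following assumes the definition given in the Beck et al. (2013) paper cited under References: the negative log10 of the largest of the three p-values, signed by the direction of association. The function name and argument names are illustrative, not the package's internal code.

```r
# Assumed definition (from Beck et al. 2013), not confirmed by this page:
# the score is driven by the weakest (largest) of the three p-values
saps_score_sketch <- function(p_pure, p_random, p_enrich, direction) {
  -log10(max(p_pure, p_random, p_enrich)) * direction
}

saps_score_sketch(0.001, 0.01, 0.02, direction = 1)
saps_score_sketch(0.001, 0.01, 0.02, direction = -1)
```

Under this assumption a gene set scores highly only when all three p-values are small, since a single large p-value caps the score.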
The function returns a list with the following elements:
rankedGenes 
Vector of concordance index z-scores for the genes in

geneset.count 
The number of gene sets analyzed. 
genesets 
A list of genesets (see below). 
saps_table 
A dataframe summarizing the adjusted and unadjusted
saps statistics for each geneset analyzed. The dataframe contains
the following columns: 
genesets is in turn a list with the following elements:
name 
The name of the geneset. 
size 
The number of genes in the geneset. 
genes 
Vector of gene labels for this geneset. 
saps_unadjusted 
Vector with elements 
saps_adjusted 
Vector with elements 
cluster 
Vector of assigned cluster (1 or 2) for each patient using this candidate geneset. 
random_p_pures 
Vector of p_pure values for each random geneset generated during the computation of p_random. 
random_saps_scores 
Vector of saps_score values for each random geneset generated during the computation of saps_qvalue. 
direction 
Direction (-1 or 1) of the enrichment association for this geneset. 
Beck AH, Knoblauch NW, Hefti MM, Kaplan J, Schnitt SJ, et al. (2013) Significance Analysis of Prognostic Signatures. PLoS Comput Biol 9(1): e1002875. doi:10.1371/journal.pcbi.1002875
survdiff
concordance.index
runGSA
library(saps)

# 25 patients, none lost to followup
followup <- rep(1, 25)
# first 5 patients have good survival (in days)
time <- c(25, 27, 24, 21, 26, sample(1:3, 20, TRUE))*365
# create data for 100 genes, 25 patients
dat <- matrix(rnorm(25*100), nrow=25, ncol=100)
colnames(dat) <- as.character(1:100)
# create two random genesets of 5 genes each
set1 <- sample(colnames(dat), 5)
set2 <- sample(colnames(dat), 5)
genesets <- rbind(set1, set2)
# compute saps
results <- saps(genesets, dat, time, followup, random.samples=100)
# check results (first seven columns of the summary table)
saps_table <- results$saps_table
saps_table[1:7]
# increase expression levels for set1 for first 5 patients
dat[1:5, set1] <- dat[1:5, set1]+10
# run again, should get significant values for set1
results <- saps(genesets, dat, time, followup, random.samples=100)
# check results
saps_table <- results$saps_table
saps_table[1:7]