printTopGenes: Write a training set including the top-ranked G variables...


View source: R/singleGeneCoxph.R

Description

This function takes a matrix of rank-ordered variables and writes a training set containing the top G variables in the matrix to file.

Usage

printTopGenes (retMatrix, numGlist=c(10, 30, 50, 100, 500, 1000, ncol(trainData)), trainData, myPrefix="sorted_topCoxphGenes_")

Arguments

retMatrix

A three-column matrix where the first column contains the sorted variable names (the top log-ranked variable appears first), the second column contains the original index of the variables, and the third column contains the variable ranking from 1 to ncol(trainData).
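As a rough illustration of this layout, the sketch below builds a toy matrix by hand and uses its second column to pull the top G columns out of a small data matrix. The data, the ranking, and the names toyData, toyRet and topG are made up for the example. This is not the package's own code, and the real retMatrix returned by singleGeneCoxph may store its columns differently.

```r
# Toy data: 4 samples (rows) x 5 genes (columns); the values are arbitrary.
toyData <- matrix(1:20, nrow = 4,
                  dimnames = list(paste0("s", 1:4), paste0("g", 1:5)))

# Hypothetical ranking, best first: g3, g1, g5, g2, g4.
ord <- c(3, 1, 5, 2, 4)

# Three columns: sorted names, original column index, rank 1..ncol(toyData).
# cbind() on mixed types gives a character matrix, so the index column
# has to be converted back with as.numeric() before it is used.
toyRet <- cbind(colnames(toyData)[ord], ord, seq_along(ord))

# Select the top G = 2 variables through the original-index column.
G <- 2
topG <- toyData[, as.numeric(toyRet[1:G, 2]), drop = FALSE]
colnames(topG)
```

Keeping drop = FALSE ensures the result stays a matrix even when G is 1.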

numGlist

A vector of values giving the desired numbers of top-ranked variables to be written to file. A separate file is written for each number G in the vector, containing genes 1:G (default = c(10, 30, 50, 100, 500, 1000, ncol(trainData))).
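The one-file-per-G behaviour can be pictured with the sketch below. It assumes that each file name is the prefix followed by G, which matches the file name 'sorted_topCoxphGenes_100' mentioned in the Examples. The write.table call, the tab-separated format, and the names toyData, toyGlist, toyPrefix and written are assumptions for illustration only, not the package's implementation. Files go to a temporary directory.

```r
# Toy data: 4 samples (rows) x 5 genes (columns); the values are arbitrary.
toyData <- matrix(1:20, nrow = 4,
                  dimnames = list(paste0("s", 1:4), paste0("g", 1:5)))

# Hypothetical original column indices of the genes, best-ranked first.
ord <- c(3, 1, 5, 2, 4)

toyGlist  <- c(2, 5)                    # one output file per value of G
toyPrefix <- "sorted_topCoxphGenes_"    # same default prefix as myPrefix
outDir    <- tempdir()                  # avoid touching the working directory

written <- character(0)
for (G in toyGlist) {
  outFile <- file.path(outDir, paste0(toyPrefix, G))
  # Assumed output format: tab-separated text with row and column names.
  write.table(toyData[, ord[1:G], drop = FALSE],
              file = outFile, quote = FALSE, sep = "\t")
  written <- c(written, basename(outFile))
}
written
```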

trainData

Data matrix where columns are variables and rows are observations. In the case of gene expression data, the columns (variables) represent genes, while the rows (observations) represent patient samples.

myPrefix

A string prefix for the filename (default = 'sorted_topCoxphGenes_').

Details

This function is called by iterateBMAsurv.train.predict.assess. It is meant to be used in conjunction with singleGeneCoxph, as the retMatrix argument is returned by singleGeneCoxph.

Value

One or more files, each containing the training data for the top-ranked G variables, sorted in descending order of rank (one file for each G in numGlist).

References

Annest, A., Yeung, K.Y., Bumgarner, R.E., and Raftery, A.E. (2008). Iterative Bayesian Model Averaging for Survival Analysis. Manuscript in Progress.

Raftery, A.E. (1995). Bayesian model selection in social research (with Discussion). Sociological Methodology 1995 (Peter V. Marsden, ed.), pp. 111-196, Cambridge, Mass.: Blackwells.

Volinsky, C., Madigan, D., Raftery, A., and Kronmal, R. (1997) Bayesian Model Averaging in Proportional Hazard Models: Assessing the Risk of a Stroke. Applied Statistics 46: 433-448.

Yeung, K.Y., Bumgarner, R.E. and Raftery, A.E. (2005) Bayesian Model Averaging: Development of an improved multi-class, gene selection and classification tool for microarray data. Bioinformatics 21: 2394-2402.

See Also

iterateBMAsurv.train.predict.assess, singleGeneCoxph, trainData, trainSurv, trainCens

Examples

library(BMA)
library(iterativeBMAsurv)
data(trainData)
data(trainSurv)
data(trainCens)

## Start by ranking and sorting the genes; in this case we use the Cox Proportional Hazards Model
sorted.genes <- singleGeneCoxph(trainData, trainSurv, trainCens)

## Write top 100 genes to file
sorted.top.genes <- printTopGenes(retMatrix=sorted.genes, numGlist=100, trainData=trainData)

## The file, 'sorted_topCoxphGenes_100', is now in the working R directory.

iterativeBMAsurv documentation built on Nov. 8, 2020, 11:10 p.m.