Survival analysis and variable selection on microarray data.
This is a multivariate technique to select a small number of relevant variables (typically genes) to perform survival analysis on microarray data. This function performs the training phase. It repeatedly calls bic.surv from the BMA package until all variables are exhausted. The variables in the dataset are assumed to be pre-sorted by rank.
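Because the function expects pre-sorted columns, a ranking step has to happen first. The following is a minimal sketch of one way to do it, assuming a univariate Cox model per column and ranking by the fitted partial log-likelihood. The simulated data and all object names are illustrative assumptions, and this is not necessarily the ranking the package itself applies.

```r
library(survival)

set.seed(1)
n <- 40
# Simulated expression matrix: rows are samples, columns are genes (made-up data)
x <- matrix(rnorm(n * 5), nrow = n, dimnames = list(NULL, paste0("g", 1:5)))
surv.time <- rexp(n, rate = exp(0.8 * x[, 3]))  # survival depends on g3 by construction
cens.vec <- rbinom(n, 1, 0.8)                   # 1 = uncensored, 0 = censored

# Fit one Cox model per column and keep its final partial log-likelihood
loglik <- apply(x, 2, function(col) {
  coxph(Surv(surv.time, cens.vec) ~ col)$loglik[2]
})

# Reorder columns so the best-fitting variables come first
x.sorted <- x[, order(loglik, decreasing = TRUE)]
colnames(x.sorted)
```

Sorting by log-likelihood is only one choice; a p-value or any other univariate score could be substituted in the `order()` call.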
iterateBMAsurv.train (x, surv.time, cens.vec, curr.mat, stopVar=0, nextVar, nbest=10, maxNvar=25, maxIter=200000, thresProbne0=1, verbose = FALSE, suff.string="")
x |
Data matrix where columns are variables and rows are observations. The variables (columns) are assumed to be sorted using a univariate measure. In the case of gene expression data, the columns (variables) represent genes, while the rows (observations) represent samples. |
surv.time |
Vector of survival times for the patient samples. Survival times are assumed to be expressed in a single, consistent unit (e.g., months or days), and the length of this vector should be equal to the number of rows in x. |
cens.vec |
Vector of censor data for the patient samples. In general, 0 = censored and 1 = uncensored. The length of this vector should equal the number of rows in x and the number of elements in surv.time. |
curr.mat |
Matrix of independent variables in the active |
stopVar |
0 to continue iterations, 1 to stop iterations (default 0) |
nextVar |
Integer placeholder indicating the next variable to be brought into the active |
nbest |
A number specifying the number of models of each size returned to |
maxNvar |
A number indicating the maximum number of variables used in each iteration of |
maxIter |
A number indicating the maximum iterations of |
thresProbne0 |
A number specifying the threshold (in percent) for the posterior probability that each variable (gene) is non-zero. Variables (genes) whose posterior probability is below this threshold are dropped in the iterative application of |
verbose |
A boolean variable indicating whether or not to print interim information to the console. The default is FALSE. |
suff.string |
A string for writing to file. |
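The length and coding requirements on x, surv.time and cens.vec can be checked before training. The helper below is a hypothetical sketch in base R; `check_surv_inputs` is not part of iterativeBMAsurv or BMA, and the toy data are made up for illustration.

```r
# Hypothetical helper (not part of the package): verify that the inputs agree
check_surv_inputs <- function(x, surv.time, cens.vec) {
  stopifnot(
    is.matrix(x),
    length(surv.time) == nrow(x),   # one survival time per sample (row)
    length(cens.vec) == nrow(x),    # one censoring indicator per sample
    all(cens.vec %in% c(0, 1))      # 0 = censored, 1 = uncensored
  )
  invisible(TRUE)
}

# Toy data: 4 samples, 3 genes
x <- matrix(rnorm(12), nrow = 4, dimnames = list(NULL, c("g1", "g2", "g3")))
surv.time <- c(5.2, 10.1, 3.3, 8.0)
cens.vec <- c(1, 0, 1, 1)
ok <- check_surv_inputs(x, surv.time, cens.vec)
```

The call stops with an error if any condition fails, for example when surv.time has fewer elements than x has rows.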
The training phase consists of first ordering all the variables (genes) by a univariate measure such as Cox Proportional Hazards Regression, and then iteratively applying the bic.surv algorithm from the BMA package. In the first application of the bic.surv algorithm, the top maxNvar univariate ranked genes are used. After each application of the bic.surv algorithm, the genes with probne0 < thresProbne0 are dropped, and the next univariate ordered genes are added to the active bic.surv window.
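The windowing logic described above can be sketched in base R. This is an illustration only: `toy_probne0` is a hypothetical stand-in that returns fixed, made-up scores where a real run would call bic.surv, and stopping as soon as no variable is dropped is a simplifying assumption of this sketch, not a statement about the package's behavior.

```r
# Hypothetical stand-in for the probne0 values (in percent) that a model fit
# on the current window would return; here it only looks up made-up scores.
toy_probne0 <- function(var.names, scores) scores[var.names]

sliding_window_select <- function(all.vars, scores, maxNvar = 3, thresProbne0 = 1) {
  window <- head(all.vars, maxNvar)   # start with the top-ranked variables
  nextVar <- length(window) + 1       # index of the next variable to bring in
  repeat {
    p <- toy_probne0(window, scores)
    keep <- window[p >= thresProbne0] # drop variables below the threshold
    n.free <- maxNvar - length(keep)
    # Assumption: stop when nothing was dropped or the ranked list is exhausted
    if (n.free == 0 || nextVar > length(all.vars)) {
      window <- keep
      break
    }
    last <- min(nextVar + n.free - 1, length(all.vars))
    window <- c(keep, all.vars[nextVar:last])  # refill with next ranked variables
    nextVar <- last + 1
  }
  list(curr.vars = window, nextVar = nextVar)
}

all.vars <- paste0("g", 1:6)  # already sorted by univariate rank
scores <- c(g1 = 100, g2 = 0, g3 = 50, g4 = 0.5, g5 = 80, g6 = 0)
res <- sliding_window_select(all.vars, scores, maxNvar = 3, thresProbne0 = 1)
res$curr.vars
```

With these made-up scores, g2 and g4 are dropped in turn and replaced by the next ranked variables, and the loop ends once every variable in the window meets the threshold.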
On the last iteration of bic.surv, four items are returned:
curr.mat |
A vector containing the names of the variables (genes) from the final iteration of | .
stopVar |
The ending value of stopVar after all iterations. |
nextVar |
The ending value of nextVar after all iterations. |
|
An object of class
|
The BMA package is required.
Annest, A., Yeung, K.Y., Bumgarner, R.E., and Raftery, A.E. (2008). Iterative Bayesian Model Averaging for Survival Analysis. Manuscript in Progress.
Raftery, A.E. (1995). Bayesian model selection in social research (with Discussion). Sociological Methodology 1995 (Peter V. Marsden, ed.), pp. 111-196, Cambridge, Mass.: Blackwells.
Volinsky, C., Madigan, D., Raftery, A., and Kronmal, R. (1997). Bayesian Model Averaging in Proportional Hazard Models: Assessing the Risk of a Stroke. Applied Statistics 46: 433-448.
Yeung, K.Y., Bumgarner, R.E. and Raftery, A.E. (2005). Bayesian Model Averaging: Development of an improved multi-class, gene selection and classification tool for microarray data. Bioinformatics 21: 2394-2402.
iterateBMAsurv.train.wrapper, iterateBMAsurv.train.predict.assess, singleGeneCoxph, predictBicSurv, trainData, trainSurv, trainCens, testData
library(BMA)
library(iterativeBMAsurv)
data(trainData)
data(trainSurv)
data(trainCens)
data(testData)
## Training data should be pre-sorted before beginning
## Initialize the matrix for the active bic.surv window with variables 1 through maxNvar
maxNvar <- 25
curr.mat <- trainData[, 1:maxNvar]
nextVar <- maxNvar + 1
## Training phase: select relevant genes using nbest=5 for fast computation
ret.bic.surv <- iterateBMAsurv.train (x=trainData, surv.time=trainSurv, cens.vec=trainCens, curr.mat, stopVar=0, nextVar, nbest=5, maxNvar=25)
# Apply bic.surv again using selected genes
ret.bma <- bic.surv (x=ret.bic.surv$curr.mat, surv.t=trainSurv, cens=trainCens, nbest=5, maxCol=(maxNvar+1))
## Get the matrix for genes with probne0 > 0 (select columns, keep all rows)
ret.gene.mat <- ret.bic.surv$curr.mat[, ret.bma$probne0 > 0]
## Get the gene names from ret.gene.mat
selected.genes <- dimnames(ret.gene.mat)[[2]]
## Show the posterior probabilities of selected models
ret.bma$postprob
## Get the subset of test data with the genes from the last iteration of
## 'bic.surv'
curr.test.dat <- testData[, selected.genes]
## Compute the predicted risk scores for the test samples
y.pred.test <- apply (curr.test.dat, 1, predictBicSurv, postprob.vec=ret.bma$postprob, mle.mat=ret.bma$mle)
Loading required package: BMA
Loading required package: survival
Loading required package: leaps
Loading required package: robustbase
Attaching package: 'robustbase'
The following object is masked from 'package:survival':
heart
Loading required package: inline
Loading required package: rrcov
Scalable Robust Estimators with High Breakdown Point (version 1.4-4)
Loading required package: splines
17: Explored up to variable # 100
Iterate bic.surv is done!
Selected genes:
[1] "X31687" "X33840" "X31242" "X16948" "X31471" "X17154" "X28531" "X19241"
[9] "X26146" "X17804" "X27332" "X17241" "X32212" "X29911" "X33558" "X33013"
[17] "X27884" "X33706" "X16817" "X31968" "X30209" "X29650" "X25054" "X16988"
[25] "X32904"
Posterior probabilities of selected genes:
[1] 100.0 47.5 47.3 2.4 38.5 28.5 40.1 96.7 2.8 1.7 0.0 59.9
[13] 0.0 0.0 10.0 0.0 2.5 58.3 2.1 98.8 28.4 7.1 95.1 0.0
[25] 100.0
[1] 0.075782322 0.068183539 0.062240254 0.056227073 0.045761712 0.044794588
[7] 0.043328132 0.042831731 0.039567629 0.039285627 0.038997242 0.034867824
[13] 0.032225236 0.030210326 0.026904418 0.025508701 0.025052995 0.024869256
[19] 0.021711946 0.021061750 0.020689119 0.020114454 0.017345536 0.017179713
[25] 0.017104052 0.015294500 0.014059561 0.014050900 0.012658966 0.010182444
[31] 0.008768581 0.007844758 0.007014883 0.006609877 0.006555310 0.005115046
There were 50 or more warnings (use warnings() to see the first 50)
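The last step of the example passes each test sample to predictBicSurv together with the model posterior probabilities and the coefficient matrix. As a rough sketch of that kind of computation, assuming the risk score is the posterior-weighted sum of each model's linear predictor, the arithmetic looks like this. The function `risk_score` and the numbers are hypothetical and are not the package's implementation.

```r
# Hypothetical re-implementation for illustration only.
# mle.mat: one row per model, one column per variable (0 if a variable is absent)
# postprob.vec: posterior probability of each model
risk_score <- function(newdata.vec, postprob.vec, mle.mat) {
  model.scores <- as.vector(mle.mat %*% newdata.vec)  # linear predictor per model
  sum(postprob.vec * model.scores)                    # weight by posterior probability
}

postprob <- c(0.6, 0.4)
mle <- rbind(c(0.5, 0.0, -1.0),
             c(0.5, 0.2,  0.0))
x.new <- c(2, 1, 3)  # expression values of one sample for the three variables
score <- risk_score(x.new, postprob, mle)
score
```

Here the two models give linear predictors of -2 and 1.2, so the weighted score is 0.6 * (-2) + 0.4 * 1.2 = -0.72.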