
View source: R/sparsePC.spikeslab.R

Variable selection for the multiclass gene prediction problem.

```
sparsePC.spikeslab(x = NULL, y = NULL, n.rep = 10,
                   n.iter1 = 150, n.iter2 = 100, n.prcmp = 5, max.genes = 100,
                   ntree = 1000, nodesize = 1, verbose = TRUE, ...)
```

| Argument | Description |
|---|---|
| `x` | x-matrix of gene expressions. |
| `y` | Class labels. |
| `n.rep` | Number of Monte Carlo replicates. |
| `n.iter1` | Number of burn-in Gibbs sampled values (i.e., discarded values). |
| `n.iter2` | Number of Gibbs sampled values, following burn-in. |
| `n.prcmp` | Number of principal components. |
| `max.genes` | Maximum number of genes in final model. |
| `ntree` | Number of trees used by random forests. |
| `nodesize` | Nodesize of trees. |
| `verbose` | If TRUE, verbose output is sent to the terminal. |
| `...` | Further arguments passed to or from other methods. |

Multiclass prediction using a hybrid combination of spike and slab
linear regression and random forest multiclass prediction (Ishwaran
and Rao, 2009). A pseudo y-vector of response values is calculated
using each of the top `n.prcmp` principal components of the
x-gene expression matrix. The generalized elastic net obtained from
spike and slab regression is used to select genes; one regression is
fit for each of the pseudo y-response vectors. The final combined set
of genes is passed to random forests and used to construct a
multiclass forest predictor. This procedure is repeated `n.rep` times,
with each Monte Carlo replicate based on balanced cross-validation in
which two-thirds of the data are used for training and one-third for
testing.
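The pseudo-response construction described above can be sketched in base R. This is an illustration on simulated data, not the package's internal code: the centering and scaling choices, the object names (`pseudo.y`, `top.k`, `genes`), and the correlation-based selector standing in for spike and slab regression are all assumptions made for the example.

```r
# Illustrative sketch only: base R on simulated data, not the package internals.
set.seed(1)
n <- 30; p <- 200                        # hypothetical sizes: 30 samples, 200 genes
x <- matrix(rnorm(n * p), nrow = n, ncol = p,
            dimnames = list(NULL, paste0("gene", seq_len(p))))
n.prcmp <- 5                             # plays the role of the n.prcmp argument

# Principal components of the expression matrix
# (centering and scaling are assumptions of this sketch).
pc <- prcomp(x, center = TRUE, scale. = TRUE)

# Each of the first n.prcmp score columns acts as one pseudo y-response.
pseudo.y <- pc$x[, seq_len(n.prcmp), drop = FALSE]

# Stand-in for the spike and slab step (an assumption, for illustration):
# keep the top.k genes most correlated with each pseudo-response.
top.k <- 10
selected <- lapply(seq_len(n.prcmp), function(j) {
  score <- abs(cor(x, pseudo.y[, j]))[, 1]
  names(sort(score, decreasing = TRUE))[seq_len(top.k)]
})

# Pool the per-component selections into one combined gene set.
genes <- unique(unlist(selected))
```

In the actual procedure the pooled gene set would then be handed to random forests; the simple correlation ranking here only shows how one selection per pseudo-response is combined into a single set.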

Miscellanea:

Test set error is only computed when `n.rep` is larger than 1.
If `n.rep = 1`, the full data set is used without any cross-validation.
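To make the balanced two-thirds/one-third split concrete, here is a minimal base R sketch of one class-stratified train/test partition. The labels are simulated and the exact splitting scheme used inside the package is not shown in this page, so treat the rounding rule and the names `train` and `test` as assumptions.

```r
# Illustrative sketch only: one balanced (class-stratified) 2/3 vs 1/3 split.
set.seed(2)
y <- factor(rep(c("A", "B", "C"), times = c(12, 9, 6)))  # hypothetical class labels

# Within each class, draw about two-thirds of the samples for training.
train <- unlist(lapply(split(seq_along(y), y), function(idx) {
  idx[sample.int(length(idx), size = round(2 * length(idx) / 3))]
}))

# The remaining samples form the test set used to estimate the error.
test <- setdiff(seq_along(y), train)

table(y[train])   # class counts in the training part
table(y[test])    # class counts in the test part
```

Repeating such a split once per replicate is what allows a test set error to be reported; with a single run on the full data there is no held-out part to evaluate on.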

Invisibly returns the final set of selected genes as well as the
complete set of genes selected over the `n.rep` Monte Carlo
replications. The random forest classifier is also returned.

The misclassification error rate, error rate for each class, and other summary information are output to the terminal.

Hemant Ishwaran

J. Sunil Rao

Udaya B. Kogalur

Ishwaran H. and Rao J.S. (2009). Generalized ridge regression: geometry and computational solutions when p is larger than n.

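```
## Not run:
#------------------------------------------------------------
# Example 1: leukemia data
#------------------------------------------------------------
library(spikeslab)  # provides sparsePC and the leukemia data
data(leukemia, package = "spikeslab")
# First column holds the class labels; the remaining columns are genes.
sparsePC.out <- sparsePC(x = leukemia[, -1], y = leukemia[, 1], n.rep = 3)
# Extract the fitted random forest and plot its variable importance.
rf.obj <- sparsePC.out$rf.obj
randomForest::varImpPlot(rf.obj)
## End(Not run)
```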
