The CrossValidate package provides generic tools for performing
cross-validation on classification methods in the context of high-throughput
data sets such as those produced by gene expression profiling. In order to
use a classifier with this implementation of cross-validation, you must first
prepare a pair of functions (one for learning models from training data, and
one for making predictions on test data). These functions, along with any
required meta-parameters, are used to create an object of the
Modeler-class. That object is then passed to the
CrossValidate function along with the full training data set. The
full data set is then repeatedly split into its own training and test sets;
you can specify the fraction to be used for training and the number of
iterations. The result is a detailed look at the accuracy, sensitivity,
specificity, and positive and negative predictive value of the model, as
estimated by cross-validation.
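The workflow described above can be sketched in base R. This is a minimal illustration of the idea, not the package's implementation: the nearest-centroid learn/predict pair (learnCentroid, predictCentroid), the helper repeatedSplit, its argument names, and the simulated data are all hypothetical. It assumes features in rows, samples in columns, and a two-level factor whose second level is treated as the positive class.

```r
# Hypothetical learn/predict pair (nearest-centroid rule), for illustration only.
# data: features in rows, samples in columns; status: two-level factor.
learnCentroid <- function(data, status) {
  lev <- levels(status)
  centers <- sapply(lev, function(l) rowMeans(data[, status == l, drop = FALSE]))
  list(centers = centers, levels = lev)
}

predictCentroid <- function(newdata, model) {
  # Squared Euclidean distance from every sample (column) to each class centroid.
  d <- sapply(seq_along(model$levels), function(k) {
    colSums((newdata - model$centers[, k])^2)
  })
  d <- matrix(d, ncol = length(model$levels))  # stays a matrix even for one sample
  factor(model$levels[apply(d, 1, which.min)], levels = model$levels)
}

# Repeatedly split the data, train on a fraction, and tally test-set results.
# Splits are not stratified; this assumes every training split holds both classes.
repeatedSplit <- function(learnFun, predictFun, data, status, frac = 0.6, nLoop = 50) {
  n <- ncol(data)
  nTrain <- round(frac * n)
  pos <- levels(status)[2]  # second factor level is treated as "positive"
  TP <- TN <- FP <- FN <- 0
  for (i in seq_len(nLoop)) {
    train <- sample(n, nTrain)
    model <- learnFun(data[, train, drop = FALSE], status[train])
    pred  <- predictFun(data[, -train, drop = FALSE], model)
    truth <- status[-train]
    TP <- TP + sum(pred == pos & truth == pos)
    TN <- TN + sum(pred != pos & truth != pos)
    FP <- FP + sum(pred == pos & truth != pos)
    FN <- FN + sum(pred != pos & truth == pos)
  }
  c(accuracy    = (TP + TN) / (TP + TN + FP + FN),
    sensitivity = TP / (TP + FN),
    specificity = TN / (TN + FP),
    ppv         = TP / (TP + FP),
    npv         = TN / (TN + FN))
}

# Simulated data: 100 features, 40 samples, 10 features shifted in class "B".
set.seed(1)
status <- factor(rep(c("A", "B"), each = 20))
data <- matrix(rnorm(100 * 40), nrow = 100)
data[1:10, status == "B"] <- data[1:10, status == "B"] + 2
res <- repeatedSplit(learnCentroid, predictCentroid, data, status)
print(round(res, 3))
```

Pooling the counts over all iterations before computing the rates is one simple design choice; per-iteration summaries would give a more detailed view of the variability.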
Kevin R. Coombes [email protected]
Braga-Neto U, Dougherty ER.
Is cross-validation valid for small-sample microarray classification?
Bioinformatics, 2004; 20:374–380.
Jiang W, Varma S, Simon R.
Calculating confidence intervals for prediction error in microarray classification using resampling.
Stat Appl Genet Mol Biol. 2008; 7:Article8.
Fu LM, Youn ES.
Improving reliability of gene selection from microarray functional genomics data.
IEEE Trans Inf Technol Biomed. 2003; 7:191–6.
Man MZ, Dyson G, Johnson K, Liao B.
Evaluating methods for classifying expression data.
J Biopharm Stat. 2004; 14:1065–84.
Fu WJ, Carroll RJ, Wang S.
Estimating misclassification error with small samples via bootstrap cross-validation.
Bioinformatics, 2005; 21:1979–86.
Ancona N, Maglietta R, Piepoli A, D'Addabbo A, Cotugno R, Savino M,
Liuni S, Carella M, Pesole G, Perri F.
On the statistical assessment of classifiers using DNA microarray data.
BMC Bioinformatics, 2006; 7:387.
Lecocke M, Hess K.
An empirical study of univariate and genetic algorithm-based feature selection in binary classification with microarray data.
Cancer Inform, 2007; 2:313–27.
Mistakes in validating the accuracy of a prediction classifier in high-dimensional but small-sample microarray data.
Stat Methods Med Res, 2008; 17:635–42.
The Modeler-package contains numerous classification
methods that have been adapted to work within this general
cross-validation framework, including: K nearest neighbors
(learnKNN), recursive partitioning and regression trees
(learnRPART), random forests,
neural networks
(learnNNET), support vector machines
(learnSVM), compound covariate predictors
(learnCCP), and the TailRank test.