Cross validation of high-throughput prediction algorithms
The CrossValidate package provides generic tools for performing
cross-validation on classification methods in the context of high-throughput
data sets such as those produced by gene expression microarrays. In order to
use a classifier with this implementation of cross-validation, you must first
prepare a pair of functions (one for learning models from training data, and
one for making predictions on test data). These functions, along with any
required meta-parameters, are used to create an object of the
Modeler-class. That object is then passed to the
CrossValidate function along with the full training data set. The
full data set is then repeatedly split into its own training and test sets;
you can specify the fraction to be used for training and the number of
iterations. The result is a detailed look at the accuracy, sensitivity,
specificity, and positive and negative predictive value of the model, as
estimated by cross-validation.
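The procedure described above can be sketched in a few lines. The package itself is written in R; the following Python code is only an illustration of the idea, not the package's API. Every name in it (`learn_centroid`, `predict_centroid`, `confusion_metrics`, `cross_validate`, and the parameters `frac`, `n_loop`, `positive`) is hypothetical, and a simple nearest-centroid classifier stands in for whatever learn/predict pair you would actually supply.

```python
import random

# Hypothetical learn/predict pair (nearest-centroid classifier); these names
# are illustrative and are not part of the CrossValidate package.
def learn_centroid(train_x, train_y):
    """Learn one mean vector (centroid) per class from the training samples."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict_centroid(model, test_x):
    """Assign each test sample to the class whose centroid is closest."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(model, key=lambda label: dist2(model[label], x)) for x in test_x]

def confusion_metrics(truth, predicted, positive):
    """Compute the five summary statistics from true and predicted labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

def cross_validate(learn, predict, data, status, frac, n_loop, positive, seed=0):
    """Repeatedly split the data, train on a fraction `frac`, test on the rest."""
    rng = random.Random(seed)
    n = len(data)
    n_train = int(round(frac * n))
    results = []
    for _ in range(n_loop):
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        model = learn([data[i] for i in train], [status[i] for i in train])
        predicted = predict(model, [data[i] for i in test])
        truth = [status[i] for i in test]
        results.append(confusion_metrics(truth, predicted, positive))
    return results

# Simulated data: 40 samples, 5 features, two well-separated classes.
data_rng = random.Random(1)
data = [[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(20)]
data += [[data_rng.gauss(10, 1) for _ in range(5)] for _ in range(20)]
status = ["A"] * 20 + ["B"] * 20

results = cross_validate(learn_centroid, predict_centroid, data, status,
                         frac=0.5, n_loop=10, positive="B")
mean_accuracy = sum(r["accuracy"] for r in results) / len(results)
print("mean accuracy over", len(results), "splits:", mean_accuracy)
```

The key design point mirrored here is that the cross-validation loop knows nothing about the classifier: it only calls the supplied learning function on the training split and the supplied prediction function on the test split, so any method that can be expressed as such a pair can be evaluated the same way.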
Kevin R. Coombes email@example.com
The following classification methods have been adapted to work within
the general cross-validation framework: K nearest neighbors
(learnKNN), recursive partitioning and regression trees