Estimates optimal number of biomarkers at a given error tolerance level for various classification rules
Description
Using an interactive control panel (see rpanel) and a 3D real-time rendering system (rgl), this package provides a user-friendly GUI for estimating the minimum number of biomarkers (variables) needed to achieve a given level of accuracy for two-group classification problems based on microarray data.
Usage
optimiseBiomarker(error,
                  errorTol = 0.05,
                  method = "RF", nTrain = 100,
                  sdB = 1.5,
                  sdW = 1,
                  foldAvg = 2.88,
                  nRep = 3)

Arguments
error 
The database of classification errors. See

errorTol 
Error tolerance limit. 
method 
Classification method. Can be one of 
nTrain 
Training set size, i.e., the total number of biological samples in group 1 and group 2. 
sdB 
Biological variation (σ_b) of data in log (base 2) scale. 
sdW 
Experimental (technical) variation (σ_e) of data in log (base 2) scale. 
foldAvg 
Average fold change of the biomarkers. 
nRep 
Number of technical replications. 
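The arguments sdB, sdW and nRep describe two sources of variation and the number of technical replicates. This page does not state how they are combined, so the following is only a sketch under a common assumption: biological and technical errors are independent on the log (base 2) scale, and the technical replicates of each sample are averaged. The helper effectiveSd is hypothetical and is not part of the package.

```r
# Hypothetical helper, NOT part of optBiomarker.
# Assumption: independent biological and technical errors on the log2 scale,
# with the nRep technical replicates of each sample averaged, so that the
# technical variance is divided by nRep.
effectiveSd <- function(sdB, sdW, nRep) {
  sqrt(sdB^2 + sdW^2 / nRep)
}

# Default values taken from the Usage section
effectiveSd(sdB = 1.5, sdW = 1, nRep = 3)
```

Under this assumption, increasing nRep reduces only the technical component, so the result can never fall below sdB.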
Details
The function optimiseBiomarker is a user-friendly GUI for interrogating the database of leave-one-out cross-validation errors, errorDbase, to estimate the optimal number of biomarkers for microarray-based classification. The database is built from simulated data using the classificationError function. The function simData is used to simulate microarray data for various combinations of factors such as the number of biomarkers, training set size, biological variation, experimental variation, fold change, replication, and correlation.
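The idea of reading off a minimum number of biomarkers at a given error tolerance can be illustrated with plain R. This is a minimal sketch, not the package's implementation: the helper minBiomarkers, the column names nBiom and error, and the error values are all invented for illustration, and the real structure of errorDbase may differ.

```r
# Hypothetical helper, NOT the package's implementation: return the smallest
# number of biomarkers whose classification error is within the tolerance.
# errTable is assumed to have columns nBiom (integer) and error (numeric).
minBiomarkers <- function(errTable, errorTol = 0.05) {
  ok <- errTable$nBiom[errTable$error <= errorTol]
  if (length(ok) == 0) return(NA_integer_)  # tolerance is never reached
  min(ok)
}

# Made-up error values, for illustration only
toy <- data.frame(
  nBiom = c(1L, 5L, 10L, 20L, 50L),
  error = c(0.31, 0.12, 0.06, 0.04, 0.03)
)

minBiomarkers(toy, errorTol = 0.05)  # 20: the first size with error <= 0.05
```

Returning NA when no row meets the tolerance keeps the "not achievable" case distinct from a valid answer.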
Author(s)
Mizanur Khondoker, Till Bachmann, Peter Ghazal
Maintainer: Mizanur Khondoker <mizanur.khondoker@gmail.com>
References
Khondoker, M. R., Bachmann, T. T., Mewissen, M., Dickinson, P. et al. (2010). Multifactorial analysis of class prediction error: estimating optimal number of biomarkers for various classification rules. Journal of Bioinformatics and Computational Biology, 8, 945–965.
Breiman, L. (2001). Random Forests. Machine Learning 45(1), 5–32.
Chang, Chih-Chung and Lin, Chih-Jen. LIBSVM: a library for Support Vector Machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm.
Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press.
Efron, B. and Tibshirani, R. (1997). Improvements on Cross-Validation: The .632+ Bootstrap Estimator. Journal of the American Statistical Association 92(438), 548–560.
Bowman, A., Crawford, E., Alexander, G. and Bowman, R. W. (2007). rpanel: Simple interactive controls for R functions using the tcltk package. Journal of Statistical Software 17(9).
See Also
simData
classificationError
Examples
if (interactive()) {
  data(errorDbase)
  optimiseBiomarker(error = errorDbase)
}
