geneReduction: Variable selection and cluster functions
In MCRestimate: Misclassification error estimation with cross-validation


Functions for variable selection and clustering. They are mainly used by the function MCRestimate.

identity(sample.gene.matrix, classfactor, ...)

varSel.highest.t.stat(sample.gene.matrix, classfactor, theParameter=NULL, var.numbers=500, ...)

varSel.highest.var(sample.gene.matrix, classfactor, theParameter=NULL, var.numbers=2000, ...)

varSel.AUC(sample.gene.matrix, classfactor, theParameter=NULL, var.numbers=200, ...)

cluster.kmeans.mean(sample.gene.matrix, classfactor, theParameter=NULL, number.clusters=500, ...)

varSel.removeManyNA(sample.gene.matrix, classfactor, theParameter=NULL, NAthreshold=0.25, ...)

varSel.impute.NA(sample.gene.matrix, classfactor, theParameter=NULL, ...)

`sample.gene.matrix`	a matrix in which the rows correspond to genes and the columns correspond to samples
`classfactor`	a factor containing the values that should be predicted
`theParameter`	Function-dependent parameter. For `cluster.kmeans.mean`, either NULL or the output of the function `kmeans`: if NULL, `kmeans` is used to cluster the genes; otherwise the existing clusters are reused. In both cases the metagene intensities are calculated afterwards. For the other functions, either NULL or a logical vector that indicates for every gene whether it should be left out of further analysis.
`number.clusters`	parameter which specifies the number of clusters
`var.numbers`	the number of variables to select; used by the variable selection methods
`NAthreshold`	numeric threshold (default 0.25); if the share of NA values in a variable is higher than this threshold, the variable is removed
`...`	Further parameters
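
To make the role of `NAthreshold` concrete, here is a minimal base R sketch of this kind of filtering. It does not call the package: the helper name `drop_many_na` is hypothetical, and treating the threshold as a fraction of samples is an assumption based on the default value of 0.25.

```r
# Hypothetical helper, not part of MCRestimate: remove genes (rows) whose
# fraction of NA values exceeds `threshold`.
drop_many_na <- function(x, threshold = 0.25) {
  na_fraction <- rowMeans(is.na(x))      # share of missing values per gene
  left_out <- na_fraction > threshold    # TRUE = gene is removed (assumed convention)
  list(matrix = x[!left_out, , drop = FALSE], parameter = left_out)
}

# 3 genes (rows) x 4 samples (columns)
x <- matrix(c( 1, NA, 3, 4,
              NA, NA, 3, 4,
               1,  2, 3, 4), nrow = 3, byrow = TRUE)
res <- drop_many_na(x)
res$parameter
```

In this sketch the second gene has half of its values missing and is dropped, while the first gene sits exactly at the threshold and is kept because the comparison is strict.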

cluster.kmeans.mean performs a k-means clustering with the number of clusters specified by 'number.clusters' and takes the mean of each cluster. varSel.highest.var selects the variables with the highest variance; their number is specified by 'var.numbers'. varSel.AUC chooses the most discriminating variables according to the AUC criterion (the ROC package is required).
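
The clustering step described above can be sketched in base R as follows. This is only an illustration of "cluster the genes, then average each cluster per sample" using `stats::kmeans` and `rowsum`; it is not the package's implementation, and the simulated data and variable names are assumptions.

```r
set.seed(1)
# 30 genes (rows) x 4 samples (columns); three groups of genes at different levels
x <- rbind(matrix(rnorm(40, 2, 0.5), ncol = 4),
           matrix(rnorm(40, 4, 0.5), ncol = 4),
           matrix(rnorm(40, 7, 0.5), ncol = 4))

# Cluster the genes; each row of x is assigned to one of 3 clusters
km <- kmeans(x, centers = 3)

# Metagene intensity: the mean over all genes in a cluster, per sample.
# rowsum() adds up the rows of each cluster; dividing by the cluster sizes
# turns the sums into means.
metagenes <- rowsum(x, km$cluster) / as.vector(table(km$cluster))
dim(metagenes)
```

Computing the means from the cluster assignment, rather than reading `km$centers`, mirrors the case where an existing clustering is applied to a different matrix.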

Every function returns a list with two components:

`matrix`	the result matrix of the variable reduction or the clustering
`parameter`	the parameter needed to reproduce the result: for a gene reduction, a vector that indicates for every gene whether it is left out of further analysis; for the clustering algorithm, the output of the function kmeans.
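
The following base R sketch illustrates how such a `matrix`/`parameter` pair can be used: a selection is computed once and the stored vector is then applied to a second matrix. The helper `select_highest_var` and its arguments are hypothetical and only mimic the shape of the return value; the convention that TRUE marks a left-out gene is an assumption.

```r
# Hypothetical helper, not part of MCRestimate: keep the `n` rows with the
# highest variance, or reuse a previously computed logical vector.
select_highest_var <- function(x, parameter = NULL, n = 2) {
  if (is.null(parameter)) {
    v <- apply(x, 1, var)                                   # variance per gene
    keep <- order(v, decreasing = TRUE)[seq_len(min(n, nrow(x)))]
    parameter <- !(seq_len(nrow(x)) %in% keep)              # TRUE = left out
  }
  list(matrix = x[!parameter, , drop = FALSE], parameter = parameter)
}

# 4 genes (rows) x 4 samples (columns)
train <- rbind(c(1, 1, 1, 1),
               c(0, 5, 0, 5),
               c(2, 3, 2, 3),
               c(0, 9, 0, 9))
# New data with the same genes in the same order
test <- matrix(1:8, nrow = 4)

fit    <- select_highest_var(train, n = 2)                   # compute the selection
reused <- select_highest_var(test, parameter = fit$parameter) # apply it unchanged
```

Reusing the stored vector means the second matrix is reduced to exactly the same genes, without recomputing variances on the new data.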

Markus Ruschhaupt <m.ruschhaupt@dkfz.de>

MCRestimate

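# 30 rows (genes) x 2 columns (samples), with values drawn around the levels 2, 4 and 7
m <- matrix(c(rnorm(10,2,0.5), rnorm(10,4,0.5), rnorm(10,7,0.5),
              rnorm(10,2,0.5), rnorm(10,4,0.5), rnorm(10,2,0.5)), ncol=2)
# cluster the genes into 3 groups and take the mean of each cluster
cluster.kmeans.mean(m, number.clusters=3)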