Between-groups sum-of-squares to within-groups sum-of-squares ratio


Description

This is a univariate technique for selecting relevant genes when classifying microarray data. The ratio is computed separately for each gene across the samples. A large between-groups to within-groups sum-of-squares ratio indicates a potentially relevant gene.
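
For reference, the ratio can be written out explicitly. The notation below follows Dudoit et al. (2002) rather than anything defined on this page: x_{ij} is the expression of gene j in sample i, y_i is the class of sample i, \bar{x}_{kj} is the mean of gene j over the samples in class k, and \bar{x}_{\cdot j} is the mean of gene j over all samples.

$$
\frac{\mathrm{BSS}(j)}{\mathrm{WSS}(j)} =
\frac{\sum_{i} \sum_{k} I(y_i = k)\,\left(\bar{x}_{kj} - \bar{x}_{\cdot j}\right)^2}
     {\sum_{i} \sum_{k} I(y_i = k)\,\left(x_{ij} - \bar{x}_{kj}\right)^2}
$$

Here I(·) is the indicator function, so the numerator measures how far the class means lie from the overall mean, and the denominator measures the spread of the samples around their own class mean.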

Usage

BssWssFast (X, givenClassArr, numClass = 2)

Arguments

X

data matrix where columns are variables and rows are observations. In the case of gene expression data, the columns (variables) represent genes, while the rows (observations) represent samples or experiments.

givenClassArr

class vector for the observations (samples or experiments). Class numbers are assumed to start from 0, and the length of this class vector should equal the number of rows in X. For 2-class data, the class vector is expected to consist of zeros and ones.

numClass

number of classes. The default is 2.

Details

This function is called by iterateBMAglm.2class.

Value

A list of 2 elements is returned:

x

A vector containing the BSS/WSS ratios in descending order.

ix

A vector containing the indices corresponding to the sorted ratios.
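
To make the computation and the shape of the return value concrete, here is a minimal sketch in plain R. It is not the package's implementation: `bssWssSketch` is a hypothetical helper written for illustration, the toy matrix is made up, and the sketch assumes every class from 0 to numClass - 1 has at least one sample.

```r
# Hypothetical helper for illustration only; not the code behind BssWssFast.
bssWssSketch <- function(X, givenClassArr, numClass = 2) {
  X <- as.matrix(X)
  overallMean <- colMeans(X)          # per-gene mean over all samples
  bss <- numeric(ncol(X))
  wss <- numeric(ncol(X))
  for (k in 0:(numClass - 1)) {       # classes are coded 0, 1, ..., numClass - 1
    rows <- which(givenClassArr == k)
    Xk <- X[rows, , drop = FALSE]
    classMean <- colMeans(Xk)         # per-gene mean within class k
    bss <- bss + length(rows) * (classMean - overallMean)^2
    wss <- wss + colSums(sweep(Xk, 2, classMean)^2)
  }
  ratio <- bss / wss
  ord <- order(ratio, decreasing = TRUE)
  # Same layout as described above: sorted ratios and their column indices
  list(x = ratio[ord], ix = ord)
}

# Toy data: 6 samples (rows), 3 genes (columns); only gene 2 separates the classes.
X <- cbind(c(1, 2, 3, 1, 2, 3),
           c(0, 1, 0, 10, 11, 10),
           c(5, 1, 3, 4, 2, 6))
cls <- c(0, 0, 0, 1, 1, 1)
res <- bssWssSketch(X, cls, numClass = 2)
res$ix   # column indices of the genes, most relevant first
res$x    # the corresponding BSS/WSS ratios, in descending order
```

In this toy example the second column ranks first because its class means differ widely while the values within each class stay close together.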

References

Dudoit, S., Fridlyand, J. and Speed, T.P. (2002) Comparison of discrimination methods for the classification of tumors using gene expression data. Journal of the American Statistical Association 97: 77-87.

Yeung, K.Y., Bumgarner, R.E. and Raftery, A.E. (2005) Bayesian Model Averaging: Development of an improved multi-class, gene selection and classification tool for microarray data. Bioinformatics 21: 2394-2402.

See Also

iterateBMAglm.train, trainData, trainClass

Examples

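library(Biobase)       # provides exprs()
library(iterativeBMA)  # package assumed to provide BssWssFast and the example data

data(trainData)
data(trainClass)

# Rows of X must be samples, so transpose the genes-by-samples expression matrix
ret.bsswss <- BssWssFast(X = t(exprs(trainData)), givenClassArr = trainClass, numClass = 2)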