Function to Classify Microarray Data using Penalized Discriminant Methods

Description

This function is used to classify microarray data. Since the underlying model fit is based on penalized discriminant methods, there is no need for a pre-filtering step to reduce the number of genes.

Usage

pdmClass(formula, method = c("pls", "pcr", "ridge"), keep.fitted = TRUE, ...)

Arguments

formula

A symbolic description of the model to be fit. Details given below.

method

One of "pls", "pcr", "ridge", corresponding to partial least squares, principal components regression and ridge regression.

keep.fitted

Logical. Should the fitted values be kept? The default is TRUE, because the plot and predict functions need them.

...

Additional parameters to pass to method or fda. See fda for more information.
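
The default for method is the whole vector of choices rather than a single string. A common R idiom for such arguments is match.arg, which returns the first choice when the argument is not supplied. The sketch below uses a hypothetical stand-in function, choose_method, and assumes (this page does not confirm it) that pdmClass resolves method in a similar way.

```r
# Hypothetical stand-in with the same 'method' default as pdmClass.
# Assumption: the real function resolves the argument with match.arg().
choose_method <- function(method = c("pls", "pcr", "ridge")) {
  match.arg(method)
}

default_method <- choose_method()         # nothing supplied: first choice is used
ridge_method   <- choose_method("ridge")  # explicit choice
partial_method <- choose_method("pc")     # unambiguous partial match

print(c(default_method, ridge_method, partial_method))
```

Under this idiom, a value that is not one of the listed choices raises an error instead of being silently accepted.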

Details

The formula interface follows the usual R convention, namely Y ~ X, where Y is a numeric vector of class assignments and X is a matrix or data.frame containing the gene expression values. Unlike the usual microarray layout (genes in rows, samples in columns), X must have samples in rows and genes in columns, so most calls will look something like Y ~ t(X).
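
The orientation requirement can be checked with base R before fitting anything. The following is a minimal sketch on simulated data; the object names (expr, y, x, mf) and the matrix dimensions are illustrative assumptions, and model.frame is used only to show how a formula pairs the response with a matrix predictor.

```r
set.seed(1)

# Simulated expression matrix in the usual microarray layout:
# 20 genes in rows, 6 samples in columns (illustrative data only).
expr <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6,
               dimnames = list(paste0("gene", 1:20), paste0("sample", 1:6)))

# One class label per sample.
y <- factor(rep(c("A", "B"), each = 3))

# Transpose so that rows are samples and columns are genes.
x <- t(expr)

# The number of rows of the predictor must match the length of the response.
stopifnot(nrow(x) == length(y))

# A formula of the form y ~ x keeps x as a single matrix-valued term.
mf <- model.frame(y ~ x)
print(dim(mf$x))
```

If the transpose is forgotten, model.frame stops with a variable-length error, which is a quick way to catch the mistake.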

Value

An object of class "fda". Use predict to extract discriminant variables, posterior probabilities or predicted class memberships. Other extractor functions are coef and plot.

The object has the following components:

percent.explained

the percent between-group variance explained by each dimension (relative to the total explained).

values

optimal scaling regression sum-of-squares for each dimension (see reference). The usual discriminant analysis eigenvalues are given by values / (1-values), which are used to define percent.explained.

means

class means in the discriminant space. These are also scaled versions of the final thetas or class scores, and can be used in a subsequent call to fda (this only makes sense if some columns of theta are omitted; see the references).

theta.mod

(internal) a class scoring matrix which allows predict to work properly.

dimension

dimension of discriminant space.

prior

class proportions for the training data.

fit

fit object returned by method.

call

the call that created this object (allowing it to be update-able)

confusion

A 'confusion' matrix that shows how well the classifier works using the training data.
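
Two of the components listed above can be illustrated with base R alone. The sketch below first converts made-up values into discriminant eigenvalues with values / (1 - values) and expresses each as a share of the total, then builds a confusion matrix from made-up labels. All numbers and object names are illustrative assumptions; in particular, the per-dimension share is one reading of percent.explained, and the fitted object may report it differently (for example cumulatively).

```r
# --- values -> eigenvalues -> share of explained between-group variance ---
values <- c(0.8, 0.5)                 # made-up optimal scoring sums of squares
eigen_vals <- values / (1 - values)   # usual discriminant analysis eigenvalues
pct_share <- 100 * eigen_vals / sum(eigen_vals)
print(round(pct_share, 1))

# --- confusion matrix on training labels ---
truth <- factor(c("A", "A", "A", "B", "B", "B"))
pred  <- factor(c("A", "A", "B", "B", "B", "B"), levels = levels(truth))

# Rows are predicted classes, columns are true classes.
conf <- table(predicted = pred, true = truth)
print(conf)

# Training error rate: off-diagonal counts over all counts.
error_rate <- 1 - sum(diag(conf)) / sum(conf)
print(error_rate)
```

Because the confusion matrix is computed on the training data, its error rate tends to be optimistic; held-out samples or cross-validation give a fairer estimate.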

Author(s)

James W. MacDonald and Debashis Ghosh, based on fda in the mda package of Trevor Hastie and Robert Tibshirani, which was ported to R by Kurt Hornik, Brian D. Ripley, and Friedrich Leisch.

References

http://www.sph.umich.edu/~ghoshd/COMPBIO/POPTSCORE

"Flexible Discriminant Analysis by Optimal Scoring" by Hastie, Tibshirani and Buja, 1994, JASA, 1255-1270.

"Penalized Discriminant Analysis" by Hastie, Buja and Tibshirani, Annals of Statistics, 1995 (in press).

Examples

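library(pdmclass)   # provides pdmClass()
library(Biobase)    # provides pData() and exprs()
library(fibroEset)  # example data set
data(fibroEset)
y <- as.factor(pData(fibroEset)[,2])  # class labels, one per sample
x <- t(exprs(fibroEset))              # transpose: rows are samples, columns are genes
pdmClass(y ~ x)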