flexdaCMA: Flexible Discriminant Analysis
In CMA: Synthesis of microarray-based classification


This method is experimental.

It is easy to show that, after appropriate scaling of the predictor matrix X, Fisher's Linear Discriminant Analysis is equivalent to discriminant analysis in the space of the fitted values from the linear regression of the nlearn x K indicator matrix of the class labels on X, where nlearn is the number of observations in the learning set and K is the number of classes. This gives rise to 'nonlinear discriminant analysis' methods that expand X in a suitable, more flexible basis. To avoid overfitting, penalization is used. In the implemented version, the linear model is replaced by a generalized additive model, fitted with the package mgcv.

For S4 method information, see flexdaCMA-methods.
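The equivalence can be checked numerically in the two-class case, where the slope vector from regressing a class indicator on X is proportional to Fisher's discriminant direction. The following is a minimal base R sketch, not code from CMA; the use of the built-in `iris` data (restricted to two species) and all variable names are assumptions made for illustration.

```r
# Two-class illustration on the built-in iris data (an assumption; not the Golub data)
dat <- iris[iris$Species != "setosa", ]
X <- as.matrix(dat[, 1:4])
y <- droplevels(dat$Species)

# n x K indicator matrix of the class labels (here K = 2)
Y <- model.matrix(~ y - 1)

# Linear regression of the indicator matrix on X, with an intercept column
fit <- lm.fit(cbind(1, X), Y)
beta <- fit$coefficients[-1, 2]   # slopes for the second class indicator

# Fisher's LDA direction: inverse pooled within-class covariance times mean difference
mu <- rowsum(X, y) / as.vector(table(y))   # K x p matrix of class means
Xc <- X - mu[as.integer(y), ]              # centre each observation at its class mean
Sw <- crossprod(Xc) / (nrow(X) - 2)        # pooled within-class covariance
lda_dir <- solve(Sw, mu[2, ] - mu[1, ])

# The two directions agree up to a single scalar factor
ratio <- beta / lda_dir
print(ratio)
```

All entries of `ratio` should coincide up to numerical error. Flexible discriminant analysis keeps this regression view but replaces the linear fit with a more flexible one.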

flexdaCMA(X, y, f, learnind, comp = 1, plot = FALSE, models = FALSE, ...)

`X`	Gene expression data. Can be one of the following: A `matrix`. Rows correspond to observations, columns to variables. A `data.frame`, when `f` is not missing (see below). An object of class `ExpressionSet`.
`y`	Class labels. Can be one of the following: A `numeric` vector. A `factor`. A `character` if `X` is an `ExpressionSet` that specifies the phenotype variable. `missing`, if `X` is a `data.frame` and a proper formula `f` is provided. WARNING: The class labels will be re-coded to range from `0` to `K-1`, where `K` is the total number of different classes in the learning set.
`f`	A two-sided formula, if `X` is a `data.frame`. The left part corresponds to the class labels, the right part to the variables.
`learnind`	An index vector specifying the observations that belong to the learning set. May be `missing`; in that case, the learning set consists of all observations and predictions are made on the learning set.
`comp`	Number of discriminant coordinates (projections) to compute. Default is 1; must be smaller than or equal to `K-1`, where `K` is the number of classes.
`plot`	Should the projections onto the space spanned by the optimal projection directions be plotted? Default is `FALSE`.
`models`	A logical value indicating whether the model object should be returned. Default is `FALSE`.
`...`	Further arguments passed to the function `gam` from the package `mgcv`.
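The re-coding of `y` and the bound on `comp` described above can be illustrated in base R. This is a sketch only: the label vector is hypothetical, and mapping classes to codes in the order of the sorted factor levels is an assumption about how the re-coding is done, not a statement about CMA's internal code.

```r
# Hypothetical class labels; a numeric vector or a factor would work the same way
y <- c("ALL", "AML", "ALL", "ALL", "AML")

y_fac <- factor(y)
K <- nlevels(y_fac)                  # number of different classes

# Re-code the labels to 0, ..., K-1 (assumed to follow the sorted factor levels)
y_coded <- as.integer(y_fac) - 1L
print(y_coded)

# comp may not exceed K - 1, so only comp = 1 is admissible with two classes
comp <- 1
stopifnot(comp <= K - 1)
```

Keeping a lookup such as `levels(y_fac)` makes it possible to translate predicted codes back to the original labels.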

An object of class cloutput.

Extensive variable selection usually has to be performed before flexdaCMA can be applied in the p > n setting. Recall that the basis expansion enlarges the original predictor dimension even further; the method should therefore be applied with very few variables only.
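One simple way to reduce the number of variables beforehand is a univariate filter computed on the learning set only. The sketch below uses base R and simulated data; the data, the t-statistic ranking and the choice of five variables are all assumptions for illustration and not a recommendation made by the package.

```r
set.seed(1)
n <- 30
p <- 200

# Simulated p > n data with two classes (hypothetical, for illustration only)
X <- matrix(rnorm(n * p), n, p)
y <- rep(0:1, each = n / 2)
X[y == 1, 1:3] <- X[y == 1, 1:3] + 2   # make the first three variables informative

# Learning set: the filter must not look at the remaining observations
learnind <- sample(n, size = 20)
Xl <- X[learnind, ]
yl <- y[learnind]

# Welch t-statistic of every variable, computed on the learning set
tstat <- apply(Xl, 2, function(x) t.test(x[yl == 0], x[yl == 1])$statistic)

# Keep the five variables with the largest absolute statistic
keep <- order(abs(tstat), decreasing = TRUE)[1:5]
X_small <- X[, keep, drop = FALSE]
dim(X_small)
```

The reduced matrix `X_small` could then be passed as `X`, together with the same `learnind`, so that selection and classification use the same split.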

Martin Slawski ms@cs.uni-sb.de

Anne-Laure Boulesteix boulesteix@ibe.med.uni-muenchen.de

Ripley, B.D. (1996). Pattern Recognition and Neural Networks. Cambridge University Press.

compBoostCMA, dldaCMA, ElasticNetCMA, fdaCMA, gbmCMA, knnCMA, ldaCMA, LassoCMA, nnetCMA, pknnCMA, plrCMA, pls_ldaCMA, pls_lrCMA, pls_rfCMA, pnnCMA, qdaCMA, rfCMA, scdaCMA, shrinkldaCMA, svmCMA

### load the CMA package and the Golub AML/ALL data
library(CMA)
data(golub)
### extract class labels
golubY <- golub[,1]
### extract gene expression from first 5 genes
golubX <- as.matrix(golub[,2:6])
### select learningset
ratio <- 2/3
set.seed(111)
learnind <- sample(length(golubY), size=floor(ratio*length(golubY)))
### run flexible Discriminant Analysis
result <- flexdaCMA(X=golubX, y=golubY, learnind=learnind, comp = 1)
### show results
show(result)
ftable(result)
plot(result)