Robust Linear Discriminant Analysis by Projection Pursuit
Description
Performs robust linear discriminant analysis by the projection-pursuit approach
proposed by Pires and Branco (2010) and returns the results as an object of
class LdaPP (aka constructor).
Usage
Arguments
formula 
a formula of the form 
data 
an optional data frame (or similar: see

subset 
an optional vector used to select rows (observations) of the
data matrix 
na.action 
a function which indicates what should happen
when the data contain 
x 
a matrix or data frame containing the explanatory variables (training set). 
grouping 
grouping variable: a factor specifying the class for each observation. 
prior 
prior probabilities; defaults to the class proportions of the training set. 
tol 
tolerance 
method 
the pair of location and scale estimators (T,S) to use: one of huber, mad, sest or class (see Details). 
optim 
whether to refine the approximation using the Nelder and Mead simplex method
(see function optim). 
trace 
whether to print intermediate results. Default is 
... 
arguments passed to or from other methods. 
Details
Currently the algorithm is implemented only for binary classification, and in the following it is assumed that only two groups are present.
The PP algorithm searches for low-dimensional projections of higher-dimensional
data where a projection index is maximized. Similar to Fisher's original proposal,
the squared standardized distance between the observations in the two groups is maximized.
Instead of the sample univariate mean and standard deviation (T,S), robust
alternatives are used. These are selected through the argument method
and can be one of:
huber
the pair (T,S) are the robust M-estimates of location and scale
mad
the pair (T,S) are the median and the median absolute deviation
sest
the pair (T,S) are the robust S-estimates of location and scale
class
the pair (T,S) are the mean and the standard deviation.
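To make the projection index concrete, here is a minimal sketch in base R of a squared standardized distance between two projected groups, for the mad and class choices of (T,S). The function name pp_index is hypothetical and not part of the package, and the way the two group scales are pooled (weights proportional to group size) is an assumption made for illustration; the package may combine them differently.

```r
# Hypothetical helper, not part of rrcov: projection index for one direction.
pp_index <- function(a, x1, x2, method = c("mad", "class")) {
    method <- match.arg(method)
    a <- a / sqrt(sum(a^2))              # work with a unit vector
    z1 <- drop(as.matrix(x1) %*% a)      # group 1 projected onto a
    z2 <- drop(as.matrix(x2) %*% a)      # group 2 projected onto a
    loc <- if (method == "mad") median else mean   # T
    scl <- if (method == "mad") mad else sd        # S
    n1 <- length(z1)
    n2 <- length(z2)
    # ASSUMPTION: squared scales pooled with weights proportional to group size
    s2 <- (n1 * scl(z1)^2 + n2 * scl(z2)^2) / (n1 + n2)
    (loc(z1) - loc(z2))^2 / s2
}

# Simulated data: the groups differ only along the first coordinate.
set.seed(1)
g1 <- matrix(rnorm(100), ncol = 2)
g2 <- matrix(rnorm(100), ncol = 2) + matrix(c(5, 0), 50, 2, byrow = TRUE)

pp_index(c(1, 0), g1, g2, "mad")   # direction that separates the groups
pp_index(c(0, 1), g1, g2, "mad")   # direction that does not
```

Because the direction is normalized inside the function, the index depends only on the orientation of a, not on its length.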
The first approximation A1 to the solution is obtained by investigating
a finite number of candidate directions, the unit vectors defined
by all pairs of points such that one belongs to the first group
and the other to the second group. The solution found is stored in the slots
raw.ldf and raw.ldfconst.
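The candidate search for A1 can be sketched as a double loop over between-group pairs of points. This is an illustration only, not the package's implementation: a1_search and idx are hypothetical names, the index uses the median and MAD, and the pooling of the two scales is an assumption.

```r
# Hypothetical index (median/MAD); the pooling of the scales is an assumption.
idx <- function(a, x1, x2) {
    a <- a / sqrt(sum(a^2))
    z1 <- drop(x1 %*% a)
    z2 <- drop(x2 %*% a)
    s2 <- (length(z1) * mad(z1)^2 + length(z2) * mad(z2)^2) /
        (length(z1) + length(z2))
    (median(z1) - median(z2))^2 / s2
}

# Evaluate the index on every unit vector joining a point of group 1
# to a point of group 2 and keep the best one.
a1_search <- function(x1, x2, index) {
    best <- list(value = -Inf, direction = NULL)
    for (i in seq_len(nrow(x1))) {
        for (j in seq_len(nrow(x2))) {
            d <- x1[i, ] - x2[j, ]
            len <- sqrt(sum(d^2))
            if (len == 0) next           # coincident points define no direction
            val <- index(d / len, x1, x2)
            if (is.finite(val) && val > best$value)
                best <- list(value = val, direction = d / len)
        }
    }
    best
}

set.seed(2)
g1 <- matrix(rnorm(40), ncol = 2)
g2 <- matrix(rnorm(40), ncol = 2) + matrix(c(4, 1), 20, 2, byrow = TRUE)
a1 <- a1_search(g1, g2, idx)
a1$direction
a1$value
```

With n1 and n2 observations in the two groups, the search evaluates the index n1 * n2 times, so its cost grows with the product of the group sizes.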
The second approximation A2 (optional) is obtained by
a numerical optimization algorithm using A1 as the initial solution.
The Nelder and Mead method implemented in the function optim
is applied.
Whether this refinement is used is controlled by the argument optim.
If optim=TRUE, the result of the optimization is stored in the slots
ldf and ldfconst. Otherwise these slots are set equal to
raw.ldf and raw.ldfconst.
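The refinement step and the fallback to the raw solution can be sketched with stats::optim as follows. The names idx, refine_direction, raw, final and use_optim are hypothetical, the median/MAD index with size-weighted pooling is an assumption, and the starting direction here is simply one between-group pair standing in for A1.

```r
# Hypothetical index (median/MAD); the pooling of the scales is an assumption.
idx <- function(a, x1, x2) {
    a <- a / sqrt(sum(a^2))
    z1 <- drop(x1 %*% a)
    z2 <- drop(x2 %*% a)
    s2 <- (length(z1) * mad(z1)^2 + length(z2) * mad(z2)^2) /
        (length(z1) + length(z2))
    (median(z1) - median(z2))^2 / s2
}

# optim() minimizes, so the negated index is passed to Nelder-Mead.
refine_direction <- function(a0, x1, x2, index) {
    res <- optim(par = a0, fn = function(a) -index(a, x1, x2),
                 method = "Nelder-Mead")
    list(direction = res$par / sqrt(sum(res$par^2)), value = -res$value)
}

set.seed(3)
g1 <- matrix(rnorm(60), ncol = 2)
g2 <- matrix(rnorm(60), ncol = 2) + matrix(c(4, 1), 30, 2, byrow = TRUE)

# Stand-in for A1: the unit vector through one between-group pair.
a0 <- g2[1, ] - g1[1, ]
raw <- list(direction = a0 / sqrt(sum(a0^2)), value = idx(a0, g1, g2))

# Mirrors the role of the optim argument: refine, or keep the raw solution.
use_optim <- TRUE
final <- if (use_optim) refine_direction(raw$direction, g1, g2, idx) else raw
c(raw = raw$value, final = final$value)
```

Nelder-Mead keeps the starting point as one vertex of its initial simplex and returns the best vertex found, so the refined index is never smaller than the one at the start.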
Value
Returns an S4 object of class LdaPP-class.
Warning
Still an experimental version! Only binary classification is supported.
Author(s)
Valentin Todorov valentin.todorov@chello.at and Ana Pires apires@math.ist.utl.pt
References
Pires, A. M. and Branco, J. A. (2010). Projection-pursuit approach to robust linear discriminant analysis. Journal of Multivariate Analysis, Academic Press, Inc., 101, 2464–2485.
See Also
Linda, LdaClassic
Examples
##
## Function to plot a LDA separation line
##
lda.line <- function(lda, ...)
{
    ## differences of the two groups' discriminant coefficients and constants
    ab <- lda@ldf[1,] - lda@ldf[2,]
    cc <- lda@ldfconst[1] - lda@ldfconst[2]
    ## boundary ab[1]*x + ab[2]*y + cc = 0, rewritten as y = a + b*x
    abline(a=-cc/ab[2], b=-ab[1]/ab[2],...)
}
data(pottery)
x <- pottery[,c("MG", "CA")]
grp <- pottery$origin
col <- c(3,4)
gcol <- ifelse(grp == "Attic", col[1], col[2])
gpch <- ifelse(grp == "Attic", 16, 1)
##
## Reproduce Fig. 2 from Pires and Branco (2010)
##
plot(CA~MG, data=pottery, col=gcol, pch=gpch)
ppc <- LdaPP(x, grp, method="class", optim=TRUE)
lda.line(ppc, col=1, lwd=2, lty=1)
pph <- LdaPP(x, grp, method="huber", optim=TRUE)
lda.line(pph, col=3, lty=3)
pps <- LdaPP(x, grp, method="sest", optim=TRUE)
lda.line(pps, col=4, lty=4)
ppm <- LdaPP(x, grp, method="mad", optim=TRUE)
lda.line(ppm, col=5, lty=5)
rlda <- Linda(x, grp, method="mcd")
lda.line(rlda, col=6, lty=1)
fsa <- Linda(x, grp, method="fsa")
lda.line(fsa, col=8, lty=6)
## Use the formula interface:
##
LdaPP(origin~MG+CA, data=pottery) ## use the same two predictors
LdaPP(origin~., data=pottery) ## use all predictor variables
##
## Predict method
data(pottery)
fit <- LdaPP(origin~., data = pottery)
predict(fit)
