sda: Sparse discriminant analysis


Description

Performs sparse linear discriminant analysis, using an alternating minimization algorithm to minimize the SDA criterion.

Usage

sda(x, ...)

## Default S3 method:
sda(x, y, lambda = 1e-6, stop = -p, maxIte = 100,
    Q = K-1, trace = FALSE, tol = 1e-6, ...)

Arguments

x

A matrix of the training data with observations down the rows and variables in the columns.

y

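A matrix of dummy (indicator) variables representing the groups, used to initialize the algorithm. A factor of class labels is also accepted, as shown in the Examples.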
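As a minimal sketch of what such a dummy matrix looks like, the base R snippet below builds one from a factor of class labels. The helper name `make_dummy` and the labels "A", "B", "C" are illustrative assumptions and are not part of sparseLDA.

```r
# Hypothetical helper (not part of sparseLDA): build an n x K indicator
# matrix with one column per class from a vector of class labels.
make_dummy <- function(labels) {
  labels <- as.factor(labels)
  Y <- matrix(0,
              nrow = length(labels),
              ncol = nlevels(labels),
              dimnames = list(NULL, levels(labels)))
  # Put a 1 in the column matching each observation's class.
  Y[cbind(seq_along(labels), as.integer(labels))] <- 1
  Y
}

# Made-up labels: three classes with two observations each.
labels <- factor(rep(c("A", "B", "C"), each = 2))
Y <- make_dummy(labels)
```

Each row of `Y` contains a single 1, marking the class of that observation.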

lambda

The weight on the L2-norm for elastic net regression. Default: 1e-6.

stop

If stop is negative, its absolute value is the desired number of variables. If stop is positive, it is an upper bound on the L1-norm of the b coefficients. There is a one-to-one correspondence between stop and t. The default is -p (minus the number of variables).
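The sign convention can be sketched with a small helper that only restates the rule above in code. The name `describe_stop` is hypothetical and not part of sparseLDA, and the treatment of zero as an error is an assumption, since the documentation does not cover that case.

```r
# Hypothetical helper (not part of sparseLDA): report how a value of the
# `stop` argument would be interpreted under the documented sign rule.
describe_stop <- function(stop_value) {
  if (stop_value < 0) {
    # Negative: the absolute value is the desired number of variables.
    sprintf("select %d variable(s)", as.integer(abs(stop_value)))
  } else if (stop_value > 0) {
    # Positive: upper bound on the L1-norm of the coefficients.
    sprintf("bound the L1-norm of the coefficients by %g", stop_value)
  } else {
    # Assumption: zero is not described in the documentation.
    stop("stop = 0 is not covered by the documented rule")
  }
}

describe_stop(-1)
describe_stop(0.5)
```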

maxIte

Maximum number of iterations. Default: 100.

Q

Number of components. Maximum and default is K-1 (the number of classes less one).

trace

If TRUE, prints out its progress. Default: FALSE.

tol

Tolerance for the stopping criterion (change in RSS). Default is 1e-6.

...

Additional arguments.

Details

The function finds sparse directions for linear classification.

Value

Returns a list with the following components:

beta

The loadings of the sparse discriminative directions.

theta

The optimal scores.

rss

A vector of the Residual Sum of Squares at each iteration.

varNames

Names of the included variables.
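A minimal sketch of how the returned components might be inspected follows. The helper `summarize_fit` is hypothetical, and `toy_fit` is a made-up stand-in that only mimics the documented component names; it is not the output of a real call to sda.

```r
# Hypothetical helper (not part of sparseLDA): summarise a fitted object
# using only the documented components beta, rss and varNames.
summarize_fit <- function(fit) {
  beta <- as.matrix(fit$beta)
  list(
    nonzero_per_direction = colSums(beta != 0),  # sparsity of each direction
    n_rss = length(fit$rss),                     # number of recorded RSS values
    variables = fit$varNames                     # names of included variables
  )
}

# Toy stand-in with made-up values, for illustration only.
toy_fit <- list(
  beta = matrix(c(0.8, 0, 0, -0.5), nrow = 2),
  theta = diag(2),
  rss = c(3.2, 2.9, 2.9),
  varNames = c("v1", "v2")
)

s <- summarize_fit(toy_fit)
```

With a real fit, the same helper could be applied to the object returned by sda, for instance `summarize_fit(out)` after running the Examples.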

Author(s)

Line Clemmensen, modified by Trevor Hastie

References

Clemmensen, L., Hastie, T., Witten, D. and Ersboell, K. (2011) "Sparse discriminant analysis", Technometrics, To appear.

See Also

normalize, normalizetest, smda

Examples

## load package and data
library(sparseLDA)
data(penicilliumYES)

X <- penicilliumYES$X
Y <- penicilliumYES$Y
colnames(Y) <- c("P. Melanoconidium",
                 "P. Polonicum",
                 "P. Venetum")

## test samples
Iout <- c(3, 6, 9, 12)
Iout <- c(Iout, Iout + 12, Iout + 24)

## training data
Xtr <- X[-Iout, ]
k <- 3
n <- dim(Xtr)[1]

## Normalize data
Xc <- normalize(Xtr)
Xn <- Xc$Xc
p <- dim(Xn)[2]

## Perform SDA with one non-zero loading for each discriminative
## direction with Y as matrix input
out <- sda(Xn, Y,
           lambda = 1e-6,
           stop = -1,
           maxIte = 25,
           trace = TRUE)

## predict training samples
train <- predict(out, Xn)

## testing
Xtst <- X[Iout, ]
Xtst <- normalizetest(Xtst, Xc)

test <- predict(out, Xtst)
print(test$class)

## Factor Y as input
Yvec <- factor(rep(colnames(Y), each = 8))
out2 <- sda(Xn, Yvec,
            lambda = 1e-6,
            stop = -1,
            maxIte = 25,
            trace = TRUE)

Example output

ite:  1  ridge cost:  15.81839  |b|_1:  1.060389 
ite:  2  ridge cost:  21.11437  |b|_1:  0.3416186 
ite:  3  ridge cost:  21.11437  |b|_1:  0.3416186 
ite:  1  ridge cost:  15.69303  |b|_1:  0.973995 
ite:  2  ridge cost:  15.69303  |b|_1:  0.973995 
final update, total ridge cost:  36.80739  |b|_1:  1.315614 
 [1] P. Melanoconidium P. Melanoconidium P. Melanoconidium P. Melanoconidium
 [5] P. Polonicum      P. Polonicum      P. Polonicum      P. Polonicum     
 [9] P. Venetum        P. Venetum        P. Venetum        P. Venetum       
Levels: P. Melanoconidium P. Polonicum P. Venetum
ite:  1  ridge cost:  9.097637  |b|_1:  1.967469 
ite:  2  ridge cost:  5.618445  |b|_1:  2.642181 
ite:  3  ridge cost:  5.618445  |b|_1:  2.642181 
ite:  1  ridge cost:  19.8993  |b|_1:  0.4774033 
ite:  2  ridge cost:  19.8993  |b|_1:  0.4774033 
final update, total ridge cost:  25.51774  |b|_1:  3.119584 

sparseLDA documentation built on May 2, 2019, 7:23 a.m.
