cv.vda.le: Choose the optimal pair of lambdas, lambda_1 and lambda_2

Description

Use k-fold cross-validation to choose the optimal values of the tuning parameters λ_1 and λ_2 for Multicategory Vertex Discriminant Analysis (vda.le).

Usage

cv.vda.le(x, y, kfold, lam.vec.1, lam.vec.2)

Arguments

x

n x p matrix or data frame containing the cases for each feature. The rows correspond to cases and the columns to the features. Intercept column is not included in this.

y

n x 1 vector representing the outcome variable. Each element denotes which of the k classes the corresponding case belongs to.

kfold

The number of folds to use in k-fold cross-validation for each pair of λ_1 and λ_2.

lam.vec.1

A vector containing the candidate values of λ_1 (the lasso penalty) over which VDA will be run. To use only Euclidean penalization, set lam.vec.1=0.

lam.vec.2

A vector containing the candidate values of λ_2 (the group Euclidean penalty) over which VDA will be run. vda.le is relatively insensitive to lambda values, so a vector with only a few values is recommended. The default value is 0.01. To use only Lasso penalization, set lam.vec.2=0.
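As a rough illustration of how the two lambda vectors define the search, the base R sketch below builds a small grid and counts the model fits it implies. The particular ranges, the log spacing, and the names grid, lasso_only and euclidean_only are assumptions chosen for illustration; they are not defaults or recommendations from the VDA package.

```r
# Log-spaced candidates for lambda_1 (range chosen only for illustration).
lam.vec.1 <- exp(seq(log(1e-4), log(1e-2), length.out = 5))

# A short vector of candidates for lambda_2.
lam.vec.2 <- c(0.01, 0.05)

# Every (lambda_1, lambda_2) pair that a full grid search would visit.
grid <- expand.grid(lam1 = lam.vec.1, lam2 = lam.vec.2)

# Each pair is fitted once per fold, so the cost grows as kfold * pairs.
kfold <- 3
n_fits <- kfold * nrow(grid)

# Single-penalty searches: hold the other vector at 0.
lasso_only     <- expand.grid(lam1 = lam.vec.1, lam2 = 0)
euclidean_only <- expand.grid(lam1 = 0, lam2 = lam.vec.2)
```

Because the number of fits is the product of both vector lengths and kfold, keeping one vector short is a cheap way to limit run time.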

Details

For each pair (λ_1, λ_2), k-fold cross-validation is conducted and the average testing error over the k folds is recorded. λ_1 is the parameter for the lasso penalization, while λ_2 is the parameter for the group Euclidean penalization. To use only Lasso penalization, set lam.vec.2=0. To use only Euclidean penalization, set lam.vec.1=0. The optimal pair is the pair of values that gives the smallest average testing error over the cross-validation.

To view a plot of the cross validation errors across lambda values, see plot.cv.vda.le.
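The grid search described in this section can be sketched in base R as follows. This is a minimal sketch, not the package's implementation: fit_toy and predict_toy are hypothetical stand-ins (a shrunken nearest-centroid classifier) used only so the loop runs without VDA installed, cv_grid is an assumed name, and the built-in iris data replaces the zoo data.

```r
# Hypothetical stand-in classifier (NOT vda.le): nearest shrunken centroid.
# lam1 soft-thresholds the class offsets (lasso-like); lam2 scales them
# towards zero (ridge-like).
fit_toy <- function(x, y, lam1, lam2) {
  overall <- colMeans(x)
  classes <- sort(unique(y))
  offsets <- do.call(rbind, lapply(classes, function(cl) {
    d <- colMeans(x[y == cl, , drop = FALSE]) - overall
    d <- sign(d) * pmax(abs(d) - lam1, 0)
    d / (1 + lam2)
  }))
  list(overall = overall, offsets = offsets, classes = classes)
}

predict_toy <- function(fit, newx) {
  centroids <- sweep(fit$offsets, 2, fit$overall, "+")
  # Squared distance from every case to every class centroid.
  d2 <- sapply(seq_len(nrow(centroids)), function(k) {
    rowSums(sweep(newx, 2, centroids[k, ], "-")^2)
  })
  d2 <- matrix(d2, nrow = nrow(newx))
  fit$classes[max.col(-d2, ties.method = "first")]
}

# For each (lambda_1, lambda_2) pair, average the test error over the folds.
cv_grid <- function(x, y, kfold, lam.vec.1, lam.vec.2) {
  fold <- sample(rep(seq_len(kfold), length.out = nrow(x)))
  err <- matrix(NA_real_, length(lam.vec.1), length(lam.vec.2))
  for (i in seq_along(lam.vec.1)) {
    for (j in seq_along(lam.vec.2)) {
      fold_err <- sapply(seq_len(kfold), function(f) {
        test <- fold == f
        fit <- fit_toy(x[!test, , drop = FALSE], y[!test],
                       lam.vec.1[i], lam.vec.2[j])
        mean(predict_toy(fit, x[test, , drop = FALSE]) != y[test])
      })
      err[i, j] <- mean(fold_err)
    }
  }
  # First pair attaining the smallest average error (row = lambda_1).
  best <- which(err == min(err), arr.ind = TRUE)[1, ]
  list(error.cv = err,
       lam.opt = unname(c(lam.vec.1[best[1]], lam.vec.2[best[2]])))
}

set.seed(1)
x <- as.matrix(iris[, 1:4])
y <- as.character(iris$Species)
lam.vec.1 <- c(0, 0.1, 0.5)
lam.vec.2 <- c(0, 1)
cv <- cv_grid(x, y, kfold = 3, lam.vec.1 = lam.vec.1, lam.vec.2 = lam.vec.2)
cv$lam.opt
```

When several pairs tie for the smallest error, this sketch keeps the first one in column-major order; how cv.vda.le breaks ties is not stated here.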

Value

kfold

The number of folds used in k-fold cross validation

lam.vec.1

The user supplied vector of λ_1 values

lam.vec.2

The user supplied vector of λ_2 values

error.cv

A matrix of average testing errors. The rows correspond to λ_1 values and the columns correspond to λ_2 values.

lam.opt

The pair of λ_1 and λ_2 values that return the lowest testing error across k-fold cross validation.
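To inspect a result of this shape, it can help to flatten the error matrix into one row per pair. The sketch below uses a mock list with the same component names as the returned value; cv_mock and all numbers in it are made up for illustration and do not come from a real cv.vda.le run.

```r
# Illustrative stand-in for a cv.vda.le result; the numbers are made up.
lam.vec.1 <- c(0.001, 0.01, 0.1)
lam.vec.2 <- c(0.01, 0.05)
cv_mock <- list(
  lam.vec.1 = lam.vec.1,
  lam.vec.2 = lam.vec.2,
  error.cv  = matrix(c(0.12, 0.08, 0.20,   # column 1: first lambda_2 value
                       0.10, 0.09, 0.25),  # column 2: second lambda_2 value
                     nrow = 3)
)

# Label rows with lambda_1 values and columns with lambda_2 values.
dimnames(cv_mock$error.cv) <- list(lam1 = as.character(lam.vec.1),
                                   lam2 = as.character(lam.vec.2))

# Long format: one row per (lambda_1, lambda_2) pair, smallest error first.
long <- as.data.frame(as.table(cv_mock$error.cv), responseName = "error")
long <- long[order(long$error), ]
head(long, 3)
```

The first row of long then names the pair with the lowest average testing error, which corresponds to lam.opt.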

Author(s)

Edward Grant, Xia Li, Kenneth Lange, Tong Tong Wu

Maintainer: Edward Grant edward.m.grant@gmail.com

References

Wu, T.T. and Lange, K. (2010) Multicategory Vertex Discriminant Analysis for High-Dimensional Data. Annals of Applied Statistics, Volume 4, No 4, 1698-1721.

Lange, K. and Wu, T.T. (2008) An MM Algorithm for Multicategory Vertex Discriminant Analysis. Journal of Computational and Graphical Statistics, Volume 17, No 3, 527-544.

See Also

vda.le.

plot.cv.vda.le.

Examples

### load zoo data
### column 1 is name, columns 2:17 are features, column 18 is class
library(VDA)
data(zoo)

### feature matrix 
x <- zoo[,2:17]

### class vector
y <- zoo[,18]

### lambda vectors
lam1 <- exp(1:5)/10000
lam2 <- (1:5)/100

### Search for the best pair, using both lasso and Euclidean penalizations
cv <- cv.vda.le(x, y, kfold = 3, lam.vec.1 = lam1, lam.vec.2 = lam2)
plot(cv)
outLE <- vda.le(x, y, cv$lam.opt[1], cv$lam.opt[2])

### To search for the best pair, using ONLY lasso penalization, set lam.vec.2 = 0 (remove comments)
#cvlasso <- cv.vda.le(x, y, kfold = 3, lam.vec.1 = exp(1:10)/1000, lam.vec.2 = 0)
#plot(cvlasso)
#cvlasso$lam.opt

### To search for the best pair, using ONLY Euclidean penalization, set lam.vec.1 = 0 (remove comments)
#cveuclidean <- cv.vda.le(x, y, kfold = 3, lam.vec.1 = 0, lam.vec.2 = exp(1:10)/1000)
#plot(cveuclidean)
#cveuclidean$lam.opt

### Predict five cases based on vda.le (Lasso and Euclidean penalties)
fivecases <- matrix(0,5,16)
fivecases[1,] <- c(1,0,0,1,0,0,0,1,1,1,0,0,4,0,1,0)
fivecases[2,] <- c(1,0,0,1,0,0,1,1,1,1,0,0,4,1,0,1)
fivecases[3,] <- c(0,1,1,0,1,0,0,0,1,1,0,0,2,1,1,0)
fivecases[4,] <- c(0,0,1,0,0,1,1,1,1,0,0,1,0,1,0,0)
fivecases[5,] <- c(0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0)
predict(outLE, fivecases)

VDA documentation built on May 29, 2017, 6:32 p.m.