Description Usage Arguments Details Value Author(s) References See Also Examples
Use k-fold validation to choose the optmial values for the tuning parameters λ_1 and λ_2 to be used in Multicategory Vertex Discriminant Analysis (vda.le
).
1 | cv.vda.le(x, y, kfold, lam.vec.1, lam.vec.2)
|
x |
n x p matrix or data frame containing the cases for each feature. The rows correspond to cases and the columns to the features. Intercept column is not included in this. |
y |
n x 1 vector representing the outcome variable. Each element denotes which one of the k classes that case belongs to. |
kfold |
The number of folds to use for the k-fold validation for each set of λ_1 and λ_2 |
lam.vec.1 |
A vector containing the set of all values of λ_1, from which VDA will be conducted. To use only Euclidean penalization, set |
lam.vec.2 |
A vector containing the set of all values of λ_2, from which VDA will be conducted. vda.le is relatively insensitive to lambda values, so it is recommended that a vector of few values is used. The default value is 0.01. To use only Lasso penalization, set |
For each pair of (λ_1,λ_2), k-fold cross-validation will be conducted and the corresponding average testing error over the k folds will be recorded. λ_1 represents the parameter for the lasso penalization, while λ_2 represents the parameter for the group euclidean penalization. To use only Lasso penalization, set lam.vec.2
=0. To use only Euclidean penalization, set lam.vec.1
=0. The optimal pair is considered the pair of values that give the smallest testing error over the cross validation.
To view a plot of the cross validation errors across lambda values, see plot.cv.vda.le
.
kfold |
The number of folds used in k-fold cross validation |
lam.vec.1 |
The user supplied vector of λ_1 values |
lam.vec.2 |
The user supplied vector of λ_2 values |
error.cv |
A matrix of average testing errors. The rows correspond to λ_1 values and the columns correspond to λ_2 values. |
lam.opt |
The pair of λ_1 and λ_2 values that return the lowest testing error across k-fold cross validation. |
Edward Grant, Xia Li, Kenneth Lange, Tong Tong Wu
Maintainer: Edward Grant edward.m.grant@gmail.com
Wu, T.T. and Lange, K. (2010) Multicategory Vertex Discriminant Analysis for High-Dimensional Data. Annals of Applied Statistics, Volume 4, No 4, 1698-1721.
Lange, K. and Wu, T.T. (2008) An MM Algorithm for Multicategory Vertex Discriminant Analysis. Journal of Computational and Graphical Statistics, Volume 17, No 3, 527-544.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 | ### load zoo data
### column 1 is name, columns 2:17 are features, column 18 is class
data(zoo)
### feature matrix
x <- zoo[,2:17]
### class vector
y <- zoo[,18]
### lambda vector
lam1 <- (1:5)/100
lam2 <- (1:5)/100
### Searching for the best pair, using both lasso and euclidean penalizations
cv <- cv.vda.le(x, y, kfold = 3, lam.vec.1 = exp(1:5)/10000, lam.vec.2 = (1:5)/100)
plot(cv)
outLE <- vda.le(x,y,cv$lam.opt[1],cv$lam.opt[2])
### To search for the best pair, using ONLY lasso penalization, set lambda2=0 (remove comments)
#cvlasso <- cv.vda.le(x, y, kfold = 3, lam.vec.1 = exp(1:10)/1000, lam.vec.2 = 0)
#plot(cvlasso)
#cvlasso$lam.opt
### To search for the best pair, using ONLY euclidean penalization, set lambda1=0 (remove comments)
#cveuclidian <- cv.vda.le(x, y, kfold = 3, lam.vec.1 = 0, lam.vec.2 = exp(1:10)/1000)
#plot(cveuclidian)
#cveuclidian$lam.opt
### Predict five cases based on vda.le (Lasso and Euclidean penalties)
fivecases <- matrix(0,5,16)
fivecases[1,] <- c(1,0,0,1,0,0,0,1,1,1,0,0,4,0,1,0)
fivecases[2,] <- c(1,0,0,1,0,0,1,1,1,1,0,0,4,1,0,1)
fivecases[3,] <- c(0,1,1,0,1,0,0,0,1,1,0,0,2,1,1,0)
fivecases[4,] <- c(0,0,1,0,0,1,1,1,1,0,0,1,0,1,0,0)
fivecases[5,] <- c(0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0)
predict(outLE, fivecases)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.