PenalizedLDA.cv: Perform cross-validation for penalized linear discriminant...


Description

Performs cross-validation for the PenalizedLDA function.

Usage

PenalizedLDA.cv(x, y, lambdas = NULL, K = NULL, nfold = 6, folds = NULL,
    type = "standard", chrom = NULL, lambda2 = NULL)

Arguments

x

An n x p data matrix; n is the number of observations and p is the number of features.

y

An n-vector containing class labels, represented as 1, 2, ..., nclasses.

lambdas

A vector of lambda values to be considered.

K

The number of discriminant vectors to be used. If K is not specified, then cross-validation will be performed in order to choose the number of discriminant vectors to use.

nfold

Number of cross-validation folds.

folds

Optional. A list containing the observations that should be used as the test set in each cross-validation fold.

type

Either "standard" or "ordered". The former results in lasso penalties, and the latter results in fused lasso penalties. Use "ordered" if the features are ordered and it makes sense for the discriminant vector(s) to preserve that ordering.

chrom

Only applies to type="ordered", and should be used only if the p features correspond to chromosomal locations. In this case, supply a numeric vector of length p indicating which "chromosome" each feature belongs to. The purpose is to avoid imposing smoothness between chromosomes.

lambda2

If type is "ordered", enter the value of lambda2 to be used. Note that cross-validation is performed over lambda (and possibly over K) but not over lambda2.
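
The folds and chrom arguments are ordinary R objects, so they can be built with base R before calling the function. The following is a minimal sketch, not taken from the package: the sizes n, p and nfold, and the layout of four "chromosomes" with 25 consecutive features each, are assumptions chosen for illustration. Note that this simple random split does not balance classes across folds.

```r
set.seed(1)
n <- 20       # number of observations (assumed for illustration)
p <- 100      # number of features (assumed for illustration)
nfold <- 5    # number of folds (assumed for illustration)

# Shuffle the row indices and deal them into nfold groups. Each list
# element holds the observations used as the test set in one fold.
folds <- unname(split(sample(seq_len(n)), rep(seq_len(nfold), length.out = n)))

# Hypothetical layout: 4 "chromosomes" of 25 consecutive features each.
chrom <- rep(1:4, each = 25)

# These could then be passed along, for example:
# PenalizedLDA.cv(x, y, folds = folds, type = "ordered",
#     chrom = chrom, lambda2 = 0.3)
```

Passing the same folds to repeated calls keeps the cross-validation splits fixed, which makes error rates comparable across different settings of type or lambda2.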

Value

errs

The mean cross-validation error rates obtained. Either a vector of length equal to length(lambdas) or a length(lambdas)x(length(unique(y))-1) matrix. The former will occur if K is specified and the latter will occur otherwise, in which case cross-validation occurred over K as well as over lambda.

nnonzero

A vector or matrix of the same dimension as "errs". Entries indicate the number of nonzero features involved in the corresponding classifier.

bestK

Value of K (the number of discriminant vectors) that minimizes the cross-validation error.

bestlambda

The value in "lambdas" that minimizes the cross-validation error.

bestlambda.1se

Given that K equals bestK, this is the largest value of lambda such that the corresponding error is within 1 standard error of the minimum. This is the "one standard error" rule for selecting the tuning parameter.

lambdas

Values of lambda considered.

Ks

Values of K considered. Only returned if K=NULL was passed.

folds

Folds used in cross-validation.
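
To make the "one standard error" rule behind bestlambda.1se concrete, here is a small base-R sketch of the general rule for a fixed K. The per-fold error rates are made up for illustration, and the way the standard error is computed here (standard deviation across folds divided by the square root of the number of folds) is an assumption; the package's internal computation may differ.

```r
# Hypothetical per-fold error rates: rows are folds, columns are lambdas.
lambdas <- c(1e-4, 1e-3, 1e-2, 0.1, 1, 10)
fold_errs <- rbind(
  c(0.30, 0.25, 0.20, 0.20, 0.25, 0.60),
  c(0.35, 0.30, 0.15, 0.20, 0.30, 0.65),
  c(0.25, 0.25, 0.20, 0.25, 0.20, 0.55),
  c(0.30, 0.20, 0.25, 0.20, 0.25, 0.60)
)

# Mean cross-validation error and its standard error for each lambda.
errs <- colMeans(fold_errs)
se <- apply(fold_errs, 2, sd) / sqrt(nrow(fold_errs))

# The lambda with the smallest mean error.
best <- which.min(errs)
bestlambda <- lambdas[best]

# The largest lambda whose mean error is within one standard error
# of the minimum.
threshold <- errs[best] + se[best]
bestlambda_1se <- max(lambdas[errs <= threshold])
```

With these made-up numbers the minimum error occurs at lambda = 0.01, and the rule selects lambda = 0.1. A larger lambda means a heavier penalty, so the rule trades a small increase in error for a sparser classifier.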

Author(s)

Daniela M. Witten

References

D Witten and R Tibshirani (2011) Penalized classification using Fisher's linear discriminant. To appear in JRSSB.

Examples

library(penalizedLDA)

# Generate data #
set.seed(1)
n <- 20 # number of training obs
m <- 40 # number of test obs
p <- 100 # number of features
x <- matrix(rnorm(n*p), ncol=p)
xte <- matrix(rnorm(m*p), ncol=p)
y <- c(rep(1,5),rep(2,5),rep(3,6), rep(4,4))
yte <- rep(1:4, each=10) 
x[y==1,1:10] <- x[y==1,1:10] + 2
x[y==2,11:20] <- x[y==2,11:20] - 2
x[y==3,21:30] <- x[y==3,21:30] - 2.5
xte[yte==1,1:10] <- xte[yte==1,1:10] + 2
xte[yte==2,11:20] <- xte[yte==2,11:20] - 2
xte[yte==3,21:30] <- xte[yte==3,21:30] - 2.5


# Perform cross-validation #
# Use type="ordered" -- that is, we are assuming that the features have
# some sort of spatial structure
cv.out <- PenalizedLDA.cv(x, y, type="ordered",
    lambdas=c(1e-4,1e-3,1e-2,.1,1,10), lambda2=.3)
print(cv.out)
plot(cv.out)
# Perform penalized LDA #
out <- PenalizedLDA(x, y, xte=xte, type="ordered", lambda=cv.out$bestlambda,
    K=cv.out$bestK, lambda2=.3)
print(out)
plot(out)
print(table(out$ypred[,out$K],yte))

# Now repeat penalized LDA computations but this time use
# type="standard"  - i.e. don't exploit spatial structure
# Perform cross-validation #
cv.out <- PenalizedLDA.cv(x, y, lambdas=c(1e-4,1e-3,1e-2,.1,1,10))
print(cv.out)
plot(cv.out)
# Perform penalized LDA #
out <- PenalizedLDA(x,y,xte=xte,lambda=cv.out$bestlambda,K=cv.out$bestK)
print(out)
plot(out)
print(table(out$ypred[,out$K],yte))

penalizedLDA documentation built on May 2, 2019, 8:36 a.m.