cv.sparseSVM: Cross validation for sparseSVM
In sparseSVM: Solution Paths of Sparse High-Dimensional Support Vector Machine with Lasso or Elastic-Net Regularization

Description

Perform k-fold cross validation for sparse linear SVM regularized by lasso or elastic-net over a sequence of lambda values and find an optimal lambda.

Usage

cv.sparseSVM(X, y, ..., ncores = 1, eval.metric = c("me"),
             nfolds = 10, fold.id, seed, trace = FALSE)

Arguments

`X`	Input matrix.
`y`	Response vector.
`...`	Additional arguments to `sparseSVM`.
`ncores`	`cv.sparseSVM` can be run in parallel across a cluster using the `parallel` package. If `ncores > 1`, a cluster is created to run `cv.sparseSVM` in parallel. The code is run in series if `ncores = 1` (the default). An error occurs if `ncores` is larger than the total number of available cores.
`eval.metric`	The metric used to choose the optimal `lambda`. The current version supports only `"me"` (misclassification error).
`nfolds`	The number of cross-validation folds. Default is 10.
`seed`	Seed for the random number generator, set to obtain reproducible results.
`fold.id`	Which fold each observation belongs to. By default the observations are randomly assigned by `cv.sparseSVM`.
`trace`	If set to `TRUE`, `cv.sparseSVM` reports its progress by announcing the beginning of each CV fold. Default is `FALSE`. No trace output is produced when running in parallel, even if `trace = TRUE`.
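
Because `ncores` must not exceed the number of available cores, it can help to cap the value before calling `cv.sparseSVM`. The following is a minimal sketch using `detectCores()` from the `parallel` package; the variable names `available` and `requested` are illustrative only, and this is not necessarily the check that `cv.sparseSVM` performs internally.

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to 1.
available <- detectCores()
if (is.na(available)) available <- 1L

# Hypothetical request: use 2 workers when possible, never more than available.
requested <- 2L
ncores <- max(1L, min(requested, available))
```

The resulting value can then be passed as `ncores = ncores`.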

Details

The function randomly partitions the data into `nfolds` folds. It calls `sparseSVM` `nfolds + 1` times: once on the whole data to obtain the lambda sequence, and then once per fold, fitting with that fold left out for validation. The cross-validation error is the average of the validation errors over the `nfolds` fits.

Note that by default the cross-validation fold assignments are balanced across the two classes, so that each fold has the same class proportion (or as close to the same proportion as can be achieved when the cases do not divide evenly).
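
To illustrate what a class-balanced fold assignment looks like, here is a minimal base-R sketch. The helper `balanced_folds` is hypothetical and is not part of sparseSVM; the package's own assignment may differ in detail. A vector built this way has the form expected by the `fold.id` argument.

```r
# Hypothetical helper (not from sparseSVM): spread each class evenly over folds.
balanced_folds <- function(y, nfolds, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  fold.id <- integer(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    # Cycle through 1..nfolds so fold sizes within a class differ by at most 1.
    labels <- rep_len(seq_len(nfolds), length(idx))
    # Shuffle the labels within the class.
    fold.id[idx] <- labels[sample.int(length(labels))]
  }
  fold.id
}

# Toy labels: 12 observations in class 1 and 8 in class -1.
y <- c(rep(1, 12), rep(-1, 8))
fold.id <- balanced_folds(y, nfolds = 4, seed = 1)
table(fold.id, y)
```

With these toy labels, every fold receives three observations from class 1 and two from class -1.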

Value

The function returns an object of S3 class "cv.sparseSVM", which is a list containing:

`cve`	The validation error for each value of `lambda`, averaged across the cross-validation folds.
`cvse`	The estimated standard error associated with each value of `cve`.
`lambda`	The values of lambda used in the cross-validation fits.
`fit`	The fitted `sparseSVM` object for the whole data.
`min`	The index of `lambda` corresponding to `lambda.min`.
`lambda.min`	The value of `lambda` with the minimum cross-validation error in terms of `eval.metric`.
`eval.metric`	The metric used in selecting optimal `lambda`.
`fold.id`	The fold to which each observation was assigned, as described under Arguments.
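
The relationship between `cve`, `cvse`, `min`, and `lambda.min` can be sketched in base R. The error values below are made up for illustration, and computing the standard error across folds is one common convention assumed here; the package may estimate `cvse` differently.

```r
# Made-up per-fold misclassification errors: rows = folds, columns = lambda values.
lambda <- c(0.5, 0.1, 0.02)
fold_err <- rbind(c(0.40, 0.20, 0.30),
                  c(0.50, 0.10, 0.20),
                  c(0.45, 0.15, 0.25))

# Average validation error for each lambda.
cve <- colMeans(fold_err)

# Assumed convention: standard error of the fold errors for each lambda.
cvse <- apply(fold_err, 2, sd) / sqrt(nrow(fold_err))

# Index and value of the lambda with the smallest cross-validation error.
min_idx <- which.min(cve)
lambda.min <- lambda[min_idx]
```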

Author(s)

Congrui Yi and Yaohui Zeng
Maintainer: Congrui Yi <eric.ycr@gmail.com>

Examples

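library(sparseSVM)

# Simulate 1000 observations with 100 features; only the first 10 are informative.
X = matrix(rnorm(1000*100), 1000, 100)
b = 3
w = 5*rnorm(10)
eps = rnorm(1000)
y = sign(b + drop(X[,1:10] %*% w + eps))

# Same seed for both runs, so the parallel (2 cores) and serial fits should match.
cv.fit1 <- cv.sparseSVM(X, y, nfolds = 5, ncores = 2, seed = 1234)
cv.fit2 <- cv.sparseSVM(X, y, nfolds = 5, seed = 1234)
stopifnot(isTRUE(all.equal(cv.fit1, cv.fit2)))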

sparseSVM documentation built on Sept. 27, 2024, 1:06 a.m.