crossval | R Documentation |
Function that computes K-fold (double) cross-validated error of a quadrupen fit. If no lambda2 is provided, simple cross-validation on the lambda1 parameter is performed. If a vector lambda2 is passed as an argument, double cross-validation is performed.
crossval(x, y,
         penalty  = c("elastic.net", "bounded.reg"),
         K        = 10,
         folds    = split(sample(1:nrow(x)), rep(1:K, length = nrow(x))),
         lambda2  = 0.01,
         verbose  = TRUE,
         mc.cores = 2,
         ...)
x |
matrix of features, possibly sparsely encoded (experimental). Do NOT include intercept. |
y |
response vector. |
penalty |
a string for the fitting procedure used for cross-validation. Either "elastic.net" or "bounded.reg". |
K |
integer indicating the number of folds. Default is 10. |
folds |
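list of vectors of observation indices describing the folds used for the cross-validation. By default, K folds are sampled at random, as in the usage above. |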
lambda2 |
tunes the l2-penalty (ridge-like) of the fit. If none is provided, the default scalar value of the corresponding fitting method is used and a simple CV is performed. If a vector of values is given, double cross-validation is performed (both on lambda1 and lambda2). |
verbose |
logical; indicates if the progression (the current lambda2) should be displayed. Default is TRUE. |
mc.cores |
the number of cores to use. The default uses 2 cores. |
... |
additional parameters to overwrite the defaults of the fitting procedure identified by the penalty argument. |
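The default value of folds shown in the usage can be built by hand, which makes it possible to fix the random seed and pass the same folds to several calls. The following is a minimal sketch on toy data: the sizes n, p, K and the seed are illustrative assumptions, and the final call is guarded so that it only runs when quadrupen is installed.

```r
## Illustrative sizes and seed (assumptions, not taken from the package)
set.seed(42)
n <- 50; p <- 8; K <- 5
x <- matrix(rnorm(n * p), n, p)
y <- as.numeric(x[, 1] - x[, 2] + rnorm(n))

## Same construction as the default value of 'folds' in the usage
folds <- split(sample(1:n), rep(1:K, length = n))

## Each observation belongs to exactly one fold
stopifnot(identical(sort(unlist(folds, use.names = FALSE)), 1:n))
print(lengths(folds))

## Pass the hand-made folds explicitly (only if quadrupen is installed)
if (requireNamespace("quadrupen", quietly = TRUE)) {
  cv.fixed <- quadrupen::crossval(x, y, penalty = "elastic.net",
                                  K = K, folds = folds, lambda2 = 0.2,
                                  verbose = FALSE, mc.cores = 1)
}
```

Keeping the folds in a variable lets two runs (for instance with different lambda2 values) be compared on identical data splits.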
An object of class "cvpen" for which a plot method is available.
If the user runs the fitting method with the option 'bulletproof' set to FALSE, the algorithm may stop at an early stage of the path. Early stops are handled internally so that results are reported on the same grid of the penalty tuned by lambda1. Missing entries are filled with NA values, so that the mean and standard error are evaluated consistently. If the procedure experiences too many early stops while cross-validating, a warning is sent to the user, in which case you should reconsider the grid of lambda1 used for the cross-validation. If bulletproof is TRUE (the default), there is nothing to worry about, except a possible slowdown whenever switching to the proximal algorithm is required.
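To illustrate how NA padding keeps the mean and standard error well defined on a common lambda1 grid, here is a small base R sketch. It is not quadrupen's internal code: the error matrix is made up, and the way the summary is computed (averaging over the folds that reached each grid point) is an assumption about what a consistent evaluation looks like.

```r
## Made-up per-fold errors: rows = lambda1 grid points, columns = folds.
## NA marks grid points a fold never reached because its path stopped early.
err <- rbind(c(4,  5,  6),
             c(3,  4,  5),
             c(2,  NA, 4),
             c(NA, NA, 3))

n.ok    <- rowSums(!is.na(err))                          # folds available per grid point
cv.mean <- rowMeans(err, na.rm = TRUE)                   # mean over available folds
cv.se   <- apply(err, 1, sd, na.rm = TRUE) / sqrt(n.ok)  # NA when a single fold remains

print(data.frame(n.ok, cv.mean, cv.se))

## Share of missing entries; a large share suggests revisiting the lambda1 grid
na.share <- mean(is.na(err))
```

Grid points reached by only one fold get a mean but no standard error, which is one reason a grid that triggers many early stops is worth reconsidering.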
quadrupen, plot,cvpen-method and cvpen.
library(Matrix)     ## provides bdiag()
library(quadrupen)

## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## block correlation between active variables
Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.1
diag(Sigma) <- 1
n <- 100
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)

## Use fewer lambda1 values by overwriting the default parameters
## and cross-validate over the sequences lambda1 and lambda2
cv.double <- crossval(x,y, lambda2=10^seq(2,-2,len=50), nlambda1=50)

## Rerun simple cross-validation with the appropriate lambda2
cv.10K <- crossval(x,y, lambda2=0.2)

## Try leave one out also
cv.loo <- crossval(x,y, K=n, lambda2=0.2)

plot(cv.double)
plot(cv.10K)
plot(cv.loo)

## Performance for selection purpose
## (each count is the number of sign mismatches between beta and the estimate)
beta.min.10K <- slot(cv.10K, "beta.min")
beta.min.loo <- slot(cv.loo, "beta.min")
cat("\nFalse positives with the minimal 10-CV choice: ", sum(sign(beta) != sign(beta.min.10K)))
cat("\nFalse positives with the minimal LOO-CV choice: ", sum(sign(beta) != sign(beta.min.loo)))