cv.ncvreg  R Documentation 
Performs k-fold cross-validation for MCP- or SCAD-penalized regression models over a grid of values for the regularization parameter lambda.
cv.ncvreg(
X,
y,
...,
cluster,
nfolds = 10,
seed,
fold,
returnY = FALSE,
trace = FALSE
)
cv.ncvsurv(
X,
y,
...,
cluster,
nfolds = 10,
seed,
fold,
se = c("quick", "bootstrap"),
returnY = FALSE,
trace = FALSE
)
X 
The design matrix, without an intercept, as in ncvreg() or ncvsurv(). 
y 
The response, as in ncvreg() or ncvsurv(). 
... 
Additional arguments to ncvreg() or ncvsurv(). 
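cluster 
cv.ncvreg() and cv.ncvsurv() can be run in parallel across a cluster using the parallel package. The cluster must be set up in advance with makeCluster() from that package and then passed to cv.ncvreg() or cv.ncvsurv() (see the example). 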
nfolds 
The number of cross-validation folds. Default is 10. 
seed 
You may set the seed of the random number generator in order to obtain reproducible results. 
fold 
Which fold each observation belongs to. By default the observations are randomly assigned. 
returnY 
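Should cv.ncvreg()/cv.ncvsurv() return the linear predictors from the cross-validation folds? Default is FALSE; if TRUE, a matrix is returned in which the element in row i, column j is the fitted value for observation i, from the fold in which observation i was excluded from the fit, at the jth value of lambda. 
trace 
If set to TRUE, inform the user of progress by announcing the beginning of each cross-validation fold. Default is FALSE. 
se 
For cv.ncvsurv(), the method by which the cross-validation standard error is calculated: "quick" (the default) uses a simple approximation, while "bootstrap" estimates the standard error by bootstrapping. 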
The function calls ncvreg()/ncvsurv() nfolds times, each time leaving out 1/nfolds of the data. The cross-validation error is based on the deviance; see the ncvreg package documentation for more details.
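The procedure can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: a closed-form ridge fit (the hypothetical helper `ridge_fit`) stands in for ncvreg(), the data are simulated, and squared error plays the role of the deviance, as it would for a Gaussian model.

```r
set.seed(1)
n <- 60
p <- 5
X <- matrix(rnorm(n * p), n, p)                 # simulated design matrix
y <- drop(X %*% c(2, -1, 0, 0, 0)) + rnorm(n)   # simulated response

# Hypothetical stand-in fitter (ridge, closed form); not part of ncvreg
ridge_fit <- function(X, y, lambda) {
  solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))
}

nfolds <- 10
lambda <- c(0.01, 0.1, 1, 10, 100)
fold <- sample(rep_len(seq_len(nfolds), n))     # random fold assignment

# E[i, j]: squared-error loss for observation i at lambda[j], predicted
# from the fit that left out the fold containing observation i
E <- matrix(NA_real_, n, length(lambda))
for (k in seq_len(nfolds)) {
  test <- fold == k
  for (j in seq_along(lambda)) {
    b <- ridge_fit(X[!test, , drop = FALSE], y[!test], lambda[j])
    E[test, j] <- (y[test] - drop(X[test, , drop = FALSE] %*% b))^2
  }
}

cve <- colMeans(E)                  # error averaged across the folds
cvse <- apply(E, 2, sd) / sqrt(n)   # rough standard error of that average
min_idx <- which.min(cve)           # index of the best lambda
lambda[min_idx]
```

Each observation is predicted exactly once, by the fit that did not see it, so every column of `E` holds n out-of-sample losses for one value of lambda.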
For family="binomial" models, the cross-validation fold assignments are balanced across the 0/1 outcomes, so that each fold has the same proportion of 0/1 outcomes (or as close to the same proportion as it is possible to achieve if cases do not divide evenly).
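One way to build such balanced assignments by hand is sketched below. The helper `balanced_folds` is hypothetical and is not the routine the package uses internally; it simply deals fold labels out within each outcome class and shuffles them.

```r
# Hypothetical helper (not part of ncvreg): assign folds so that each fold
# receives a near-equal share of every outcome class.
balanced_folds <- function(y, nfolds = 10, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  fold <- integer(length(y))
  offset <- 0
  for (cls in sort(unique(y))) {
    idx <- which(y == cls)
    # Deal labels 1..nfolds cyclically, continuing where the last class stopped
    labels <- (offset + seq_along(idx) - 1) %% nfolds + 1
    # Shuffle the labels within the class
    fold[idx] <- labels[sample.int(length(labels))]
    offset <- offset + length(idx)
  }
  fold
}

y <- rep(c(0, 1), times = c(70, 30))
fold <- balanced_folds(y, nfolds = 10, seed = 1)
table(fold, y)   # each fold holds 7 zeros and 3 ones
```

A vector built this way could be supplied through the fold argument when custom assignments are needed.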
For Cox models, cv.ncvsurv() uses the approach of calculating the full Cox partial likelihood using the cross-validated set of linear predictors. Other approaches to cross-validation for the Cox regression model have been proposed in the literature; the strengths and weaknesses of the various methods for penalized regression in the Cox model are the subject of current research. A simple approximation to the standard error is provided, although an option to bootstrap the standard error (se='bootstrap') is also available.
An object with S3 class cv.ncvreg or cv.ncvsurv containing:
cve 
The error for each value of lambda, averaged across the cross-validation folds. 
cvse 
The estimated standard error associated with each value of cve. 
fold 
The fold assignments for cross-validation for each observation; note that for cv.ncvsurv(), these are in terms of the ordered observations, not the original observations. 
lambda 
The sequence of regularization parameter values along which the cross-validation error was calculated. 
fit 
The fitted ncvreg() or ncvsurv() object for the whole data. 
min 
The index of lambda corresponding to lambda.min. 
lambda.min 
The value of lambda with the minimum cross-validation error. 
null.dev 
The deviance for the intercept-only model. If you have supplied your own lambda sequence, this quantity may not be meaningful. 
Bias 
The estimated bias of the minimum cross-validation error, as in Tibshirani and Tibshirani (2009), doi:10.1214/08-AOAS224. 
pe 
If family="binomial", the cross-validation prediction error for each value of lambda. 
Y 
If returnY=TRUE, the matrix of cross-validated fitted values (see above). 
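A short sketch of reading these components from a fitted object follows. It assumes the ncvreg package is installed and that the components are named cve, cvse, min, lambda.min, fit and Y, as in current releases of the package; the code is skipped when the package is unavailable.

```r
if (requireNamespace("ncvreg", quietly = TRUE)) {
  library(ncvreg)
  data(Prostate)
  cvfit <- cv.ncvreg(Prostate$X, Prostate$y, seed = 1, returnY = TRUE)

  # Selected lambda, with its cross-validation error and standard error
  print(c(lambda.min = cvfit$lambda.min,
          cve = cvfit$cve[cvfit$min],
          cvse = cvfit$cvse[cvfit$min]))

  # Coefficients (intercept first) from the full-data fit at lambda.min
  beta <- cvfit$fit$beta[, cvfit$min]
  print(beta)

  # Cross-validated fitted values: one row per observation
  print(dim(cvfit$Y))
}
```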
Patrick Breheny; Grant Brown helped with the parallelization support
Breheny P and Huang J. (2011) Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Annals of Applied Statistics, 5: 232-253. doi:10.1214/10-AOAS388
ncvreg(), plot.cv.ncvreg(), summary.cv.ncvreg()
data(Prostate)
cvfit <- cv.ncvreg(Prostate$X, Prostate$y)
plot(cvfit)
summary(cvfit)
fit <- cvfit$fit
plot(fit)
# Coefficients at the value of lambda with minimum cross-validation error
beta <- fit$beta[, cvfit$min]
## requires loading the parallel package
## Not run:
library(parallel)
X <- Prostate$X
y <- Prostate$y
cl <- makeCluster(4)
# nfolds = length(y) gives leave-one-out cross-validation
cvfit <- cv.ncvreg(X, y, cluster=cl, nfolds=length(y))
stopCluster(cl)  # release the worker processes
## End(Not run)
# Survival
data(Lung)
X <- Lung$X
y <- Lung$y
cvfit <- cv.ncvsurv(X, y)
summary(cvfit)
plot(cvfit)
plot(cvfit, type="rsq")
