cv.grpreg — R Documentation

Description:

Performs k-fold cross-validation for penalized regression models with grouped covariates over a grid of values for the regularization parameter lambda.
Usage:

cv.grpreg(
  X,
  y,
  group = 1:ncol(X),
  ...,
  nfolds = 10,
  seed,
  fold,
  returnY = FALSE,
  trace = FALSE
)

cv.grpsurv(
  X,
  y,
  group = 1:ncol(X),
  ...,
  nfolds = 10,
  seed,
  fold,
  se = c("quick", "bootstrap"),
  returnY = FALSE,
  trace = FALSE
)
Arguments:

X: The design matrix, as in grpreg() or grpsurv().

y: The response vector (or matrix), as in grpreg() or grpsurv().

group: The grouping vector, as in grpreg().

...: Additional arguments passed to grpreg() or grpsurv().

nfolds: The number of cross-validation folds. Default is 10.

seed: You may set the seed of the random number generator in order to obtain reproducible results.

fold: Which fold each observation belongs to. By default, observations are randomly assigned.

returnY: Should cv.grpreg()/cv.grpsurv() return the fitted values from the cross-validation folds? Default is FALSE; if TRUE, the function returns a matrix in which the element in row i, column j is the fitted value for observation i, at the jth value of lambda, from the fold in which observation i was excluded from the fit. NOTE: For cv.grpsurv(), the rows of this matrix are ordered by time on study and therefore do not correspond to the original order of observations.

trace: If set to TRUE, cv.grpreg() will inform the user of its progress by announcing the beginning of each CV fold. Default is FALSE.

se: For cv.grpsurv() only: the method for estimating the standard error of the cross-validation error. "quick" (the default) uses a fast approximation; "bootstrap" bootstraps the standard error.
Details:

The function calls grpreg() or grpsurv() nfolds times, each time leaving out 1/nfolds of the data. The cross-validation error is based on the deviance.

For Gaussian and Poisson responses, the folds are chosen by simple random sampling. For binomial responses, the number of observations in each outcome class is balanced across the folds; i.e., the number of observations with y equal to 1 is the same for each fold, or off by at most 1 if the numbers do not divide evenly. The same approach is used for Cox regression to balance the amount of censoring across each fold.
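As a minimal sketch of the fold mechanics described above, using the Birthwt data shipped with grpreg (the seed argument makes the fold assignment, and hence the CV error, reproducible across runs):

```r
library(grpreg)
data(Birthwt)
# 10-fold CV with a fixed seed for reproducible fold assignments
cvfit <- cv.grpreg(Birthwt$X, Birthwt$bwt, Birthwt$group,
                   nfolds = 10, seed = 42)
table(cvfit$fold)  # fold sizes are (approximately) equal
```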
For Cox models, cv.grpsurv() calculates the full Cox partial likelihood using the cross-validated set of linear predictors. Other approaches to cross-validation for the Cox regression model have been proposed in the literature; the strengths and weaknesses of the various methods for penalized regression in the Cox model are the subject of current research. A simple approximation to the standard error is provided by default, although an option to bootstrap the standard error (se = "bootstrap") is also available.
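A hedged sketch of the Cox case, assuming the Lung data set shipped with grpreg (a grouped version of the survival lung data):

```r
library(grpreg)
data(Lung)
# Quick standard-error approximation (the default)
cvfit <- cv.grpsurv(Lung$X, Lung$y, Lung$group, seed = 1)
# Bootstrapped standard errors instead of the quick approximation
cvfit_b <- cv.grpsurv(Lung$X, Lung$y, Lung$group, seed = 1,
                      se = "bootstrap")
plot(cvfit)
```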
As in grpreg(), seemingly unrelated regressions/multitask learning can be carried out by setting y to be a matrix, in which case groups are set up automatically (see grpreg() for details), and cross-validation is carried out with respect to rows of y. As mentioned in the details there, it is recommended to standardize the responses prior to fitting.
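A minimal simulated sketch of the multitask case: passing a matrix y makes each column a separate response, with groups set up automatically (the data here are illustrative only):

```r
library(grpreg)
set.seed(1)
X <- matrix(rnorm(100 * 9), 100, 9)   # 100 observations, 9 predictors
Y <- matrix(rnorm(100 * 2), 100, 2)   # two responses
Y <- scale(Y)                          # standardize responses, as recommended
cvfit <- cv.grpreg(X, Y)               # CV is over the rows of Y
plot(cvfit)
```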
Value:

An object with S3 class "cv.grpreg" containing:

cve: The cross-validation error for each value of lambda, averaged across the folds.

cvse: The estimated standard error associated with each value of cve.

lambda: The sequence of regularization parameter values along which the cross-validation error was calculated.

fit: The fitted grpreg() (or grpsurv()) object for the whole data set.

fold: The fold assignments for cross-validation for each observation; note that for cv.grpsurv(), these are in terms of the ordered observations, not the original ordering of the data.

min: The index of lambda corresponding to lambda.min.

lambda.min: The value of lambda with the minimum cross-validation error.

null.dev: The deviance for the intercept-only model.

pe: If family="binomial", the cross-validation prediction error for each value of lambda.
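A short sketch of working with the returned components listed above (again using the Birthwt data shipped with grpreg):

```r
library(grpreg)
data(Birthwt)
cvfit <- cv.grpreg(Birthwt$X, Birthwt$bwt, Birthwt$group, seed = 7)
cvfit$lambda.min       # lambda with the minimum CV error
cvfit$cve[cvfit$min]   # the minimum CV error itself
coef(cvfit)            # coefficients at lambda.min
```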
Author(s):

Patrick Breheny

See Also:

grpreg(), plot.cv.grpreg(), summary.cv.grpreg(), predict.cv.grpreg()
Examples:

data(Birthwt)
X <- Birthwt$X
y <- Birthwt$bwt
group <- Birthwt$group

cvfit <- cv.grpreg(X, y, group)
plot(cvfit)
summary(cvfit)
coef(cvfit)  ## Beta at minimum CVE

cvfit <- cv.grpreg(X, y, group, penalty="gel")
plot(cvfit)
summary(cvfit)