View source: R/dCVnet_utilities.R
amalgamate_cv.glmnet | R Documentation |
Gathers results from a list of cv.glmnet objects and returns a merged, averaged object.
amalgamate_cv.glmnet(
cvglmlist,
checks = list(alpha = TRUE, lambda = FALSE, type.measure = TRUE)
)
cvglmlist | a list of cv.glmnet models |
checks | which input checks to run |
The arithmetic mean of the k-fold cross-validated loss (i.e. type.measure) is taken over the models (with the sd averaged via the variance). The cv SE upper and lower limits (used in the lambda.1se calculation) are then calculated from the averaged data, and finally the cv-optimal lambda.1se and lambda.min values are calculated for the averaged performance.
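The averaging described above can be sketched in base R. This is a minimal illustration, not the dCVnet implementation: the helper name average_cv_curves is hypothetical, the inputs are plain lists carrying only lambda, cvm and cvsd, and it assumes a loss that is minimised (measures that are maximised, such as AUC, would need the sign flipped first).

```r
# Hypothetical helper (not part of dCVnet): average the CV curves from a list
# of cv.glmnet-like objects that share the same lambda sequence.
average_cv_curves <- function(cvlist) {
  lambda <- cvlist[[1]]$lambda
  # one column per model, one row per lambda value
  cvm_mat  <- do.call(cbind, lapply(cvlist, function(m) m$cvm))
  cvsd_mat <- do.call(cbind, lapply(cvlist, function(m) m$cvsd))
  cvm  <- rowMeans(cvm_mat)            # arithmetic mean of the loss
  cvsd <- sqrt(rowMeans(cvsd_mat^2))   # sd averaged via the variance
  cvup <- cvm + cvsd
  cvlo <- cvm - cvsd
  i_min <- which.min(cvm)              # assumes smaller loss is better
  list(lambda = lambda, cvm = cvm, cvsd = cvsd, cvup = cvup, cvlo = cvlo,
       lambda.min = lambda[i_min],
       # largest lambda whose mean loss is within one SE of the minimum
       lambda.1se = max(lambda[cvm <= cvup[i_min]]))
}

# Toy inputs standing in for two repetitions of cross-validation:
lambda <- c(1, 0.5, 0.25, 0.125)
rep1 <- list(lambda = lambda, cvm = c(3.0, 1.9, 1.5, 1.8),
             cvsd = c(0.3, 0.4, 0.3, 0.4))
rep2 <- list(lambda = lambda, cvm = c(3.2, 1.9, 1.7, 1.6),
             cvsd = c(0.4, 0.3, 0.4, 0.3))
res <- average_cv_curves(list(rep1, rep2))
res$lambda.min  # 0.25
res$lambda.1se  # 0.5
```

Note that lambda.min and lambda.1se are recomputed from the averaged curve rather than averaged across repetitions.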
Consistent with cv.glmnet, the model coefficients within folds are not made available (averaged or otherwise), but a whole-data model is returned in the glmnet.fit slot.
The cvglmlist must contain cv.glmnet models suitable for averaging together. This typically means all models having the same:
family
x and y data
alpha value
lambda sequence
type.measure
number of k-fold CV folds
other cv.glmnet options
in order for the amalgamated results to "make sense". Essentially the models in the list should only differ on the random allocation of folds to cases (usually specified in foldid).
Some limited checks are implemented to ensure alpha, lambda and type.measure are identical. There is an option to turn these checks off, but this is not recommended.
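A check of this kind might look like the following base R sketch. It is an assumption about the general approach, not the checks dCVnet actually runs: check_same_field is a hypothetical helper, and the toy objects are plain lists that mimic only the lambda and name elements of a cv.glmnet object.

```r
# Hypothetical helper (not part of dCVnet): stop unless every object in the
# list holds the same value for the named element.
check_same_field <- function(cvlist, field) {
  ref <- cvlist[[1]][[field]]
  same <- vapply(cvlist,
                 function(m) isTRUE(all.equal(m[[field]], ref)),
                 logical(1))
  if (!all(same)) {
    stop("models differ in '", field, "' and should not be averaged")
  }
  invisible(TRUE)
}

# Toy stand-ins for cv.glmnet objects:
a   <- list(lambda = c(1, 0.5), name = "Mean-Squared Error")
b   <- list(lambda = c(1, 0.5), name = "Mean-Squared Error")
bad <- list(lambda = c(1, 0.4), name = "Mean-Squared Error")

ok <- check_same_field(list(a, b), "lambda")  # passes silently
# A mismatched lambda sequence raises an error:
failed <- inherits(try(check_same_field(list(a, bad), "lambda"),
                       silent = TRUE), "try-error")
```

Comparing with all.equal rather than identical tolerates floating-point noise in the lambda sequence; whether that tolerance is desirable depends on how the sequences were produced.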
This function presently does not honour the "keep" argument of cv.glmnet and all additional arrays/vectors are silently dropped.
an object of class "cv.glmnet" is returned, which is a list with the ingredients of the cross-validation fit. If the object was created with relax=TRUE then this class has a prefix class of "cv.relaxed".
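lambda | the values of lambda used in the fits. |
cvm | The mean cross-validated error - a vector of length length(lambda). |
cvsd | estimate of standard error of cvm. |
cvup | upper curve = cvm+cvsd. |
cvlo | lower curve = cvm-cvsd. |
nzero | number of non-zero coefficients at each lambda. |
name | a text string indicating type of measure (for plotting purposes). |
glmnet.fit | a fitted glmnet object for the full data. |
lambda.min | value of lambda that gives minimum cvm. |
lambda.1se | largest value of lambda such that error is within 1 standard error of the minimum. |
fit.preval | if keep=TRUE, this is the array of prevalidated fits. |
foldid | if keep=TRUE, the fold assignments used. |
index | a one column matrix with the indices of lambda.min and lambda.1se in the sequence of coefficients, fits etc. |
relaxed | if relax=TRUE, this additional item has the CV info for each of the mixed fits. |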
cv.glmnet
## Not run:
data("CoxExample", package = "glmnet") # x and y
# folds for unstratified 10x-repeated 5-fold cv:
foldlist <- replicate(10,
                      sample(1:5, size = NROW(CoxExample$x), replace = TRUE),
                      simplify = FALSE)
names(foldlist) <- paste0("Rep", 1:10) # label the replications.
# fix a common lambda sequence so that every repetition is comparable:
lambdaseq <- glmnet::cv.glmnet(x = CoxExample$x,
                               y = CoxExample$y, family = "cox")$lambda
# create a list of models:
modellist <- lapply(foldlist, function(ff) {
  glmnet::cv.glmnet(x = CoxExample$x, y = CoxExample$y,
                    family = "cox", foldid = ff,
                    lambda = lambdaseq)
})
# use amalgamate to average results:
mod <- amalgamate_cv.glmnet(modellist)
# compare rep-rep performance variability with the average performance:
# rep1:
plot(modellist[[1]], main = "rep1")
# rep2:
plot(modellist[[2]], main = "rep2")
# etc.
# mean:
plot(mod, main = "averaged")
## End(Not run)