amalgamate_cv.glmnet: amalgamate_cv.glmnet

View source: R/dCVnet_utilities.R

Description

Gathers results from a list of cv.glmnet objects and returns a merged, averaged object.

Usage

amalgamate_cv.glmnet(
  cvglmlist,
  checks = list(alpha = TRUE, lambda = FALSE, type.measure = TRUE)
)

Arguments

cvglmlist

a list of cv.glmnet models

checks

which input checks to run

Details

The arithmetic mean k-fold cross-validated loss (i.e. the type.measure) is taken over the models (the sd is averaged via the variance). The cv SE upper and lower limits (used in the lambda.1se calculation) are then calculated from the averaged data, and finally the cv-optimal lambda.1se and lambda.min values are calculated for the averaged performance.
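The averaging described above can be sketched as follows. This is an illustrative reconstruction of the stated approach, not the package's internal code; the lambda sequence and the cvm/cvsd values are made-up inputs:

```r
# Sketch: average repeated cv.glmnet results over a shared lambda sequence.
# (Assumed approach based on the Details text, not dCVnet's actual code.)
lambda <- c(1.0, 0.5, 0.1)
cvm_list  <- list(c(1.1, 0.7, 0.6),   # cv loss, repetition 1
                  c(1.2, 0.7, 0.7))   # cv loss, repetition 2
cvsd_list <- list(c(0.1, 0.1, 0.1),   # cv SE, repetition 1
                  c(0.2, 0.1, 0.1))   # cv SE, repetition 2

cvm  <- rowMeans(do.call(cbind, cvm_list))            # arithmetic mean loss
cvsd <- sqrt(rowMeans(do.call(cbind, cvsd_list)^2))   # sd averaged via variance
cvup <- cvm + cvsd   # upper SE limit on the averaged data
cvlo <- cvm - cvsd   # lower SE limit on the averaged data

lambda.min <- lambda[which.min(cvm)]
# 1se rule: largest lambda with loss within one SE of the minimum:
lambda.1se <- max(lambda[cvm <= cvup[which.min(cvm)]])
```

With these toy inputs the minimum averaged loss is at lambda = 0.1, while the more regularised lambda = 0.5 is still within one (averaged) SE, so lambda.1se selects it.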

Consistent with cv.glmnet, the model coefficients within folds are not made available, averaged or otherwise investigable, but a whole data model is returned in the glmnet.fit slot.

The cvglmlist must contain cv.glmnet models suitable for averaging together. This typically means all models having the same:

  • family

  • x and y data

  • alpha value

  • lambda sequence

  • type.measure

  • number of k-fold CV folds

  • other cv.glmnet options

in order for the amalgamated results to "make sense". Essentially the models in the list should only differ on the random allocation of folds to cases (usually specified in foldid).

Some limited checks are implemented to ensure alpha, lambda and type.measure are identical across models (note that, per the Usage defaults, the lambda check is disabled). These checks can be turned off via the checks argument, but this is not recommended.
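A check of this kind can be sketched generically: confirm that every model in the list agrees on a given component. This is an illustrative helper, not the package's internal implementation; `all_identical` is a hypothetical name:

```r
# Sketch: verify that all cv.glmnet models in a list share a component
# (e.g. lambda sequence or loss name). Not dCVnet's actual check code.
all_identical <- function(lst, getter) {
  vals <- lapply(lst, getter)
  all(vapply(vals[-1], identical, logical(1), y = vals[[1]]))
}

# Example with stand-in list elements in place of real cv.glmnet objects:
same_lambda <- list(list(lambda = 1:3), list(lambda = 1:3))
diff_lambda <- list(list(lambda = 1:3), list(lambda = 1:4))
all_identical(same_lambda, function(m) m$lambda)  # TRUE
all_identical(diff_lambda, function(m) m$lambda)  # FALSE
```

For real models the getters would extract, e.g., `m$lambda` or `m$name` (the type.measure label) from each cv.glmnet object.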

This function presently does not honour the "keep" argument of cv.glmnet and all additional arrays/vectors are silently dropped.

Value

an object of class "cv.glmnet" is returned, which is a list with the ingredients of the cross-validation fit. If the object was created with relax=TRUE then this class has a prefix class of "cv.relaxed".

lambda

the values of lambda used in the fits.

cvm

The mean cross-validated error - a vector of length length(lambda).

cvsd

estimate of standard error of cvm.

cvup

upper curve = cvm+cvsd.

cvlo

lower curve = cvm-cvsd.

nzero

number of non-zero coefficients at each lambda.

name

a text string indicating type of measure (for plotting purposes).

glmnet.fit

a fitted glmnet object for the full data.

lambda.min

value of lambda that gives minimum cvm.

lambda.1se

largest value of lambda such that error is within 1 standard error of the minimum.

fit.preval

if keep=TRUE, this is the array of prevalidated fits. Some entries can be NA, if that and subsequent values of lambda are not reached for that fold

foldid

if keep=TRUE, the fold assignments used

index

a one column matrix with the indices of lambda.min and lambda.1se in the sequence of coefficients, fits etc.

relaxed

if relax=TRUE, this additional item has the CV info for each of the mixed fits. In particular it also selects lambda, gamma pairs corresponding to the 1se rule, as well as the minimum error. It also has a component index, a two-column matrix which contains the lambda and gamma indices corresponding to the "min" and "1se" solutions.

See Also

cv.glmnet

Examples

## Not run: 
data("CoxExample", package = "glmnet") # x and y
# folds for unstratified 10x-repeated 5-fold cv:
foldlist <- replicate(
    10,
    sample(1:5, size = NROW(CoxExample$x), replace = TRUE),
    simplify = FALSE
)
names(foldlist) <- paste0("Rep", 1:10) # label the replications.
lambdaseq <- glmnet::cv.glmnet(x = CoxExample$x,
                               y = CoxExample$y,
                               family = "cox")$lambda
# create a list of models:
modellist <- lapply(foldlist, function(ff) {
    glmnet::cv.glmnet(x = CoxExample$x, y = CoxExample$y,
                      family = "cox", foldid = ff,
                      lambda = lambdaseq)
})

# use amalgamate to average results:
mod <- amalgamate_cv.glmnet(modellist)

# compare rep-rep performance variability with the average performance:
# rep1:
plot(modellist[[1]], main = "rep1")
# rep2:
plot(modellist[[2]], main = "rep2")
# etc.
# mean:
plot(mod, main = "averaged")

## End(Not run)

AndrewLawrence/dCVnet documentation built on Sept. 24, 2024, 5:24 a.m.