View source: R/cv.glmnetr_250418.R
cv.glmnetr (R Documentation)
Derive an elastic net model (including a relaxed lasso) and identify the hyperparameters alpha, gamma, and lambda that give the best fit based upon cross validation. It is analogous to (and uses) the cv.glmnet() function of the 'glmnet' package, but also tunes on alpha.
cv.glmnetr(
trainxs,
trainy__,
family,
alpha = 1,
gamma = c(0, 0.25, 0.5, 0.75, 1),
lambda = NULL,
foldid = NULL,
folds_n = NULL,
fine = 0,
path = 0,
track = 0,
...
)
trainxs
predictor matrix
trainy__
outcome vector
family
model family, "cox", "binomial" or "gaussian" (default)
alpha
a vector of alpha values to consider when tuning, for example c(0, 0.2, 0.4, 0.6, 0.8, 1). Default is c(1), which fits the lasso model involving only the L1 penalty. c(0) fits the ridge model involving only the L2 penalty.
gamma
the gamma vector. Default is c(0, 0.25, 0.50, 0.75, 1).
lambda
the lambda vector. May be NULL.
foldid
a vector of integers assigning each record to a fold. The integers should be between 1 and folds_n. (A short construction sketch follows this argument list.)
folds_n
number of folds for cross validation. Default and generally recommended is 10.
fine
use a finer step in determining lambda. Of little value unless one repeats the cross validation many times to more finely tune the hyperparameters. See the 'glmnet' package documentation.
path
the path option passed to cv.glmnet(). 0 (FALSE) reduces computation time when the numerics are stable; 1 avoids cases where path = 0 can become very slow. Default is 0.
track
whether or not to update progress in the console. The default of 0 suppresses these updates; 1 provides them. When fitting clinical data with non-full-rank design matrices, we have found some R packages to take a very long time or seemingly get caught in infinite loops. Tracking progress lets the user judge whether things are moving forward or whether the process should be stopped.
...
additional arguments passed to cv.glmnet()
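The following is a minimal, hypothetical sketch (not taken from the package examples) of constructing a foldid vector and tuning over an alpha grid. The simulated data come from glmnetr.simdata() as in the Examples section below; the other names and the chosen values are illustrative only.
# construct a foldid vector and tune over an alpha grid
set.seed(82545037)
sim.data = glmnetr.simdata(nrows=200, ncols=100, beta=NULL)
xs = sim.data$xs
y_ = sim.data$y_
folds_n = 5
# randomly assign each record to one of folds_n folds
foldid = sample(rep(1:folds_n, length.out=nrow(xs)))
# tune over alpha in addition to gamma and lambda
fit = cv.glmnetr(xs, y_, family="gaussian", alpha=c(0, 0.5, 1),
                 foldid=foldid, folds_n=folds_n)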
This is the main program for model derivation. As currently implemented, the package requires the data to be input as vectors and matrices with no missing values (NA), and all data vectors and matrices must be numerical. For factors (categorical variables) one should first construct corresponding numerical variables to represent the factor levels. To take advantage of the lasso model, one can use one-hot coding, assigning an indicator for each level of each categorical variable, or also create other contrast variables suggested by the subject matter, as sketched below.
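As an illustrative sketch (the variables 'age' and 'site' are hypothetical, not part of the package), a factor can be one-hot coded with model.matrix() before assembling the numeric predictor matrix:
# hypothetical data frame with one numeric and one categorical predictor
df = data.frame(age = rnorm(200),
                site = factor(sample(c("A", "B", "C"), 200, replace = TRUE)))
# model.matrix() with '~ 0 + site' creates one 0/1 indicator column per level
site_ind = model.matrix(~ 0 + site, data = df)
# all-numeric predictor matrix suitable as the trainxs argument
xs = cbind(age = df$age, site_ind)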
A cross validation informed relaxed lasso model fit.
Walter Kremers (kremers.walter@mayo.edu)
summary.cv.glmnetr, predict.cv.glmnetr, nested.glmnetr
# set seed for random numbers, optionally, to get reproducible results
set.seed(82545037)
sim.data=glmnetr.simdata(nrows=200, ncols=100, beta=NULL)
xs=sim.data$xs
y_=sim.data$y_
event=sim.data$event
# for this example we use a small number for folds_n to shorten run time
cv.glmnetr.fit = nested.glmnetr(xs, y_=y_, family="gaussian", folds_n=4, resample=0)
plot(cv.glmnetr.fit)
plot(cv.glmnetr.fit, coefs=1)
summary(cv.glmnetr.fit)
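# As a hedged follow-up (not from the package examples), predictions for new
# data would typically come from the fitted object's predict method (see
# predict.cv.glmnetr under See Also). The call pattern below is an assumption;
# consult the predict help page for the exact arguments.
xs_new = glmnetr.simdata(nrows=20, ncols=100, beta=NULL)$xs
preds = predict(cv.glmnetr.fit, xs_new)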