s_GLMNET: GLM with Elastic Net Regularization [C, R, S]

View source: R/s_GLMNET.R

s_GLMNET R Documentation

Description

Train an elastic net model

Usage

s_GLMNET(
  x,
  y = NULL,
  x.test = NULL,
  y.test = NULL,
  x.name = NULL,
  y.name = NULL,
  grid.resample.params = setup.resample("kfold", 5),
  gridsearch.type = c("exhaustive", "randomized"),
  gridsearch.randomized.p = 0.1,
  intercept = TRUE,
  nway.interactions = 0,
  family = NULL,
  alpha = seq(0, 1, 0.2),
  lambda = NULL,
  nlambda = 100,
  which.cv.lambda = c("lambda.1se", "lambda.min"),
  penalty.factor = NULL,
  weights = NULL,
  ifw = TRUE,
  ifw.type = 2,
  upsample = FALSE,
  downsample = FALSE,
  resample.seed = NULL,
  res.summary.fn = mean,
  metric = NULL,
  maximize = NULL,
  .gs = FALSE,
  n.cores = rtCores,
  print.plot = FALSE,
  plot.fitted = NULL,
  plot.predicted = NULL,
  plot.theme = rtTheme,
  question = NULL,
  verbose = TRUE,
  outdir = NULL,
  save.mod = ifelse(!is.null(outdir), TRUE, FALSE),
  ...
)
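
As a rough illustration of the signature above, a minimal call on simulated data might look like the following sketch. The data and the train/test split are hypothetical, only arguments listed in the usage block are passed, and the structure of the returned object is not described on this page, so it is not inspected here.

```r
# Simulated regression data (hypothetical; not from the rtemis documentation)
set.seed(2024)
n <- 200
x <- as.data.frame(matrix(rnorm(n * 10), nrow = n))
y <- 2 * x[[1]] - x[[2]] + rnorm(n)

# Simple train/test split
train.index <- sample(n, size = 150)
x.train <- x[train.index, ]
y.train <- y[train.index]
x.test <- x[-train.index, ]
y.test <- y[-train.index]

# Only attempt the call if rtemis is installed
if (requireNamespace("rtemis", quietly = TRUE)) {
  library(rtemis)
  mod <- s_GLMNET(
    x.train, y.train,
    x.test = x.test, y.test = y.test,
    alpha = 1,                       # single alpha value: lasso penalty
    which.cv.lambda = "lambda.1se",  # lambda used for prediction
    n.cores = 1,
    verbose = FALSE
  )
}
```

Columns of x.test match those of x.train because both are subsets of the same data frame, as the x.test argument requires.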

Arguments

x

Numeric vector or matrix / data frame of features, i.e. independent variables

y

Numeric vector of outcome, i.e. dependent variable

x.test

Numeric vector or matrix / data frame of testing set features. Columns must correspond to columns in x

y.test

Numeric vector of testing set outcome

x.name

Character: Name for feature set

y.name

Character: Name for outcome

grid.resample.params

List: Output of setup.resample defining grid search parameters.

gridsearch.type

Character: Type of grid search to perform: "exhaustive" or "randomized".

gridsearch.randomized.p

Float (0, 1): If gridsearch.type = "randomized", randomly test this proportion of combinations.

intercept

Logical: If TRUE, include intercept in the model.

nway.interactions

Integer: Number of n-way interactions to include in the model.

family

Error distribution and link function. See stats::family

alpha

[gS] Float [0, 1]: The elastic net mixing parameter: alpha = 0 is the ridge penalty, alpha = 1 is the lasso penalty

lambda

[gS] Float vector: Best left to NULL, cv.glmnet will compute its own lambda sequence

nlambda

Integer: Number of lambda values to compute

which.cv.lambda

Character: Which lambda to use for prediction: "lambda.1se" or "lambda.min"

penalty.factor

Float vector: Multiply the penalty for each coefficient by the values in this vector. This is most useful for specifying different penalties for different groups of variables

weights

Numeric vector: Weights for cases. For classification, weights takes precedence over ifw: if weights are provided, ifw is not used. Leave weights = NULL when setting ifw = TRUE.

ifw

Logical: If TRUE, apply inverse frequency weighting (for Classification only). Note: If weights are provided, ifw is not used.

ifw.type

Integer {0, 1, 2}: Type of inverse frequency weighting. 1: class.weights as in 0, divided by min(class.weights). 2: class.weights as in 0, divided by max(class.weights).

upsample

Logical: If TRUE, upsample cases to balance outcome classes (for Classification only). Note: upsample will randomly sample with replacement if the length of the majority class is more than double the length of the class you are upsampling, thereby introducing randomness.

downsample

Logical: If TRUE, downsample majority class to match size of minority class

resample.seed

Integer: If provided, will be used to set the seed during upsampling. Default = NULL (random seed)

res.summary.fn

Function: Used to average resample runs.

metric

Character: Metric to minimize during grid search, or to maximize if maximize = TRUE. Default = NULL, which results in "Balanced Accuracy" for Classification, "MSE" for Regression, and "Coherence" for Survival Analysis.

maximize

Logical: If TRUE, metric will be maximized if grid search is run.

.gs

(Internal use only)

n.cores

Integer: Number of cores to use.

print.plot

Logical: If TRUE, produce plot using mplot3. Takes precedence over plot.fitted and plot.predicted.

plot.fitted

Logical: if TRUE, plot True (y) vs Fitted

plot.predicted

Logical: if TRUE, plot True (y.test) vs Predicted. Requires x.test and y.test

plot.theme

Character: "zero", "dark", "box", "darkbox"

question

Character: the question you are attempting to answer with this model, in plain language.

verbose

Logical: If TRUE, print summary to screen.

outdir

Path to output directory. If defined, will save Predicted vs. True plot, if available, as well as full model output, if save.mod is TRUE

save.mod

Logical: If TRUE, save all output to an RDS file in outdir. save.mod is TRUE by default if an outdir is defined. If set to TRUE and no outdir is defined, outdir defaults to paste0("./s.", mod.name)

...

Additional arguments
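
To make the role of alpha and lambda concrete, the sketch below evaluates the elastic net penalty in the form used by glmnet, lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta))), for a fixed coefficient vector. The helper elastic_net_penalty and the example coefficients are hypothetical and are not part of rtemis or glmnet.

```r
# Elastic net penalty in glmnet's parameterization:
#   lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
# `elastic_net_penalty` is a hypothetical helper for illustration only.
elastic_net_penalty <- function(beta, lambda, alpha) {
  stopifnot(alpha >= 0, alpha <= 1, lambda >= 0)
  ridge_term <- (1 - alpha) / 2 * sum(beta^2)  # active when alpha < 1
  lasso_term <- alpha * sum(abs(beta))         # active when alpha > 0
  lambda * (ridge_term + lasso_term)
}

beta <- c(0.5, -1.2, 0, 2)  # example coefficients, intercept excluded

# Penalty across the default alpha grid, seq(0, 1, 0.2), at a fixed lambda
alphas <- seq(0, 1, 0.2)
penalties <- sapply(alphas, function(a) {
  elastic_net_penalty(beta, lambda = 0.1, alpha = a)
})
```

At alpha = 0 only the squared (ridge) term contributes, and at alpha = 1 only the absolute-value (lasso) term does; intermediate values blend the two.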

Details

s_GLMNET runs glmnet::cv.glmnet for each value of alpha, for each resample in grid.resample.params. Mean values for min.lambda and MSE (Regression) or Accuracy (Classification) are aggregated for each alpha and resample combination.
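
The procedure described above can be approximated with glmnet directly. This is a simplified sketch and not the rtemis implementation: the simulated data, the three outer folds standing in for grid.resample.params, and the choice to record lambda.1se and held-out MSE are all assumptions made for illustration.

```r
# Simulated regression data (hypothetical)
set.seed(1)
n <- 150
x <- matrix(rnorm(n * 8), nrow = n)
y <- x[, 1] - 0.5 * x[, 2] + rnorm(n)

alphas <- c(0, 0.5, 1)
n.resamples <- 3
# Assign each case to one outer fold (stand-in for grid.resample.params)
fold.id <- sample(rep(seq_len(n.resamples), length.out = n))

if (requireNamespace("glmnet", quietly = TRUE)) {
  # One row per alpha and resample combination
  grid <- expand.grid(alpha = alphas, resample = seq_len(n.resamples))
  grid$lambda <- NA_real_
  grid$mse <- NA_real_
  for (i in seq_len(nrow(grid))) {
    held.out <- fold.id == grid$resample[i]
    # Inner cross-validation over lambda for this alpha
    cvfit <- glmnet::cv.glmnet(x[!held.out, ], y[!held.out],
                               alpha = grid$alpha[i])
    pred <- predict(cvfit, newx = x[held.out, ], s = "lambda.1se")
    grid$lambda[i] <- cvfit$lambda.1se
    grid$mse[i] <- mean((y[held.out] - pred)^2)
  }
  # Average over resamples for each alpha, then pick the lowest mean MSE
  summary.by.alpha <- aggregate(cbind(lambda, mse) ~ alpha,
                                data = grid, FUN = mean)
  best <- summary.by.alpha[which.min(summary.by.alpha$mse), ]
}
```

Here the summary function is mean, mirroring the default res.summary.fn; a different function could be passed to aggregate in the same place.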

[gS] indicates tunable hyperparameters: if more than a single value is provided, grid search will be performed automatically.
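
The difference between the two gridsearch.type options can be illustrated in base R. This sketch only builds the candidate grids; the lambda values, the proportion p, and rounding the number of sampled combinations up are assumptions, and the way rtemis constructs its grid internally may differ.

```r
# Hypothetical candidate values for the two [gS] hyperparameters
alpha <- seq(0, 1, 0.2)
lambda <- c(0.001, 0.01, 0.1, 1)

# Exhaustive search: every combination is evaluated
grid.exhaustive <- expand.grid(alpha = alpha, lambda = lambda)

# Randomized search: evaluate only a proportion p of the combinations
# (p plays the role of gridsearch.randomized.p; rounding up is an assumption)
set.seed(42)
p <- 0.25
n.keep <- ceiling(p * nrow(grid.exhaustive))
grid.randomized <- grid.exhaustive[sample(nrow(grid.exhaustive), n.keep), ]
```

With 6 alpha values and 4 lambda values the exhaustive grid has 24 rows, so p = 0.25 keeps 6 of them.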

Author(s)

E.D. Gennatas

See Also

train_cv for external cross-validation

Other Supervised Learning: s_AdaBoost(), s_AddTree(), s_BART(), s_BRUTO(), s_BayesGLM(), s_C50(), s_CART(), s_CTree(), s_EVTree(), s_GAM(), s_GAM.default(), s_GAM.formula(), s_GBM(), s_GLM(), s_GLMTree(), s_GLS(), s_H2ODL(), s_H2OGBM(), s_H2ORF(), s_HAL(), s_KNN(), s_LDA(), s_LM(), s_LMTree(), s_LightCART(), s_LightGBM(), s_MARS(), s_MLRF(), s_NBayes(), s_NLA(), s_NLS(), s_NW(), s_PPR(), s_PolyMARS(), s_QDA(), s_QRNN(), s_RF(), s_RFSRC(), s_Ranger(), s_SDA(), s_SGD(), s_SPLS(), s_SVM(), s_TFN(), s_XGBoost(), s_XRF()

Other Interpretable models: s_AddTree(), s_C50(), s_CART(), s_GLM(), s_GLMTree(), s_LMTree()


egenn/rtemis documentation built on May 4, 2024, 7:40 p.m.