lasso: LASSO for regression and classification.


View source: R/lasso.R

Description

lasso implements standard LASSO regression and classification.

Usage

lasso(
  train_df,
  formula,
  family,
  test_df = NULL,
  predict_df = NULL,
  free_vars = NULL,
  nfold = NULL,
  lambda = "min",
  sparsity_threshold = NULL,
  verbose = FALSE,
  ...
)

Arguments

train_df

A data frame containing the response (y) and the covariates (X).

formula

A formula for the regression specification.

family

A string that specifies either 'gaussian' or 'binomial'.

test_df

A data frame containing the same columns as train_df. The test set.

predict_df

A data frame with the same columns as train_df, used to generate predictions from the trained and tested model. This argument is optional.

free_vars

A string or character vector specifying which covariate(s) to never penalize. This argument is optional.
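As a rough illustration of what leaving a covariate unpenalized means, the sketch below calls gamlr directly and passes column indices through its free argument. How lasso translates free_vars into those indices is not documented here, so the name-to-index lookup is an assumption (a factor covariate, for instance, expands to several design-matrix columns).

```r
library(gamlr)

# Design matrix without the intercept column (gamlr fits its own intercept).
x <- model.matrix(Sepal.Length ~ ., data = iris)[, -1]
y <- iris$Sepal.Length

# Assumed mapping: look up the design-matrix columns by name.
free_idx <- which(colnames(x) %in% c("Sepal.Width"))

# Columns listed in `free` are excluded from the L1 penalty.
fit <- gamlr(x, y, family = "gaussian", free = free_idx)

# Coefficients at the lambda chosen by gamlr's default selection rule.
coef(fit)
```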

nfold

The number of cross-validation folds. Only specify if cross-validation is desired. This argument is optional.

lambda

A string specifying which lambda to use for prediction. Typically either "min" or "1se". Default value is "min".
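The two choices can be seen by calling gamlr's cross-validation routine directly. This is a sketch under the assumption that lasso passes the lambda string on to the select argument of cv.gamlr's coef and predict methods when nfold is supplied; the wrapper's internals are not shown on this page.

```r
library(gamlr)

x <- model.matrix(Sepal.Length ~ ., data = iris)[, -1]
y <- iris$Sepal.Length

set.seed(1)  # fold assignment is random
cv_fit <- cv.gamlr(x, y, nfold = 5, family = "gaussian")

# "min": the lambda with the lowest mean out-of-sample error.
# "1se": the largest lambda whose error is within one standard
#        error of that minimum (a sparser, more conservative model).
coef_min <- coef(cv_fit, select = "min")
coef_1se <- coef(cv_fit, select = "1se")

# Predictions at the chosen lambda.
pred_min <- predict(cv_fit, newdata = x, select = "min")
```

Because the penalty at "1se" is at least as large as at "min", the "1se" fit typically keeps fewer variables.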

sparsity_threshold

A numeric value in [0, 1]. Any variable whose proportion of sparse entries exceeds this value is dropped. This argument is optional.
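A minimal base-R sketch of such a filter follows. It assumes that "sparsity" means the share of zero entries in a numeric column, which this page does not confirm; drop_sparse_columns is a hypothetical helper, not a function from the package.

```r
# Hypothetical helper: drop columns whose share of zeros exceeds the threshold.
# Assumption: sparsity = proportion of zero entries; non-numeric columns are kept.
drop_sparse_columns <- function(df, sparsity_threshold) {
  stopifnot(sparsity_threshold >= 0, sparsity_threshold <= 1)
  sparsity <- vapply(
    df,
    function(col) if (is.numeric(col)) mean(col == 0, na.rm = TRUE) else 0,
    numeric(1)
  )
  df[, sparsity <= sparsity_threshold, drop = FALSE]
}

toy <- data.frame(
  y = c(1.2, 0.7, 3.1, 2.4),
  dense = c(1, 2, 3, 4),
  mostly_zero = c(0, 0, 0, 5)  # 75% zeros
)

kept <- drop_sparse_columns(toy, sparsity_threshold = 0.5)
names(kept)
```

With a threshold of 0.5, the mostly_zero column (75% zeros) is dropped while y and dense are kept.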

verbose

Logical indicating whether to print progress messages. Default is FALSE.

...

Generic argument to which you can pass any other valid gamlr argument, such as standardize = FALSE.

Details

The lasso function fits LASSO models using the gamlr package, for variable selection and prediction. It handles linear (Gaussian) regression and binomial logistic regression.

Value

A list containing the LASSO model, predicted_values, residuals, and selected variables.

Examples

## Not run: 
idx <- train_test_validate(iris$Sepal.Length, train.p = .6, test.p = .2)

lasso_model <- lasso(train_df = iris[idx$train, ],
                     formula = Sepal.Length ~ .,
                     family = "gaussian",
                     test_df = iris[idx$test, ],
                     predict_df = iris[idx$validate, ],
                     nfold = 5,
                     verbose = TRUE)

## End(Not run)

dmolitor/umbrella documentation built on Nov. 10, 2020, 1:25 a.m.