easy_glmnet: Easily build and evaluate a penalized regression model.


View source: R/glmnet.R

Description

This function wraps the easyml core framework, allowing a user to easily run the easyml methodology for a glmnet model.

Usage

easy_glmnet(.data, dependent_variable, family = "gaussian", resample = NULL,
  preprocess = preprocess_scale, measure = NULL, exclude_variables = NULL,
  categorical_variables = NULL, train_size = 0.667, foldid = NULL,
  survival_rate_cutoff = 0.05, n_samples = 1000, n_divisions = 1000,
  n_iterations = 10, random_state = NULL, progress_bar = TRUE,
  n_core = 1, coefficients = TRUE, variable_importances = FALSE,
  predictions = TRUE, model_performance = TRUE, model_args = list())

Arguments

.data

A data.frame; the data to be analyzed.

dependent_variable

A character vector of length one; the dependent variable for this analysis.

family

A character vector of length one; the type of regression to run on the data. Choices are one of c("gaussian", "binomial"). Defaults to "gaussian".

resample

A function; the function for resampling the data. Defaults to NULL.

preprocess

A function; the function for preprocessing the data. Defaults to preprocess_scale.

measure

A function; the function for measuring the results. Defaults to NULL.

exclude_variables

A character vector; the variables from the data set to exclude. Defaults to NULL.

categorical_variables

A character vector; the variables that are categorical. Defaults to NULL.

train_size

A numeric vector of length one; specifies what proportion of the data should be used for the training data set. Defaults to 0.667.
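
As a rough illustration of what a train_size of 0.667 means, the sketch below splits a data.frame in base R. The helper split_train_test is hypothetical and is not part of easyml; how easyml rounds the training count or draws rows internally is an assumption here.

```r
# Hypothetical helper (not an easyml function): split rows into
# training and test sets according to a proportion.
split_train_test <- function(data, train_size = 0.667, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- nrow(data)
  n_train <- floor(train_size * n)          # assumed rounding rule
  train_idx <- sample.int(n, size = n_train)
  list(train = data[train_idx, , drop = FALSE],
       test  = data[-train_idx, , drop = FALSE])
}

# mtcars ships with base R and has 32 rows.
parts <- split_train_test(mtcars, train_size = 0.667, seed = 12345)
nrow(parts$train)
nrow(parts$test)
```

With 32 rows, floor(0.667 * 32) leaves 21 rows for training and 11 for testing.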

foldid

A vector with length equal to length(y), the number of observations in the dependent variable, which identifies cases belonging to the same fold. Defaults to NULL.
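
One common way to build such a vector in base R is sketched below. The values n_obs and n_folds are illustrative assumptions, and whether easy_glmnet expects the vector to cover all rows of .data or only the training rows is not stated here.

```r
# Illustrative sizes (assumptions, not taken from easyml).
n_obs   <- 97
n_folds <- 10

set.seed(12345)
# Repeat fold labels 1..n_folds to cover every observation, then shuffle
# so that fold membership is random but fold sizes stay balanced.
foldid <- sample(rep(seq_len(n_folds), length.out = n_obs))

table(foldid)
```

Cases that share a value in foldid are assigned to the same cross-validation fold.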

survival_rate_cutoff

A numeric vector of length one; for easy_glmnet, specifies the minimum threshold, as a proportion of the n_samples model builds (0.05 corresponds to 5%), in which a coefficient must appear. Defaults to 0.05.
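
The idea of a survival rate can be sketched with a small made-up coefficient matrix, one row per model build. This is only an illustration: the matrix values are invented, and treating "appears" as "is nonzero" with a >= comparison is an assumption about how easyml applies the cutoff.

```r
# Invented coefficients from four hypothetical model builds (rows).
coefs <- matrix(
  c(0.8, 0.0, 0.0,
    0.7, 0.0, 0.1,
    0.9, 0.0, 0.0,
    0.6, 0.0, 0.0),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("age", "weight", "height"))
)

survival_rate_cutoff <- 0.05

# Proportion of builds in which each coefficient is nonzero.
survival_rate <- colMeans(coefs != 0)

# Keep coefficients whose survival rate meets the cutoff (assumed >=).
kept <- names(survival_rate)[survival_rate >= survival_rate_cutoff]
kept
```

Here age survives in every build, height in one of four, and weight in none, so only age and height are kept.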

n_samples

An integer vector of length one; specifies the number of times the coefficients and predictions should be generated. Defaults to 1000.

n_divisions

An integer vector of length one; specifies the number of times the data should be divided when replicating the measures of model performance. Defaults to 1000.

n_iterations

An integer vector of length one; during each division, specifies the number of times the predictions should be generated. Defaults to 10.
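
The relationship between n_divisions and n_iterations can be pictured as a nested loop. The sketch below is a simplified stand-in, not easyml's implementation: it uses lm on the built-in mtcars data instead of glmnet, averages predictions across iterations, and scores each division with a correlation, all of which are assumptions made for illustration.

```r
set.seed(12345)
n_divisions  <- 5   # number of train/test splits
n_iterations <- 2   # model fits per split
train_size   <- 0.667

scores <- numeric(n_divisions)
for (d in seq_len(n_divisions)) {
  # A new random split for every division.
  train_idx <- sample.int(nrow(mtcars), floor(train_size * nrow(mtcars)))
  train <- mtcars[train_idx, ]
  test  <- mtcars[-train_idx, ]

  preds <- matrix(NA_real_, nrow = nrow(test), ncol = n_iterations)
  for (i in seq_len(n_iterations)) {
    # Stand-in model; a cross-validated glmnet fit would vary per iteration.
    fit <- lm(mpg ~ wt + hp, data = train)
    preds[, i] <- predict(fit, newdata = test)
  }

  # One performance measure per division, from the averaged predictions.
  scores[d] <- cor(rowMeans(preds), test$mpg)
}
scores
```

The result is one score per division, which is what allows a distribution of model performance to be summarized.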

random_state

An integer vector of length one; specifies the seed to be used for the analysis. Defaults to NULL.
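
Assuming random_state is used the way a seed is normally used in R, that is, passed to set.seed(), the effect is that random draws become reproducible, as in this base R sketch:

```r
# Same seed, same random draws.
set.seed(12345)
first_draw <- sample.int(100, 5)

set.seed(12345)
second_draw <- sample.int(100, 5)

identical(first_draw, second_draw)
```

Leaving the seed unset (NULL) would let each run produce different splits and therefore slightly different results.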

progress_bar

A logical vector of length one; specifies whether to display a progress bar during calculations. Defaults to TRUE.

n_core

An integer vector of length one; specifies the number of cores to use for this analysis. Currently only works on macOS and Unix/Linux systems. Defaults to 1.

coefficients

A logical vector of length one; whether or not to generate coefficients for this analysis. Defaults to TRUE.

variable_importances

A logical vector of length one; whether or not to generate variable importances for this analysis. Defaults to FALSE.

predictions

A logical vector of length one; whether or not to generate predictions for this analysis. Defaults to TRUE.

model_performance

A logical vector of length one; whether or not to generate measures of model performance for this analysis. Defaults to TRUE.

model_args

A list; the arguments to be passed to the algorithm specified (for example, list(alpha = 1.0) for glmnet). Defaults to list().
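
A list of extra arguments is typically forwarded to the underlying fitting function with do.call. The sketch below shows that pattern using stats::lm.fit as a stand-in, so it runs without glmnet installed; the wrapper fit_with_args is hypothetical, and whether easyml forwards model_args in exactly this way is an assumption.

```r
# Hypothetical wrapper (not an easyml function): forward a list of
# extra arguments to a fitting function via do.call.
fit_with_args <- function(x, y, model_args = list()) {
  do.call(stats::lm.fit, c(list(x = x, y = y), model_args))
}

x <- cbind(intercept = 1, wt = mtcars$wt)
y <- mtcars$mpg

# tol is a real argument of lm.fit; it plays the role that alpha
# plays for glmnet in the examples on this page.
fit <- fit_with_args(x, y, model_args = list(tol = 1e-9))
fit$coefficients
```

An empty list leaves the fitting function at its own defaults.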

Value

A list of class easy_glmnet.

See Also

Other recipes: easy_analysis, easy_avNNet, easy_deep_neural_network, easy_glinternet, easy_neural_network, easy_random_forest, easy_support_vector_machine

Examples

## Not run: 
library(easyml) # https://github.com/CCS-Lab/easyml

# Gaussian
data("prostate", package = "easyml")
results <- easy_glmnet(prostate, "lpsa", 
                       n_samples = 10, n_divisions = 10, 
                       n_iterations = 2, random_state = 12345, 
                       n_core = 1, model_args = list(alpha = 1.0))

# Binomial
data("cocaine_dependence", package = "easyml")
results <- easy_glmnet(cocaine_dependence, "diagnosis", 
                       family = "binomial", 
                       exclude_variables = c("subject"), 
                       categorical_variables = c("male"), 
                       preprocess = preprocess_scale, 
                       n_samples = 10, n_divisions = 10, 
                       n_iterations = 2, random_state = 12345, 
                       n_core = 1, model_args = list(alpha = 1.0))

## End(Not run)

Example output

Loaded easyml 0.1.0. Also loading ggplot2.
Loading required namespace: ggplot2
[1] "Generating coefficients from multiple model builds:"
[1] "Generating predictions for a single train test split:"
[1] "Generating measures of model performance over multiple train test splits:"
[1] "Generating coefficients from multiple model builds:"
[1] "Generating predictions for a single train test split:"
Setting levels: control = 0, case = 1
Setting direction: controls < cases
Setting levels: control = 0, case = 1
Setting direction: controls < cases
[1] "Generating measures of model performance over multiple train test splits:"
Setting levels: control = 0, case = 1
Setting direction: controls < cases
[the two messages above repeat 19 more times]
Warning messages:
1: from glmnet Fortran code (error code -99); Convergence for 99th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned 
2: from glmnet Fortran code (error code -92); Convergence for 92th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned 
3: from glmnet Fortran code (error code -90); Convergence for 90th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned 
4: from glmnet Fortran code (error code -92); Convergence for 92th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned 
5: from glmnet Fortran code (error code -92); Convergence for 92th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned 
6: from glmnet Fortran code (error code -90); Convergence for 90th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned 
7: from glmnet Fortran code (error code -92); Convergence for 92th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned 
8: from glmnet Fortran code (error code -92); Convergence for 92th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned 
9: from glmnet Fortran code (error code -95); Convergence for 95th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned 
10: from glmnet Fortran code (error code -92); Convergence for 92th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned 

easyml documentation built on June 26, 2017, 9:02 a.m.