ml_list: A wrapper on ml_tune function to train hundreds of machine...


Description

Auto-tunes machine learning models across different sampling methods, metrics, preprocessing methods, numbers of cores, and so on, and returns the resulting models in a list.

Usage

ml_list(data, target, params, summaryFunction = twoClassSummary,
  save_model = NULL)

Arguments

data

The training data, as a data frame.

target

A character, column name of the target variable.

params

A data frame in which each column holds the values of one model-training setting.

summaryFunction

A function name. Use twoClassSummary for binary classification and multiClassSummary for multi-class classification.

save_model

A character string or NULL. If NULL, no models are saved to disk. If a character string (the folder name) is given, all models are saved to that folder.
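As a rough illustration of the params argument, the sketch below builds a small grid with base R's expand.grid. The name params_small and the reduced set of values are hypothetical, and the reading that each row describes one model configuration is an assumption, not something the documentation states.

```r
# Hypothetical, reduced grid: one column per training setting.
# Assumption: each row describes one model configuration to train.
params_small <- expand.grid(
  sampling = c("up", "down"),
  metric = c("ROC", "Accuracy"),
  method = c("glmnet", "glm"),
  k = 10,
  stringsAsFactors = FALSE
)

# Number of setting combinations in the grid (2 * 2 * 2 * 1).
nrow(params_small)

# Inspect the column names and types before passing the grid on.
str(params_small)
```

Checking the row count before training gives an early idea of how many models a grid would request.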

Details

params is a data frame that stores the values of every ml_tune parameter that is not an argument of ml_list. If you set save_model="folder_name" instead of NULL, every model is saved to that folder (the folder is created if it does not exist). If the program executes successfully, a final rds file named "folder_name.rds", containing every model that is returned, is saved as well. The goal of the save_model option is to keep the trained models on disk in case anything unexpected happens.
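The save-as-you-go idea behind save_model can be sketched with base R's saveRDS and readRDS. This is only an illustration of the pattern, not the package's implementation: the folder name, the per-model file names, and the placeholder objects standing in for fitted models are all hypothetical.

```r
# Sketch of the checkpoint pattern only; ml_list's real file names may differ.
folder <- file.path(tempdir(), "demo_models")  # hypothetical folder name
if (!dir.exists(folder)) dir.create(folder)

# Placeholder objects standing in for fitted models.
fits <- list(model_1 = list(method = "glm"),
             model_2 = list(method = "glmnet"))

# Save each object as soon as it is available, so a crash loses little work.
for (nm in names(fits)) {
  saveRDS(fits[[nm]], file.path(folder, paste0(nm, ".rds")))
}

# Save the complete list once everything has finished.
all_file <- paste0(folder, ".rds")
saveRDS(fits, all_file)

# Later, recover the full list or a single saved object.
restored <- readRDS(all_file)
one_fit <- readRDS(file.path(folder, "model_1.rds"))
```

If a long run stops partway, the individual files in the folder can still be read back one at a time with readRDS.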

Value

A list containing all the models. Each element has the same structure as the object returned by the train function in the caret package.

See Also

To find out why an algorithm does not work, or to fine-tune a specific model, try the ml_tune function or caret's train function.

Examples

library(caret)  # provides multiClassSummary

# Build every combination of the training settings.
params_grid=expand.grid(sampling=c("up","down","rose","smote","ADAS")
                        ,metric=c("ROC","Accuracy","Kappa","Sens","Spec")
                        ,preProcess=list(c("zv","nzv","center","scale"),c("center","scale"))
                        ,method=c("glmnet","glm","bayesglm")
                        ,search="random"
                        ,tuneLength=10
                        ,k=10,nthread=3)

# Train the models on iris and save them to the "iris_models" folder.
iris_list= ml_list(data=iris,target = "Species"
                   ,params = params_grid,summaryFunction=multiClassSummary
                   ,save_model="iris_models")

edwardcooper/automl documentation built on June 3, 2019, 1:05 a.m.