run_models: Fit binary classification model

View source: R/modeling.R

run_models R Documentation

Fit binary classification model

Description

Fit several representative binary classification models to the training data.

Usage

run_models(
  .data,
  target,
  positive,
  models = c("logistic", "rpart", "ctree", "randomForest", "ranger", "xgboost", "lasso")
)

Arguments

.data

A train_df. Training data used to fit the models. tbl_df, tbl, and data.frame objects are also supported.

target

character. Name of target variable.

positive

character. Level of the positive class in the binary classification.

models

character. Algorithm types of the models to fit. See Details. The default value is c("logistic", "rpart", "ctree", "randomForest", "ranger", "xgboost", "lasso").

Details

Each supported model is fitted with a function from a widely used R modeling package. The following binary classification models are supported:

  • "logistic" : logistic regression by glm() in stats package.

  • "rpart" : recursive partitioning tree model by rpart() in rpart package.

  • "ctree" : conditional inference tree model by ctree() in party package.

  • "randomForest" : random forest model by randomForest() in randomForest package.

  • "ranger" : random forest model by ranger() in ranger package.

  • "xgboost" : XGBoost (extreme gradient boosting) model by xgboost() in xgboost package.

  • "lasso" : lasso model by glmnet() in glmnet package.

run_models() fits the models in parallel. However, parallel processing is not supported on the MS Windows operating system or in the RStudio environment.

Value

model_df. Results of the fitted models. model_df is a tbl_df that contains the following variables:

  • step : character. The current stage in the model fitting process. For the result of run_models(), the value is "1.Fitted".

  • model_id : character. Type of the fitted model.

  • target : character. Name of target variable.

  • is_factor : logical. Indicates whether the target variable is a factor.

  • positive : character. Level of the positive class in the binary classification.

  • negative : character. Level of the negative class in the binary classification.

  • fitted_model : list. Fitted model object.

Examples

library(alookr)
library(dplyr)

# Divide the train data set and the test data set.
sb <- rpart::kyphosis %>%
  split_by(Kyphosis)

# Extract the train data set from original data set.
train <- sb %>%
  extract_set(set = "train")

# Extract the test data set from original data set.
test <- sb %>%
  extract_set(set = "test")

# Resample the unbalanced data set using SMOTE (synthetic minority over-sampling technique).
train <- sb %>%
  sampling_target(seed = 1234L, method = "ubSMOTE")

# Clean the data set.
train <- train %>%
  cleanse()

# Run the model fitting.
result <- run_models(.data = train, target = "Kyphosis", positive = "present")
result
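
# A minimal sketch of inspecting the returned model_df, assuming the call
# above succeeded. The column names are the ones listed under Value; the
# object name first_fit is only illustrative.
result$step
result$model_id

# fitted_model is a list column, so a single fitted model object can be
# pulled out with [[ ]].
first_fit <- result$fitted_model[[1]]
class(first_fit)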

# Fit several kinds of models using the dplyr pipe.
train %>%
  run_models(target = "Kyphosis", positive = "present")
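
# A minimal sketch of fitting only a subset of the supported algorithms
# through the models argument. The choice of "logistic" and "rpart" and the
# object name result_subset are illustrative assumptions.
result_subset <- train %>%
  run_models(
    target = "Kyphosis",
    positive = "present",
    models = c("logistic", "rpart")
  )

# One row per requested algorithm is expected in the result.
result_subset$model_id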


alookr documentation built on May 29, 2024, 10:38 a.m.