llm.cv: Runs v-fold cross validation with LLM


Description

In v-fold cross validation, the data are divided into v subsets of approximately equal size. In each round, one of the v subsets is held out while the remainder of the data is used to create a logitleafmodel object, and predictions are generated for the held-out subset. The process is repeated v times, so that every observation is predicted exactly once by a model that was not trained on it.
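
The procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the internal code of llm.cv: the data are simulated, glm() stands in for the logit leaf model, and the names fold_id, held_out and pred are hypothetical.

```r
# Minimal base-R sketch of v-fold cross validation.
# Assumption: glm() is a stand-in learner; llm.cv fits a logit leaf model instead.
set.seed(42)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y <- rbinom(n, size = 1, prob = plogis(X$x1 - 0.5 * X$x2))
dat <- cbind(X, y = Y)

v <- 5
# Randomly assign each observation to one of v folds of approximately equal size
fold_id <- sample(rep(seq_len(v), length.out = n))

pred <- numeric(n)
for (k in seq_len(v)) {
  held_out <- fold_id == k
  # Fit on the remaining v - 1 folds
  fit <- glm(y ~ x1 + x2, data = dat[!held_out, ], family = binomial())
  # Predict class membership probabilities for the excluded fold only
  pred[held_out] <- predict(fit, newdata = dat[held_out, ], type = "response")
}
```

After the loop, pred holds one out-of-fold probability per observation, which is the same idea as the per-observation predictions that llm.cv returns.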

Usage

llm.cv(X, Y, cv, threshold_pruning = 0.25, nbr_obs_leaf = 100)

Arguments

X

Dataframe containing numerical independent variables.

Y

Numerical vector of dependent variable. Currently only binary classification is supported.

cv

An integer specifying the number of folds in the cross-validation.

threshold_pruning

Confidence threshold used for pruning the decision tree. Default 0.25.

nbr_obs_leaf

The minimum number of observations in a leaf node. Default 100.

Value

An object of class llm.cv, which is a list with the following components:

foldpred

a data frame with, per fold, predicted class membership probabilities for the left-out observations.

pred

a data frame with predicted class membership probabilities.

foldclass

a data frame with, per fold, predicted classes for the left-out observations.

class

a data frame with the predicted classes.

conf

the confusion matrix which compares the real versus the predicted class memberships based on the class object.
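
As a rough illustration of what a confusion matrix of real versus predicted classes contains, the sketch below builds one with base R's table(). The vectors actual and predicted are made-up example values, not output of llm.cv, and the exact layout of the conf component may differ.

```r
# Hypothetical observed and predicted classes (illustrative values only)
actual    <- factor(c("neg", "pos", "pos", "neg", "pos", "neg"),
                    levels = c("neg", "pos"))
predicted <- factor(c("neg", "pos", "neg", "neg", "pos", "pos"),
                    levels = c("neg", "pos"))

# Cross-tabulate real versus predicted class memberships
conf <- table(Actual = actual, Predicted = predicted)

# Share of observations on the diagonal, i.e. correctly classified
accuracy <- sum(diag(conf)) / sum(conf)
```

The diagonal cells count correct predictions and the off-diagonal cells count misclassifications, so summary measures such as accuracy can be derived directly from the matrix.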

Author(s)

Arno De Caigny, a.de-caigny@ieseg.fr, Kristof Coussement, k.coussement@ieseg.fr and Koen W. De Bock, kdebock@audencia.com

References

Arno De Caigny, Kristof Coussement, Koen W. De Bock, A New Hybrid Classification Algorithm for Customer Churn Prediction Based on Logistic Regression and Decision Trees, European Journal of Operational Research (2018), doi: 10.1016/j.ejor.2018.02.009.

See Also

predict.llm, table.llm.html, llm

Examples

## Load the PimaIndiansDiabetes dataset from the mlbench package.
## The example only runs when mlbench is installed.
library("LLM")
if (requireNamespace("mlbench", quietly = TRUE)) {
  data("PimaIndiansDiabetes", package = "mlbench")
  ## Run 5-fold cross validation with the LLM (column 9 is the outcome)
  Pima.llm <- llm.cv(X = PimaIndiansDiabetes[, -9],
                     Y = PimaIndiansDiabetes$diabetes,
                     cv = 5, threshold_pruning = 0.25, nbr_obs_leaf = 100)
}

LLM documentation built on July 1, 2020, 7:19 p.m.
