fftree: Fitting fast-and-frugal trees

fftree    R Documentation

Fitting fast-and-frugal trees

Description

fftree is used to fit fast-and-frugal trees.

Usage

fftree(
  data,
  formula = stats::as.formula(data),
  method = "greedy",
  max_depth = 6,
  split_function = "gini",
  weights = c(1, 1),
  pruning = FALSE,
  cv = FALSE,
  use_features_once = TRUE,
  cross_entropy_parameters = cross_entropy_control()
)

## S4 method for signature 'data.frame'
fftree(
  data,
  formula = stats::as.formula(data.frame(data)),
  method = "greedy",
  max_depth = 6,
  split_function = "gini",
  weights = c(1, 1),
  pruning = FALSE,
  cv = FALSE,
  use_features_once = TRUE,
  cross_entropy_parameters = cross_entropy_control()
)

## S4 method for signature 'matrix'
fftree(
  data,
  formula = stats::as.formula(data.frame(data)),
  method = "greedy",
  max_depth = 6,
  split_function = "gini",
  weights = c(1, 1),
  pruning = FALSE,
  cv = FALSE,
  use_features_once = TRUE,
  cross_entropy_parameters = cross_entropy_control()
)

Arguments

data

An object of class data.frame or matrix. The response variable can be either a factor with two levels or an integer vector with values 0 and 1.

formula

An optional formula. If formula is not provided, the first column of data is used as the response variable and all other columns are used as predictors.
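The default described above can be checked in base R, without fftree itself. The following is a minimal sketch on a made-up data frame (the column names outcome, age and bmi are hypothetical); it only shows what stats::as.formula returns for a data frame.

```r
# Toy data frame; the column names are hypothetical and chosen for illustration.
df <- data.frame(
  outcome = c(0, 1, 1, 0),
  age     = c(34, 51, 47, 29),
  bmi     = c(22.1, 30.4, 27.8, 24.0)
)

# as.formula() on a data frame puts the first column on the left-hand side
# and all remaining columns on the right-hand side.
default_formula <- stats::as.formula(df)

# Variables in the formula, response first.
formula_vars <- all.vars(default_formula)
print(default_formula)
```

If the response is not the first column, pass formula explicitly rather than relying on the default.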

method

Type of induction method for the fast-and-frugal tree:

  • greedy (default and recommended)

  • basic

  • cross-entropy

max_depth

Maximum number of nodes of the fast-and-frugal tree (default: 6).

split_function

Function used to determine the splitting values on numeric features. This only applies to fast-and-frugal trees trained with the 'basic' or 'greedy' method. By default, the Gini impurity ('gini') is used. Other options are Shannon entropy ('entropy') and 'median'.
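To illustrate what a Gini-based splitting value is, here is a minimal base R sketch that picks the threshold on one numeric feature with the lowest weighted Gini impurity. It is an illustration under simple assumptions (a 0/1 response, midpoints between sorted unique values as candidates), not the package's internal implementation; the helper names gini and best_gini_split are hypothetical.

```r
# Gini impurity of a 0/1 response vector: 2 * p * (1 - p).
gini <- function(y) {
  if (length(y) == 0) return(0)
  p <- mean(y == 1)
  2 * p * (1 - p)
}

# Return the candidate threshold with the lowest size-weighted impurity.
# Candidates are midpoints between consecutive sorted unique values of x.
best_gini_split <- function(x, y) {
  values <- sort(unique(x))
  if (length(values) < 2) return(NA_real_)
  candidates <- (values[-length(values)] + values[-1]) / 2
  impurity <- vapply(candidates, function(threshold) {
    left  <- y[x <= threshold]
    right <- y[x > threshold]
    (length(left) * gini(left) + length(right) * gini(right)) / length(y)
  }, numeric(1))
  candidates[which.min(impurity)]
}

# Toy data: low values of x belong to class 0, high values to class 1.
x <- c(1, 2, 3, 10, 11, 12)
y <- c(0, 0, 0, 1, 1, 1)
threshold <- best_gini_split(x, y)
print(threshold)
```

On this toy data the two classes are perfectly separated, so the chosen threshold falls in the gap between 3 and 10.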

weights

A numeric vector of length 2 (default: c(1, 1)) with weights assigned to instances in the two classes. The vector entries should be named by the class labels. If they are not, the first entry refers to the negative class and the second entry to the positive class. See the examples.
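The effect of class-balancing weights can be verified on a toy response in base R. This sketch assumes hypothetical class labels "no" and "yes" and does not call fftree; it only shows that weighting each class by the prior of the other class gives both classes the same total weight.

```r
# Toy response with an imbalanced class distribution (labels are hypothetical).
y <- factor(c("no", "no", "no", "yes"), levels = c("no", "yes"))

# Share of the positive class.
prior <- mean(y == "yes")

# Named weights: each class is weighted by the prior of the other class.
w <- c("no" = prior, "yes" = 1 - prior)

# Total weight per class = number of instances * weight per instance.
counts <- table(y)
total_weight <- as.numeric(counts) * w[names(counts)]
print(total_weight)
```

Naming the entries by class label avoids depending on which class is treated as negative and which as positive.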

pruning

If TRUE, the tree is pruned using cross-validation. This can increase the training time substantially and is not recommended with the computationally costly 'cross-entropy' method. By default, pruning is not used.

cv

If TRUE, 10-fold cross-validation is used to estimate the predictive performance of the model. By default, cross-validation is not used.

use_features_once

If TRUE, each feature is used at most once in a tree. If FALSE, a feature may be split on several times. Note that, by construction, the basic method can only use each feature once. The default value is TRUE.

cross_entropy_parameters

Hyperparameters for the cross-entropy method. By default the output of the function cross_entropy_control is passed.

Value

An fftreeModel object.

Examples

library(ffcr)
data(liver)
model <- fftree(data = liver, formula = diagnosis~., method = "greedy")
plot(model)
model

# Weight each class by the prior of the other class, which is proportional
# to weighting it by the inverse of its own prior. Both classes then
# contribute equally when training the model.
prior <- mean(liver$diagnosis == "Liver disease")
weights <- c("No liver disease" = prior, "Liver disease" = 1 - prior)
mod <- fftree(data = liver, formula = diagnosis ~ ., weights = weights, method = "greedy")


marcusbuckmann/ffcr documentation built on Jan. 4, 2024, 3:45 p.m.