splitFeatures: Splitting features

splitFeatures    R Documentation

Splitting features

Description

splitFeatures is used internally by fftree and tally to find split points on numeric and categorical features.

Usage

splitFeatures(data, formula = stats::as.formula(data), ...)

## S4 method for signature 'data.frame'
splitFeatures(
  data,
  formula = as.formula(data.frame(data)),
  splits = "gini",
  weights = c(1, 1),
  ...
)

## S4 method for signature 'matrix'
splitFeatures(
  data,
  formula = stats::as.formula(data.frame(data)),
  splits = "gini",
  weights = c(1, 1),
  ...
)

Arguments

data

An object of class data.frame or matrix. The criterion must be either a factor with two levels or an integer coded as 0 and 1. The positive class is the second factor level (levels(data$criterion)[2]), or 1 if the criterion is numeric.
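The coding rule above can be checked before calling splitFeatures. The following is a minimal base R sketch; the vector `diagnosis` and its labels are made up for illustration and are not taken from the package data.

```r
# Illustrative labels (hypothetical, not from the ffcr data sets).
diagnosis <- c("healthy", "sick", "sick", "healthy")

# Set the level order explicitly so that the intended positive
# class ("sick") is the second factor level.
criterion <- factor(diagnosis, levels = c("healthy", "sick"))

# The second level is the one treated as the positive class.
positive_class <- levels(criterion)[2]

# Equivalent 0/1 coding, with 1 marking the positive class.
criterion01 <- as.integer(criterion == positive_class)
```

Without an explicit `levels` argument, factor() orders the levels alphabetically, so the positive class may not be the one intended.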

formula

A formula (optional). If no formula is provided, the first column of data is used as the response variable and all other columns are used as predictors.
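The default follows from how base R converts a data frame to a formula: stats::as.formula() applied to a data frame takes the first column as the response and the remaining columns as predictors. A small sketch with a made-up data frame (the column names are assumptions chosen for illustration):

```r
# Hypothetical data: the criterion is placed in the first column.
df <- data.frame(
  diagnosis = c(0, 1, 1),
  age       = c(40, 55, 62),
  sex       = c(1, 0, 1)
)

# Same expression as the default in the Usage section.
f <- stats::as.formula(df)

# Variables in formula order: response first, then the predictors.
vars <- all.vars(f)
response <- vars[1]
predictors <- vars[-1]
```

If the criterion is not in the first column, pass the formula explicitly, as in the Examples section.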

...

Optional parameters passed on to the low-level function.

splits

Specifies the method used to find a split point on numeric and binary features. One of:

  • gini (default)

  • entropy

  • median
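To make the three options concrete, here is a minimal base R sketch of how a split point on one numeric feature could be chosen under each method. It is an illustration under stated assumptions, not the ffcr implementation: the function names `gini`, `entropy` and `find_split_sketch` are hypothetical, the criterion is assumed to be coded 0/1, and candidate thresholds are assumed to be midpoints between consecutive distinct feature values.

```r
# Gini impurity of a 0/1 vector.
gini <- function(y) {
  if (length(y) == 0) return(0)
  p <- mean(y)
  2 * p * (1 - p)
}

# Shannon entropy (in bits) of a 0/1 vector.
entropy <- function(y) {
  if (length(y) == 0) return(0)
  p <- mean(y)
  if (p == 0 || p == 1) return(0)
  -(p * log2(p) + (1 - p) * log2(1 - p))
}

# Hypothetical helper: pick a threshold on numeric feature x for outcome y.
find_split_sketch <- function(x, y, method = c("gini", "entropy", "median")) {
  method <- match.arg(method)
  # "median" ignores the outcome and splits at the feature median.
  if (method == "median") return(stats::median(x))
  impurity <- if (method == "gini") gini else entropy
  vals <- sort(unique(x))
  if (length(vals) < 2) return(NA_real_)
  # Candidate thresholds: midpoints between consecutive distinct values.
  candidates <- (vals[-length(vals)] + vals[-1]) / 2
  # Size-weighted impurity of the two child nodes for each candidate.
  score <- vapply(candidates, function(t) {
    left <- y[x <= t]
    right <- y[x > t]
    (length(left) * impurity(left) + length(right) * impurity(right)) / length(y)
  }, numeric(1))
  candidates[which.min(score)]
}

x <- c(1, 2, 3, 4, 10, 11)
y <- c(0, 0, 0, 0, 1, 1)
split_gini    <- find_split_sketch(x, y, "gini")     # 7: separates the classes
split_entropy <- find_split_sketch(x, y, "entropy")  # 7
split_median  <- find_split_sketch(x, y, "median")   # 3.5: ignores y
```

In this toy example the impurity-based methods place the threshold where the classes separate, while the median split depends only on the distribution of the feature.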

weights

A numeric vector of length 2 (default: c(1, 1)). The first entry specifies the weight of instances in the positive class, the second entry the weight of instances in the negative class.
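One plausible reading of these weights is that each instance contributes its class weight, rather than a count of 1, when class proportions are computed. The sketch below shows that reading for Gini impurity; the function `weighted_gini_sketch` is hypothetical, and how ffcr applies the weights internally is not stated on this page.

```r
# Hypothetical helper: Gini impurity of a 0/1 vector with class weights.
# weights[1] applies to the positive class (1), weights[2] to the negative class (0).
weighted_gini_sketch <- function(y, weights = c(1, 1)) {
  w_pos <- weights[1] * sum(y == 1)
  w_neg <- weights[2] * sum(y == 0)
  total <- w_pos + w_neg
  if (total == 0) return(0)
  p <- w_pos / total  # weighted share of the positive class
  2 * p * (1 - p)
}

y <- c(1, 0, 0, 0)
g_default  <- weighted_gini_sketch(y)           # p = 1/4, impurity 0.375
g_weighted <- weighted_gini_sketch(y, c(3, 1))  # p = 3/6, impurity 0.5
```

Under this reading, raising the first weight makes a rare positive class count for more when candidate splits are compared.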

Value

A splits object.

Examples

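library(ffcr)
# Load the liver data set that ships with the package.
data(liver)
# Split every feature at its median; diagnosis is the criterion.
splits <- splitFeatures(data = liver, formula = diagnosis ~ ., splits = "median")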

marcusbuckmann/ffcr documentation built on Jan. 4, 2024, 3:45 p.m.