bscontrol: Define Parameters for the 'bsnsing' Fit

View source: R/bsnsing.R

bscontrol    R Documentation

Define Parameters for the bsnsing Fit

Description

Define Parameters for the bsnsing Fit

Usage

bscontrol(
  bin.size = 5,
  nseg.numeric = 20,
  nseg.factor = 20,
  num2factor = 10,
  node.size = 0,
  stop.prob = 0.9999,
  opt.solver = c("enum_c", "enum", "greedy", "hybrid", "gurobi", "lpSolve", "cplex"),
  solver.timelimit = 180,
  max.rules = 2,
  opt.model = c("gini", "error"),
  greedy.level = 0.9,
  import.external = T,
  suppress.internal = F,
  no.same.gender.children = F,
  n0n1.cap = 40000,
  verbose = F
)

Arguments

bin.size

the minimum number of observations required in a binarization bucket.

nseg.numeric

the maximum number of segments the range of a numeric variable is divided into for each inequality direction.

nseg.factor

the maximum number of unique levels allowed in a factor variable.

num2factor

an equality binarization rule will be created for each unique value of a numeric variable (in addition to the inequality binarization attempt), if the number of unique values of the numeric variable is less than num2factor.

node.size

if fewer than node.size training cases fall into a tree node, the node becomes a leaf and no further split is attempted on it. In addition, a node is not split if either child node that would result from the split contains fewer than node.size observations. The default is 0, which means node.size is set automatically to floor(sqrt(number of training cases)).

stop.prob

if the proportion of the majority class in a tree node is greater than stop.prob, the node will become a leaf and no further split will be attempted on it.

opt.solver

a character string in the set 'enum_c', 'enum', 'greedy', 'hybrid', 'gurobi', 'lpSolve', 'cplex' indicating the optimization solver to be used in the program. The choice of 'cplex' requires the package cplexAPI, 'gurobi' requires the package gurobi, 'lpSolve' requires the package lpSolve and 'enum_c' requires the .dll or .dylib file. The choice of 'hybrid' switches between 'enum' and 'gurobi' according to n0n1.cap. The default is 'enum_c', the first entry in the Usage signature. The 'greedy' heuristic is fast and does not rely on other packages. The 'enum' algorithm is an implicit enumeration method that is guaranteed to find the optimal solution and is typically faster than an optimization solver; it is a tradeoff between the greedy heuristic and the mathematical optimization methods.

solver.timelimit

the solver time limit in seconds. Currently only applicable to 'gurobi', 'enum' and 'enum_c' solvers.

max.rules

the maximum number of features allowed to enter an OR-clause split rule. A small max.rules reduces the search space and regulates model complexity. Default is 2.

opt.model

a character string in the set 'gini', 'error' indicating the optimization model to solve in the program. The default is 'gini'. The choice of 'error' is faster because the optimization model is smaller.

greedy.level

a proportion value between 0 and 1, applicable only when opt.solver is 'greedy'. In the greedy forward selection process of split rules, a candidate rule is added to the OR-clause only if the split performance (gini reduction or accuracy) after the addition, multiplied by greedy.level, is still greater than the split performance before the addition. A higher value of greedy.level tends to produce multi-variable splits more aggressively.

import.external

logical value indicating whether or not to try importing candidate split rules from other decision tree packages. Default is TRUE.

suppress.internal

logical value indicating whether or not to suppress the feature binarization process that creates the pool of binary features. If it is set to TRUE, then only the features imported from external methods (if import.external is TRUE) will be used in the optimal rule selection model. Default is FALSE.

no.same.gender.children

logical value indicating whether or not to suppress splits that would result in both children having the same majority class. Default is FALSE.

n0n1.cap

a positive integer, applicable only when opt.solver is 'hybrid' and opt.model is 'gini'. When the bslearn function is called, if the product of the number of negative cases (n0) and the number of positive cases (n1) is greater than this number, the 'enum' solver will be used; otherwise, the 'gurobi' solver will be used.

verbose

a logical value (TRUE or FALSE) indicating whether the solution details are to be printed on the screen.
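
The stopping and solver-selection rules described for node.size, stop.prob and n0n1.cap can be sketched in plain R. This is a minimal illustration only: the helper names effective_node_size, is_leaf and hybrid_solver are hypothetical and not part of bsnsing, and the package's internal code may differ.

```r
# Hypothetical helpers (NOT part of bsnsing) that mirror the documented rules.

# node.size = 0 means "choose automatically": floor(sqrt(number of training cases)).
effective_node_size <- function(n_train, node.size = 0) {
  if (node.size == 0) floor(sqrt(n_train)) else node.size
}

# A node becomes a leaf if it holds fewer than node.size cases, or if the
# proportion of its majority class is greater than stop.prob.
is_leaf <- function(n_node, majority_prop, node.size, stop.prob = 0.9999) {
  n_node < node.size || majority_prop > stop.prob
}

# With opt.solver = "hybrid" and opt.model = "gini", the documented choice
# depends on whether n0 * n1 is greater than n0n1.cap.
hybrid_solver <- function(n0, n1, n0n1.cap = 40000) {
  # as.numeric() avoids integer overflow when n0 and n1 are large integers
  if (as.numeric(n0) * as.numeric(n1) > n0n1.cap) "enum" else "gurobi"
}

effective_node_size(1000)            # 31, since sqrt(1000) is about 31.6
is_leaf(20, 0.80, node.size = 31)    # TRUE: the node is too small to split
hybrid_solver(300, 200)              # "enum": 60000 is greater than 40000
hybrid_solver(100, 200)              # "gurobi": 20000 is not greater than 40000
```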

Value

An object of class bscontrol.

Examples

bscontrol()  # display the default parameters
bsc <- bscontrol(stop.prob = 0.8, nseg.numeric = 10, verbose = TRUE)
bsc
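
The acceptance test described under greedy.level can also be tried out without calling the package. The sketch below assumes a hypothetical helper accept_rule (not a bsnsing function) and made-up split performance values; it only restates the documented inequality.

```r
# Hypothetical helper (NOT part of bsnsing): a candidate rule joins the OR-clause
# only if the performance after adding it, multiplied by greedy.level, is still
# greater than the performance before adding it.
accept_rule <- function(perf_before, perf_after, greedy.level = 0.9) {
  perf_after * greedy.level > perf_before
}

accept_rule(0.20, 0.23)                       # TRUE:  0.23 * 0.9  = 0.207
accept_rule(0.20, 0.22)                       # FALSE: 0.22 * 0.9  = 0.198
accept_rule(0.20, 0.22, greedy.level = 0.95)  # TRUE:  0.22 * 0.95 = 0.209
```

The last two calls show why a higher greedy.level admits more rules: the same improvement from 0.20 to 0.22 is rejected at 0.9 but accepted at 0.95.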

profyliu/bsnsing documentation built on July 5, 2022, 8:10 a.m.