bsnsing-package | R Documentation |
The bsnsing package provides functions for building a decision tree classifier and making predictions. It solves a mixed-integer programming (MIP) model to maximize the Gini reduction at each node split, and each split rule can use multiple input variables. Benchmarking experiments on 75 open data sets suggest that bsnsing trees discriminate new cases better than trees trained by other decision tree codes, including the rpart, C50, party and tree packages in R. Compared to other optimal decision tree packages, including DL8.5, OSDT and GOSDT (and, indirectly, others), bsnsing stands out in training speed, ease of use and breadth of applicability, without loss of prediction accuracy. For more information, see the paper at https://arxiv.org/abs/2205.15263, to be published in INFORMS Journal on Computing.
The default method for solving the MIP model is the implicit enumeration (ENUM) algorithm; other solvers, including GUROBI, CPLEX and lpSolve, can be selected through the opt.solver option of the bsnsing function. Users are strongly encouraged to compile the bslearn.c file into a shared library (e.g., a .dylib, .so or .dll binary file) and place the binary file in the working directory. bsnsing will then use the compiled code (instead of the R code) for the ENUM algorithm, which runs much (~40x) faster. All benchmarking experiments were run using the compiled ENUM algorithm. The C source file and the MAKE file can be found at https://github.com/profyliu/bsnsing. Pre-compiled binary files for different target platforms are also provided there for convenience (download the .dylib, .so or .dll file, depending on the operating system, and put it in the working directory). Future updates of this package will internalize the compilation step, but for now only the R implementation of the ENUM algorithm is included in the package source, so serious users should take the extra step outlined above.
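As a quick sanity check before training, the sketch below looks for a compiled ENUM binary in the working directory. The base file name bslearn is an assumption inferred from the name of the C source file; the actual binary name may differ, so compare it against the files published in the GitHub repository.

```r
# Sketch only: the file names below are assumed from "bslearn.c" and are
# not confirmed by the package documentation.
candidates <- paste0("bslearn", c(".so", ".dylib", ".dll"))

# TRUE for each candidate binary present in the current working directory
found <- file.exists(file.path(getwd(), candidates))
names(found) <- candidates
print(found)

if (!any(found)) {
  message("No compiled binary found in ", getwd(),
          "; the slower R implementation of ENUM would be used.")
}
```

If none of the candidates is found, either the binary has not been copied into the working directory or it is named differently from what is assumed here.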
Several data frames (i.e., auto, iris, GlaucomaMVF and BreastCancer) used in the example code are included in this package. More two-class and multi-class classification data sets can be found at https://github.com/profyliu/bsnsing.
The learn (train) functions include bsnsing, bsnsing.formula and bsnsing.default.
The predict functions include predict.bsnsing and predict.mbsnsing.
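A typical train-and-predict workflow with these functions might look like the following minimal sketch. It assumes that the package is installed, that the formula interface accepts the usual formula and data arguments, and that predict accepts type = "class"; the 70/30 split and the seed are arbitrary choices made for illustration.

```r
# Sketch only: split the iris data, fit a tree, and predict the held-out rows.
set.seed(2022)  # arbitrary seed, for a reproducible split
n <- nrow(iris)
train_index <- sample(seq_len(n), round(0.7 * n))
test_index <- setdiff(seq_len(n), train_index)

# The model fitting is skipped when bsnsing is not installed.
if (requireNamespace("bsnsing", quietly = TRUE)) {
  library(bsnsing)
  # Species has three classes, so this is a multi-class fit
  bs <- bsnsing(Species ~ ., data = iris[train_index, ])
  pred <- predict(bs, iris[test_index, ], type = "class")
  # Confusion table of predicted versus actual classes on the test rows
  print(table(predicted = pred, actual = iris$Species[test_index]))
}
```

Subsetting the data frame before the call keeps the example independent of any subset argument handling in the fitting function.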
A bsnsing object (tree) can be plotted into a PDF file, or rendered as LaTeX code, by the function show.bsnsing. The ROC curve can be plotted using the function ROC_func.
Here is a list of internal functions of the package that are open for end users: summary.bsnsing, summary.mbsnsing, binarize, binarize.numeric, binarize.factor, binarize.y, bslearn, bscontrol.
Yanchao Liu