The ffcr package constructs two transparent classification models: fast-and-frugal trees and tallying models. The book Classification in the Wild: The Science and Art of Transparent Decision Making (Katsikopoulos et al., 2020) describes these models, their applications, and the algorithms to construct them in detail.
A fast-and-frugal tree is a decision tree with a simple structure: one branch of each node exits the tree, while the other continues to the next node, until the final node is reached, where both branches exit. A tallying model gives all pieces of evidence the same weight. This package implements several methods for training these two models, ranging from simple heuristics to computationally more complex cross-entropy optimization (Rubinstein, 1999).
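To make the two model structures concrete, here is a minimal base R sketch that does not use ffcr at all. The cue names (`bilirubin`, `age`, `enzyme`), the thresholds, and the functions `fft_predict` and `tally_predict` are all invented for illustration; they are not the package's API and are not fitted to any data.

```r
# Hypothetical fast-and-frugal tree with three cues.
# Each node checks one cue; one branch exits with a prediction,
# the other branch moves on to the next node.
fft_predict <- function(x) {
  # Node 1: exit with "disease" if the cue is met, otherwise continue.
  if (x$bilirubin > 2) return("disease")
  # Node 2: exit with "healthy" if the cue is met, otherwise continue.
  if (x$age <= 30) return("healthy")
  # Final node: both branches exit.
  if (x$enzyme > 60) "disease" else "healthy"
}

# Hypothetical tallying model on the same cues.
# Every cue that points to "disease" counts as one vote (unit weight);
# the prediction is "disease" if the tally reaches the threshold.
tally_predict <- function(x, threshold = 2) {
  votes <- c(x$bilirubin > 2, x$age > 30, x$enzyme > 60)
  if (sum(votes) >= threshold) "disease" else "healthy"
}

# Three made-up patients.
patient_a <- list(bilirubin = 3.1, age = 25, enzyme = 40)
patient_b <- list(bilirubin = 1.0, age = 45, enzyme = 80)
patient_c <- list(bilirubin = 1.0, age = 25, enzyme = 80)

fft_predict(patient_a)    # exits at node 1
tally_predict(patient_a)  # only one cue votes for "disease"
```

The two models can disagree on the same case: the tree lets a single early cue decide, whereas the tallying model needs enough cues to agree.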
The package can be installed with the following command:
# install.packages("devtools")
devtools::install_github("marcusbuckmann/ffcr")
Alternatively, Windows users can download the binary file ffcr_1.0.zip and install it from the hard drive.
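As a rough sketch of the local installation, assuming the zip file was saved to the hypothetical path below (adjust it to the actual download location):

```r
# Hypothetical location of the downloaded binary; change this to
# wherever ffcr_1.0.zip was actually saved.
zip_path <- file.path("C:", "Downloads", "ffcr_1.0.zip")

# repos = NULL tells install.packages() to install from a local file
# instead of looking the package up in a repository.
if (file.exists(zip_path)) {
  install.packages(zip_path, repos = NULL, type = "win.binary")
} else {
  message("File not found: ", zip_path)
}
```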
We start by training a fast-and-frugal tree on a medical data set, predicting whether patients have liver disease (Ramana, Babu, and Venkateswarlu, 2011; Dua and Graff, 2017). The data set is included in the package.
library(ffcr)
model <- fftree(diagnosis ~ ., data = liver, use_features_once = FALSE, method = "greedy", max_depth = 4)
Printing the model shows the structure of the tree and its performance.
print(model)
To visualize the tree, we use the plot function:
plot(model)
knitr::include_graphics("man/figures/tree.png")
To make predictions according to the fast-and-frugal tree, we can use the predict function.
pred <- predict(model, newdata = liver[1:10, ])
pred
To train tallying models, we use the tally function. To make predictions we use the same command as for the fast-and-frugal trees.
model <- tally(diagnosis ~ ., data = liver, max_size = 4)
print(model)
pred <- predict(model, newdata = liver[1:10, ])
Please consult the vignette and the documentation of the package for more details on how to use it. The book Classification in the Wild provides background information on fast-and-frugal trees and tallying models.
Dua, Dheeru, and Casey Graff. 2017. “UCI Machine Learning Repository.” University of California, Irvine, School of Information and Computer Sciences. http://archive.ics.uci.edu/ml.
Katsikopoulos, Konstantinos V., Özgür Şimşek, Marcus Buckmann, and Gerd Gigerenzer. 2020. Classification in the Wild: The Science and Art of Transparent Decision Making. MIT Press.
Ramana, Bendi Venkata, M Surendra Prasad Babu, and N. B. Venkateswarlu. 2011. “A Critical Study of Selected Classification Algorithms for Liver Disease Diagnosis.” International Journal of Database Management Systems 3 (2): 101–114.
Rubinstein, Reuven. 1999. “The Cross-Entropy Method for Combinatorial and Continuous Optimization.” Methodology and Computing in Applied Probability 1 (2): 127–190.