| FFTrees | R Documentation |
FFTrees is the workhorse function of the FFTrees package for creating fast-and-frugal trees (FFTs).
FFTs are decision algorithms for solving binary classification tasks, i.e., they predict the values of a binary criterion variable based on one or more predictor variables (cues).
Using FFTrees on data usually generates a range of FFTs and corresponding summary statistics (as an FFTrees object)
that can then be printed, plotted, and examined further.
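To make the idea of an FFT concrete, the following base R sketch applies a three-cue tree by hand, without calling the FFTrees package. The toy data, the cue names (age, sex, chol), the thresholds, and the helper apply_fft_sketch() are illustrative assumptions, not part of the package API. Each cue is checked in order, and a case exits the tree as soon as a cue triggers a decision.

```r
# Hypothetical toy data (not the package's heart disease data):
toy <- data.frame(
  age  = c(45, 60, 62, 55),
  sex  = c(1, 1, 0, 0),
  chol = c(320, 200, 310, 250)
)

# A minimal FFT sketch: check cues in order and exit at the first cue that decides.
# The final cue has two exits, so every case receives a decision.
apply_fft_sketch <- function(d) {
  decision <- rep(NA, nrow(d))
  for (i in seq_len(nrow(d))) {
    if (d$age[i] < 50) {
      decision[i] <- FALSE             # exit 1: predict False
    } else if (d$sex[i] == 1) {
      decision[i] <- TRUE              # exit 2: predict True
    } else {
      decision[i] <- d$chol[i] > 300   # final node: True if chol > 300, else False
    }
  }
  decision
}

toy$decision <- apply_fft_sketch(toy)
print(toy)
```

Because most cases exit before the last cue, such a tree typically needs only part of the available information to reach a decision.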
The criterion and predictor variables are specified in formula notation.
Based on the settings of data and data.test, FFTs are trained on a (required) training dataset
(given the set of current goal values) and evaluated on (i.e., used to predict) an (optional) test dataset.
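When no separate test set is available, a single data frame can be split before fitting. The sketch below shows one way to do this in base R. The helper split_train_test() and its arguments p and seed are hypothetical names, and the random split is an assumption about what a training proportion (such as train.p) implies, not a description of the package's internal procedure.

```r
# Hypothetical helper: randomly assign a proportion p of rows to training.
split_train_test <- function(d, p = 0.5, seed = 1) {
  stopifnot(is.data.frame(d), p > 0, p <= 1)
  set.seed(seed)                                  # make the split reproducible
  n_train    <- round(p * nrow(d))
  train_rows <- sample.int(nrow(d), size = n_train)
  test_rows  <- setdiff(seq_len(nrow(d)), train_rows)
  list(train = d[train_rows, , drop = FALSE],
       test  = d[test_rows, , drop = FALSE])
}

# Toy data for illustration only:
toy_data <- data.frame(x = 1:10, y = rep(c(TRUE, FALSE), 5))
parts <- split_train_test(toy_data, p = 0.7)
nrow(parts$train)  # 7
nrow(parts$test)   # 3
```

The two parts could then be supplied as data and data.test, respectively.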
If an existing FFTrees object (object) or tree.definitions are provided as inputs,
no new FFTs are created.
When both arguments are provided, tree.definitions take priority over the FFTs in an existing object.
Specifically:
If tree.definitions are provided, these are assigned to the FFTs of x.
If no tree.definitions are provided, but an existing FFTrees object (object) is provided,
the trees from object are assigned to the FFTs of x.
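The precedence rule above can be summarized as a small decision function. This is only an illustrative sketch: choose_tree_source() is a hypothetical helper, not a function of the FFTrees package, and it merely reports which input would supply the FFTs.

```r
# Hypothetical helper mirroring the documented precedence:
# tree.definitions > existing object > new FFTs created from data.
choose_tree_source <- function(object = NULL, tree.definitions = NULL) {
  if (!is.null(tree.definitions)) {
    "tree.definitions"
  } else if (!is.null(object)) {
    "object"
  } else {
    "new FFTs"
  }
}

# Placeholder inputs (empty list / data frame) stand in for real objects:
choose_tree_source()                                                  # "new FFTs"
choose_tree_source(object = list())                                   # "object"
choose_tree_source(object = list(), tree.definitions = data.frame())  # "tree.definitions"
```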
Create and evaluate fast-and-frugal trees (FFTs).
FFTrees(
  formula = NULL,
  data = NULL,
  data.test = NULL,
  algorithm = "ifan",
  train.p = 1,
  goal = NULL,
  goal.chase = NULL,
  goal.threshold = NULL,
  max.levels = NULL,
  numthresh.method = "o",
  numthresh.n = 10,
  repeat.cues = TRUE,
  stopping.rule = "exemplars",
  stopping.par = 0.1,
  sens.w = 0.5,
  cost.outcomes = NULL,
  cost.cues = NULL,
  main = NULL,
  decision.labels = c("False", "True"),
  my.goal = NULL,
  my.goal.fun = NULL,
  my.tree = NULL,
  object = NULL,
  tree.definitions = NULL,
  do.comp = TRUE,
  do.cart = TRUE,
  do.lr = TRUE,
  do.rf = TRUE,
  do.svm = TRUE,
  quiet = list(ini = TRUE, fin = FALSE, mis = FALSE, set = TRUE),
  comp = NULL,
  force = NULL,
  rank.method = NULL,
  rounding = NULL,
  store.data = NULL,
  verbose = NULL
)
formula |
A formula. A |
data |
A data frame. A dataset used for training (fitting) FFTs and alternative algorithms.
|
data.test |
A data frame. An optional dataset used for model testing (prediction) with the same structure as data. |
algorithm |
A character string. The algorithm used to create FFTs. Can be |
train.p |
numeric. What percentage of the data to use for training when |
goal |
A character string indicating the statistic to maximize when selecting trees:
|
goal.chase |
A character string indicating the statistic to maximize when constructing trees:
|
goal.threshold |
A character string indicating the criterion to maximize when optimizing cue thresholds:
|
max.levels |
integer. The maximum number of nodes (or levels) considered for an FFT.
As all combinations of possible exit structures are considered, larger values of |
numthresh.method |
How should thresholds for numeric cues be determined (as character)?
|
numthresh.n |
The number of numeric thresholds to try (as integer).
Default: |
repeat.cues |
May cues occur multiple times within a tree (as logical)?
Default: |
stopping.rule |
A character string indicating the method to stop growing trees. Available options are:
All stopping methods use |
stopping.par |
numeric. A numeric parameter indicating the criterion value for the current |
sens.w |
A numeric value from |
cost.outcomes |
A list of length 4 specifying the cost value for each of the 4 possible classification outcomes.
The list elements must be named |
cost.cues |
A list containing the cost of each cue (in some common unit).
Each list element must have a name corresponding to a cue (i.e., a variable in |
main |
string. An optional label for the dataset. Passed on to other functions, like |
decision.labels |
A vector of strings of length 2 for the text labels for negative and positive decision/prediction outcomes
(i.e., left vs. right, noise vs. signal, 0 vs. 1, respectively, as character).
E.g.; |
my.goal |
The name of an optimization measure defined by |
my.goal.fun |
The definition of an outcome measure to optimize, defined as a function
of the frequency counts of the 4 basic classification outcomes |
my.tree |
A verbal description of an FFT, i.e., an "FFT in words" (as character string).
For example, |
object |
An optional existing |
tree.definitions |
An optional |
do.comp, do.lr, do.cart, do.svm, do.rf |
Should alternative algorithms be used for comparison (as logical)?
All options are set to
Specifying |
quiet |
A list of 4 logical arguments: Should detailed progress reports be suppressed?
Setting list elements to |
comp, force, rank.method, rounding, store.data, verbose |
Deprecated arguments (unused or replaced, to be retired in future releases). |
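Several of the arguments above (sens.w, cost.outcomes, my.goal.fun) refer to the four basic classification outcomes. The base R sketch below counts those outcomes and derives sensitivity, specificity, a sensitivity-weighted accuracy, and a mean outcome cost. The outcome names hi, fa, mi, and cr (hits, false alarms, misses, correct rejections), the weighting formula, and the cost values are assumptions made for illustration. The helper classification_stats() is hypothetical and not part of the package.

```r
# Hypothetical helper: summarize binary decisions against a binary criterion.
# Assumed names: hi = hit, fa = false alarm, mi = miss, cr = correct rejection.
classification_stats <- function(decision, criterion, sens.w = 0.5,
                                 cost.outcomes = list(hi = 0, fa = 1, mi = 1, cr = 0)) {
  stopifnot(is.logical(decision), is.logical(criterion),
            length(decision) == length(criterion))
  hi <- sum(decision & criterion)      # predicted True,  actually True
  fa <- sum(decision & !criterion)     # predicted True,  actually False
  mi <- sum(!decision & criterion)     # predicted False, actually True
  cr <- sum(!decision & !criterion)    # predicted False, actually False
  sens <- hi / (hi + mi)
  spec <- cr / (cr + fa)
  # Weighted accuracy: sens.w sets the weight of sensitivity relative to specificity.
  wacc <- sens.w * sens + (1 - sens.w) * spec
  # Mean cost per case, given one cost value per outcome type.
  cost <- (hi * cost.outcomes$hi + fa * cost.outcomes$fa +
           mi * cost.outcomes$mi + cr * cost.outcomes$cr) / length(decision)
  c(hi = hi, fa = fa, mi = mi, cr = cr,
    sens = sens, spec = spec, wacc = wacc, cost = cost)
}

# Toy vectors for illustration only:
criterion <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)
decision  <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
stats <- classification_stats(decision, criterion, sens.w = 0.75)
print(round(stats, 3))
```

With sens.w = 0.5, the weighted accuracy in this sketch reduces to the mean of sensitivity and specificity (balanced accuracy).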
An FFTrees object with the following elements:
The name of the binary criterion variable (as character).
The names of all potential predictor variables (cues) in the data (as character).
The formula specified when creating the FFTs.
A list of FFTs created, with further details contained in n, best, definitions, inwords, stats, level_stats, and decisions.
The original training and test data (if available).
A list of defined control parameters (e.g., algorithm, goal, sens.w, as well as various thresholds, stopping rule, and cost parameters).
Models and classification statistics for competing classification algorithms:
Logistic regression (lr), classification and regression trees (cart), random forests (rf), and support vector machines (svm).
A list of cue information, with further details contained in thresholds and stats.
print.FFTrees for printing FFTs;
plot.FFTrees for plotting FFTs;
summary.FFTrees for summarizing FFTs;
inwords for obtaining a verbal description of FFTs;
showcues for plotting cue accuracies.
# 1. Create fast-and-frugal trees (FFTs) for heart disease:
heart.fft <- FFTrees(formula = diagnosis ~ .,
                     data = heart.train,
                     data.test = heart.test,
                     main = "Heart Disease",
                     decision.labels = c("Healthy", "Diseased")
)

# 2. Print a summary of the result:
heart.fft  # same as:
# print(heart.fft, data = "train", tree = "best.train")

# 3. Plot an FFT applied to training data:
plot(heart.fft)  # same as:
# plot(heart.fft, what = "all", data = "train", tree = "best.train")

# 4. Apply FFT to (new) testing data:
plot(heart.fft, data = "test")            # predict for Tree 1
plot(heart.fft, data = "test", tree = 2)  # predict for Tree 2

# 5. Predict classes and probabilities for new data:
predict(heart.fft, newdata = heartdisease)
predict(heart.fft, newdata = heartdisease, type = "prob")

# 6. Create a custom tree (from verbal description) with my.tree:
custom.fft <- FFTrees(
  formula = diagnosis ~ .,
  data = heartdisease,
  my.tree = "If age < 50, predict False.
             If sex = 1, predict True.
             If chol > 300, predict True, otherwise predict False.",
  main = "My custom FFT")

# Plot the (pretty bad) custom tree:
plot(custom.fft)