trace_plot | R Documentation |
Trace plots are useful tools for visually comparing trees from a random forest. This function creates a trace plot from a set of trees in a random forest fit using the randomForest package. For more information on trace plots, see Urbanek (2008).
trace_plot(
rf,
train,
tree_ids,
width = 0.8,
alpha = 0.5,
tree_color = "black",
color_by_id = FALSE,
facet_by_id = FALSE,
id_order = NULL,
split_var_order = "rf_vi",
cont_var = NULL,
nrow = NULL,
max_depth = NULL,
rep_tree = NULL,
rep_tree_size = 1,
rep_tree_color = "blue",
rep_tree_alpha = 1
)
rf |
random forest model fit using randomForest |
train |
features used to train the random forest that the trees are from |
tree_ids |
vector of numbers specifying the trees to include in the trace plot |
width |
specifies the width of the horizontal feature lines in a trace plot (a number between 0 and 1; default is 0.8) |
alpha |
alpha value for the lines in the trace plot (a number between 0 and 1; default is 0.5) |
tree_color |
color of the traces (default is "black") |
color_by_id |
should the trace lines be colored by the tree IDs? (default is FALSE) |
facet_by_id |
should the traces be faceted by tree IDs? (default is FALSE) |
id_order |
order in which the trees should be arranged when facet_by_id is TRUE (optional) |
split_var_order |
order of the split variables on the x-axis (left to right) specified either manually as a vector of variable names or as "rf_vi" to indicate that the variables should be ordered by random forest variable importance (default is "rf_vi") |
cont_var |
continuous variable associated with the trees which can be used to color them (must be in the same order as tree_ids) (optional) |
nrow |
number of rows if facet_by_id is TRUE (otherwise ignored) |
max_depth |
the deepest depth to include in the trace plot (NULL by default) |
rep_tree |
option to add a "representative tree" on top of the trace plot by providing a data frame structured like the output of the get_tree_data function (NULL by default) |
rep_tree_size |
line size of "representative tree" (1 by default) |
rep_tree_color |
line color of "representative tree" ("blue" by default) |
rep_tree_alpha |
line alpha of "representative tree" (1 by default) |
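As a rough illustration of the split_var_order argument, the sketch below builds a manual variable ordering from the forest's own importance scores and passes it to trace_plot. It assumes the randomForest, palmerpenguins, and TreeTracer packages are installed. Ordering by the MeanDecreaseGini column is an assumption made for this example and is not necessarily how the "rf_vi" option computes its ordering internally; the names penguins_complete, features, and var_order are illustrative.

```r
library(palmerpenguins)
library(TreeTracer)

# Complete cases only, and the four numeric features used as predictors
penguins_complete <- na.omit(penguins)
features <- c("bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g")

# Fit a random forest
set.seed(71)
penguin_rf <- randomForest::randomForest(
  species ~ bill_length_mm + bill_depth_mm + flipper_length_mm + body_mass_g,
  data = penguins_complete
)

# Importance matrix; for a classification forest fit with default settings
# it contains a single column, "MeanDecreaseGini"
vi <- randomForest::importance(penguin_rf)

# Assumption for this sketch: order variables from most to least important
var_order <- rownames(vi)[order(vi[, "MeanDecreaseGini"], decreasing = TRUE)]

# Pass the manual ordering instead of the default "rf_vi"
trace_plot(
  rf = penguin_rf,
  train = penguins_complete[, features],
  tree_ids = 1:10,
  split_var_order = var_order
)
```

Any character vector of variable names can be supplied in the same way, for example to keep the x-axis order fixed across several plots.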
Urbanek (2008)
# Load packages
library(dplyr)
library(palmerpenguins)
library(TreeTracer)
# Load the Palmer penguins data
penguins <- na.omit(penguins)
# Fit a random forest
set.seed(71)
penguin_rf <-
randomForest::randomForest(
species ~ bill_length_mm + bill_depth_mm + flipper_length_mm + body_mass_g,
data = penguins
)
# Generate a trace plot of the first 10 trees in the forest
trace_plot(
rf = penguin_rf,
train = penguins %>% select(bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g),
tree_ids = 1:10
)
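The basic call can be varied with the other documented arguments. The sketch below is self-contained, so it refits the same forest and then draws two variations: one with each tree in its own facet, and one with the traces colored by tree ID and truncated at a fixed depth. The specific values (nrow = 2, max_depth = 3, alpha = 0.8) and the object names penguins_complete and penguin_features are illustrative choices for this example, not package defaults or recommendations.

```r
library(dplyr)
library(palmerpenguins)
library(TreeTracer)

# Complete cases and the training features
penguins_complete <- na.omit(penguins)
penguin_features <- penguins_complete %>%
  select(bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g)

# Fit a random forest
set.seed(71)
penguin_rf <- randomForest::randomForest(
  species ~ bill_length_mm + bill_depth_mm + flipper_length_mm + body_mass_g,
  data = penguins_complete
)

# Variation 1: one facet per tree, arranged in two rows
trace_plot(
  rf = penguin_rf,
  train = penguin_features,
  tree_ids = 1:10,
  facet_by_id = TRUE,
  nrow = 2
)

# Variation 2: color traces by tree ID and keep only the first three levels
trace_plot(
  rf = penguin_rf,
  train = penguin_features,
  tree_ids = 1:10,
  color_by_id = TRUE,
  alpha = 0.8,
  max_depth = 3
)
```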