Makes handling output from decision trees easy. Treezy.
Decision trees are a commonly used tool in statistics and data science, but sometimes getting the information out of them can be a bit tricky, and can make other operations in a pipeline difficult.
treezy makes it easy to:
The data structures created in treezy (importance_table) are making their way over to the broomstick package, a member of the broom family that focuses specifically on decision trees, which give different output from many of the (many!) packages/analyses that broom deals with.
I am interested in feedback, so please feel free to file an issue if you have any problems!
# install.packages("remotes")
remotes::install_github("njtierney/treezy")
importance_table and importance_plot
library(treezy)
library(rpart)

fit_rpart_kyp <- rpart(Kyphosis ~ ., data = kyphosis)
# default method for looking at variable importance
fit_rpart_kyp$variable.importance

# with treezy
importance_table(fit_rpart_kyp)
importance_plot(fit_rpart_kyp)

# extend and modify
library(ggplot2)
importance_plot(fit_rpart_kyp) +
  theme_bw() +
  labs(title = "My Importance Scores",
       subtitle = "For a CART Model")
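To make the contrast concrete, here is a minimal base-R sketch of the kind of tidying involved: it reshapes the named numeric vector that rpart stores in variable.importance into a data frame. This is an illustration only and not treezy's implementation; the helper name tidy_importance_sketch and the column names variable and importance are assumptions made for this example.

```r
library(rpart)

fit_rpart_kyp <- rpart(Kyphosis ~ ., data = kyphosis)

# Hypothetical helper (not part of treezy): turn the named vector of
# importance scores into a two-column data frame.
tidy_importance_sketch <- function(fit) {
  imp <- fit$variable.importance
  data.frame(
    variable = names(imp),
    importance = unname(imp),
    stringsAsFactors = FALSE
  )
}

imp_tbl <- tidy_importance_sketch(fit_rpart_kyp)
imp_tbl
```

A data frame like this can be passed straight to ggplot2 or joined with other model summaries, which is awkward with a bare named vector.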
library(randomForest)
set.seed(131)

fit_rf_ozone <- randomForest(Ozone ~ .,
                             data = airquality,
                             mtry = 3,
                             importance = TRUE,
                             na.action = na.omit)
fit_rf_ozone

# Show "importance" of variables: higher values mean more important.
# randomForest has a better importance method than rpart
importance(fit_rf_ozone)

# use importance_table
importance_table(fit_rf_ozone)

# now plot it
importance_plot(fit_rf_ozone)
# CART
rss(fit_rpart_kyp)

# randomForest
rss(fit_rf_ozone)
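For readers unfamiliar with the quantity, the residual sum of squares is the sum of squared differences between observed and fitted values. The sketch below computes it by hand for a regression tree. It uses the textbook definition and may not match how treezy's rss handles each model type; the mtcars data set and the object names are choices made for this example only.

```r
library(rpart)

# Regression tree on a built-in data set (chosen for this sketch only)
fit_rpart_mpg <- rpart(mpg ~ ., data = mtcars)

# Residual sum of squares: sum of squared (observed - fitted) values
fitted_mpg <- predict(fit_rpart_mpg)
rss_manual <- sum((mtcars$mpg - fitted_mpg)^2)
rss_manual
```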
# using gbm.step from the dismo package
library(gbm)
library(dismo)

# load data
data(Anguilla_train)
anguilla_train <- Anguilla_train[1:200, ]

# fit model
angaus_tc_5_lr_01 <- gbm.step(data = anguilla_train,
                              gbm.x = 3:14,
                              gbm.y = 2,
                              family = "bernoulli",
                              tree.complexity = 5,
                              learning.rate = 0.01,
                              bag.fraction = 0.5)
gg_partial_plot(angaus_tc_5_lr_01, var = c("SegSumT", "SegTSeas"))
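As background on what a partial effect is, the following sketch computes a brute-force partial dependence curve: fix one variable at each value on a grid for every row, predict, and average the predictions. It is a generic illustration on an rpart model and is not necessarily how gg_partial_plot or gbm compute their values; the helper name partial_dependence_sketch, its arguments, and the use of mtcars are assumptions for this example.

```r
library(rpart)

fit_rpart_mpg <- rpart(mpg ~ ., data = mtcars)

# Hypothetical helper (not part of treezy): average prediction as a
# function of one variable, with all other columns left as observed.
partial_dependence_sketch <- function(fit, data, var, n_grid = 10) {
  grid <- seq(min(data[[var]]), max(data[[var]]), length.out = n_grid)
  avg_pred <- vapply(grid, function(value) {
    data_fixed <- data
    data_fixed[[var]] <- value  # hold var at one value for every row
    mean(predict(fit, newdata = data_fixed))
  }, numeric(1))
  data.frame(value = grid, prediction = avg_pred)
}

pd_wt <- partial_dependence_sketch(fit_rpart_mpg, mtcars, "wt")
pd_wt
```

Plotting prediction against value gives the partial effect curve for that variable.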
gbm, tree, ranger, xgboost, and more)
broom's augment, tidy, and glance functions. For example, rpart_fit$splits
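For reference, rpart stores split details as a numeric matrix on the fitted object, with the split variable names as (possibly repeated) row names. The sketch below shows one way such a matrix could be turned into a data frame. It is not an existing treezy, broomstick, or broom function, and the object names are assumptions for this example.

```r
library(rpart)

fit_rpart_kyp <- rpart(Kyphosis ~ ., data = kyphosis)

# The splits matrix uses variable names as row names, and a variable
# can appear more than once, so move the names into a column first.
splits_mat <- fit_rpart_kyp$splits
split_vars <- rownames(splits_mat)
rownames(splits_mat) <- NULL

splits_tbl <- data.frame(
  variable = split_vars,
  splits_mat,
  stringsAsFactors = FALSE
)
head(splits_tbl)
```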
Credit for the name, "treezy", goes to @MilesMcBain, thanks Miles!