rf_tidiers: Tidying methods for a randomForest model
In njtierney/broomstick: Convert Decision Tree Objects into Tidy Data Frames

augment.randomForest

R Documentation

Tidying methods for a randomForest model

Description

These methods tidy the variable importance of a random forest model, augment the original data with information on the fitted values/classifications and error, and construct a one-row glance of the model's statistics (one row per class for classification models).

Usage

## S3 method for class 'randomForest'
augment(x, data = NULL, ...)

## S3 method for class 'randomForest'
glance(x, ...)

## S3 method for class 'randomForest'
tidy(x, ...)

Arguments

`x`	randomForest object
`data`	Model data for use by `augment.randomForest()`.
`...`	Additional arguments (ignored)
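
A minimal usage sketch of the three methods, assuming the randomForest and broomstick packages are installed and that attaching broomstick makes the `tidy()`, `glance()` and `augment()` generics available. The `iris` data set, the seed and the object name `rf_fit` are illustrative choices, not part of the package documentation.

```r
library(randomForest)
library(broomstick)  # assumed to provide the randomForest methods documented here

set.seed(2024)  # arbitrary seed, for reproducibility only

# importance = TRUE is needed for the accuracy-based importance columns
rf_fit <- randomForest(Species ~ ., data = iris, importance = TRUE)

tidy(rf_fit)                  # variable importance, one row per model term
glance(rf_fit)                # model-level statistics
augment(rf_fit, data = iris)  # original data plus fitted values and OOB counts
```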

Value

augment.randomForest returns the original data with additional columns:

`.oob_times`	The number of trees for which the given case was "out of bag". See `randomForest::randomForest()` for more details.
`.fitted`	The fitted value or class.

augment.randomForest returns additional columns for classification and unsupervised trees:

`.votes`	For each case, the voting results, with one column per class.
`.local_var_imp`	The casewise variable importance, stored as data frames in a nested list-column, with one row per variable in the model. Only present if the model was created with `importance = TRUE`.
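
The case-level columns above plausibly come from components stored on the fitted `randomForest` object. The sketch below uses only the randomForest package to show where that information lives; the mapping to the `.oob_times` and `.fitted` column names is an assumption made for illustration, not a statement of how `augment.randomForest()` is implemented. Note that randomForest itself computes the casewise importance only when `localImp = TRUE` is set.

```r
library(randomForest)

set.seed(1)  # arbitrary seed
rf_fit <- randomForest(Species ~ ., data = iris,
                       importance = TRUE, localImp = TRUE)

# Components of the fitted object that hold case-level information
oob_times <- rf_fit$oob.times        # times each case was out of bag
fitted    <- rf_fit$predicted        # out-of-bag predicted class per case
votes     <- rf_fit$votes            # one column per class
local_imp <- rf_fit$localImportance  # variables in rows, cases in columns

# Binding two of them to the data by hand (names chosen to mirror the docs)
augmented <- cbind(iris, .oob_times = oob_times, .fitted = fitted)

head(augmented)
dim(votes)      # cases by classes
dim(local_imp)  # predictors by cases
```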

glance.randomForest returns a data.frame with the following columns for regression trees:

`mse`	The average mean squared error across all trees.
`rsq`	The average pseudo-R-squared across all trees. See `randomForest::randomForest()` for more information.
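
For a regression forest, randomForest stores `mse` and `rsq` as vectors with one entry per forest size (1, 2, ..., `ntree` trees). The documentation does not state how the averages are formed; the sketch below assumes they are plain means of those two components. The `mtcars` data and `ntree = 200` are illustrative.

```r
library(randomForest)

set.seed(1)  # arbitrary seed
rf_reg <- randomForest(mpg ~ ., data = mtcars, ntree = 200)

# Both components have one entry per forest size
length(rf_reg$mse)
length(rf_reg$rsq)

# One way to collapse them into a one-row summary (assumed, not confirmed)
reg_glance <- data.frame(mse = mean(rf_reg$mse), rsq = mean(rf_reg$rsq))
reg_glance
```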

For classification trees, it returns one row per class, with the following columns:

`precision`
`recall`
`accuracy`
`f_measure`
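
The four columns are not defined further here. As a rough guide, the sketch below computes the standard one-vs-rest versions of these metrics from a confusion matrix with observed classes in rows and predicted classes in columns. The helper `per_class_metrics()` is hypothetical, the counts are made up, and the package's own definitions may differ.

```r
# Hypothetical helper: per-class metrics from a square confusion matrix
# with observed classes in rows and predicted classes in columns.
per_class_metrics <- function(cm) {
  tp <- diag(cm)              # correctly predicted cases per class
  fp <- colSums(cm) - tp      # predicted as the class, but observed as another
  fn <- rowSums(cm) - tp      # observed as the class, but predicted as another
  tn <- sum(cm) - tp - fp - fn
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  data.frame(
    class     = rownames(cm),
    precision = precision,
    recall    = recall,
    accuracy  = (tp + tn) / sum(cm),
    f_measure = 2 * precision * recall / (precision + recall),
    row.names = NULL
  )
}

# Made-up counts, for illustration only
classes <- c("setosa", "versicolor", "virginica")
cm <- matrix(c(50,  0,  0,
                0, 47,  3,
                0,  4, 46),
             nrow = 3, byrow = TRUE,
             dimnames = list(observed = classes, predicted = classes))

metrics <- per_class_metrics(cm)
metrics
```

With a fitted classification forest, a comparable matrix is stored in the `confusion` component of the `randomForest` object, whose last column, `class.error`, would need to be dropped first.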

All tidying methods return a data.frame without rownames. The structure depends on the method chosen.

tidy.randomForest returns one row for each model term, with the following columns:

`term`	The term in the randomForest model
`MeanDecreaseAccuracy`	A measure of variable importance. See `randomForest::randomForest()` for more information. Only present if the model was created with `importance = TRUE`.
`MeanDecreaseGini`	A measure of variable importance. See `randomForest::randomForest()` for more information.
`MeanDecreaseAccuracy_sd`	Standard deviation of `MeanDecreaseAccuracy`. See `randomForest::randomForest()` for more information. Only present if the model was created with `importance = TRUE`.
`classwise_importance`	Classwise variable importance for each term, stored as data frames in a nested list-column, with one row per class. Only present if the model was created with `importance = TRUE`.
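
To make the shape of this output concrete, the sketch below builds a similar table by hand from the `importance` and `importanceSD` components of a classification forest, using only the randomForest package. It is an illustration under the assumption that the tidy columns are drawn from these components, not a copy of the `tidy.randomForest()` implementation; the nested `classwise_importance` column is left out.

```r
library(randomForest)

set.seed(1)  # arbitrary seed
rf_fit <- randomForest(Species ~ ., data = iris, importance = TRUE)

imp    <- rf_fit$importance    # per-class columns plus the two overall measures
imp_sd <- rf_fit$importanceSD  # spread of the permutation-based measures

# One row per term, with the term names moved out of the rownames
tidy_imp <- data.frame(
  term                    = rownames(imp),
  MeanDecreaseAccuracy    = imp[, "MeanDecreaseAccuracy"],
  MeanDecreaseGini        = imp[, "MeanDecreaseGini"],
  MeanDecreaseAccuracy_sd = imp_sd[, "MeanDecreaseAccuracy"],
  row.names = NULL
)
tidy_imp
```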