rfPermute: Estimate Permutation p-values for Random Forest Importance...

rfPermute    R Documentation

Estimate Permutation p-values for Random Forest Importance Metrics

Description

Estimate the significance of importance metrics for a Random Forest model by permuting the response variable. Produces a null distribution of each importance metric for each predictor variable, and a p-value for each observed importance value.

Usage

rfPermute(x, ...)

## Default S3 method:
rfPermute(x, y = NULL, ..., num.rep = 100, num.cores = 1)

## S3 method for class 'formula'
rfPermute(
  formula,
  data = NULL,
  ...,
  subset,
  na.action = na.fail,
  num.rep = 100,
  num.cores = 1
)

as.randomForest(x)

## S3 method for class 'rfPermute'
print(x, ...)

## S3 method for class 'rfPermute'
predict(object, ...)

Arguments

x, y, formula, data, subset, na.action, ...

See randomForest for definitions. In as.randomForest, x is either a randomForest or an rfPermute object to be converted to a randomForest object.

num.rep

Number of permutation replicates to run to construct the null distribution and calculate p-values (default = 100).

num.cores

Number of CPUs to distribute the permutation replicates over (default = 1, as shown in Usage). If NULL, one fewer than the number of cores reported by detectCores is used.

object

An rfPermute model to be used for prediction. See predict.randomForest.

Details

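All other parameters are as defined in randomForest.formula. A Random Forest model is first created as normal to calculate the observed values of variable importance. The response variable is then permuted num.rep times, with a new Random Forest model built for each permutation step. The importance values from the permuted models form the null distribution against which each observed value is compared to obtain its p-value.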
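
The procedure described above can be sketched directly with the randomForest package. This is a minimal illustration under stated assumptions, not rfPermute's own implementation: the helper name imp_for, the choice of the %IncMSE metric (type = 1), the small replicate count, and the p-value formula (number of null values at least as large as the observed value, plus one, divided by num.rep plus one) are all choices made for this example.

```r
library(randomForest)

set.seed(1)

# Complete cases only, split into predictors and response
aq <- na.omit(airquality)
x <- aq[, setdiff(names(aq), "Ozone")]
y <- aq$Ozone

# Hypothetical helper: fit a forest and return %IncMSE for each predictor
imp_for <- function(x, y, ntree = 100) {
  rf <- randomForest(x, y, ntree = ntree, importance = TRUE)
  importance(rf, type = 1)[, 1]
}

# Observed importance from the unpermuted response
obs <- imp_for(x, y)

# Null distribution: refit after shuffling the response each time
# (result has one row per predictor and one column per replicate)
num.rep <- 20
null <- replicate(num.rep, imp_for(x, sample(y)))

# Assumed p-value estimator: share of null values >= observed,
# with +1 in numerator and denominator so p is never exactly zero
p <- (rowSums(null >= obs) + 1) / (num.rep + 1)
p
```

Shuffling y breaks any association between the response and the predictors while leaving the predictors' joint structure intact, so the importance values from the shuffled fits show what would be expected by chance alone.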

Value

An rfPermute object.

Author(s)

Eric Archer eric.archer@noaa.gov

Examples

# A regression model predicting ozone levels
data(airquality)
ozone.rp <- rfPermute(Ozone ~ ., data = airquality, na.action = na.omit, ntree = 100, num.rep = 50)
ozone.rp
  
# Plot the scaled importance distributions 
# Significant (p <= 0.05) predictors are in red
plotImportance(ozone.rp, scale = TRUE)

# Plot the importance null distributions and observed values for two of the predictors
plotNull(ozone.rp, preds = c("Solar.R", "Month"))


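# A classification model classifying cars as manual or automatic transmission
data(mtcars)

# Convert the 0/1 transmission indicator to a factor so a classification forest is built
am.rp <- rfPermute(factor(am) ~ ., mtcars, ntree = 100, num.rep = 50)
summary(am.rp)

# Plot the scaled importance distributions for significant predictors only
plotImportance(am.rp, scale = TRUE, sig.only = TRUE)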
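
The Usage section also lists as.randomForest and a predict method, which the examples above do not exercise. The following is a minimal sketch of how they might be called, based only on the signatures shown in Usage. The object names (aq, ozone.rp2, ozone.rf, pred) and the small num.rep are illustrative, and newdata is assumed to be passed through to predict.randomForest.

```r
library(rfPermute)

# Fit on complete cases so the same rows can be reused for prediction
data(airquality)
aq <- na.omit(airquality)
ozone.rp2 <- rfPermute(Ozone ~ ., data = aq, ntree = 100, num.rep = 10)

# Convert to a plain randomForest object for use with randomForest tools
ozone.rf <- as.randomForest(ozone.rp2)
class(ozone.rf)

# Predict from the rfPermute object; newdata is assumed to be forwarded
# to predict.randomForest through '...'
pred <- predict(ozone.rp2, newdata = aq[1:5, ])
pred
```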




rfPermute documentation built on Aug. 24, 2023, 1:08 a.m.