rf.significance: Random Forest model significance test

rf.significance    R Documentation

Random Forest model significance test

Description

Performs a permutation-based significance test for classification and regression Random Forests models.

Usage

rf.significance(x, nperm = 999, randomization = 1, kappa = FALSE)

Arguments

x

randomForest class object

nperm

Number of permutations

randomization

Fraction (0.01-1) of randomization, default is 1 (total randomization)

kappa

(FALSE/TRUE) In classification, use kappa rather than percent correctly classified
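The choice between kappa and percent correctly classified matters most when classes are imbalanced. The following is a minimal base R sketch, not code from rfUtilities: the helpers `pcc` and `cohen_kappa` and the confusion matrix are hypothetical and serve only to show how the two statistics can disagree.

```r
# Hypothetical helpers for illustration; not part of rfUtilities.

# Percent correctly classified: share of observations on the diagonal
pcc <- function(cm) sum(diag(cm)) / sum(cm)

# Cohen's kappa: observed agreement corrected for chance agreement
cohen_kappa <- function(cm) {
  n  <- sum(cm)
  po <- sum(diag(cm)) / n                      # observed agreement
  pe <- sum(rowSums(cm) * colSums(cm)) / n^2   # agreement expected by chance
  (po - pe) / (1 - pe)
}

# Made-up, imbalanced confusion matrix (rows = observed, columns = predicted)
cm <- matrix(c(90, 5,
                5, 0),
             nrow = 2, byrow = TRUE,
             dimnames = list(observed = c("0", "1"), predicted = c("0", "1")))

pcc(cm)           # 0.9
cohen_kappa(cm)   # about -0.053
```

Here the model never predicts the minority class correctly, yet 90 percent of observations are classified correctly; kappa falls below zero because that agreement is no better than chance.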

Details

A small p-value suggests that the difference between the two populations (the fit of the "true" model and the fits obtained under randomization) is significant.

alternative = c("two.sided", "less", "greater")
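The general idea of a permutation test with a randomization fraction can be sketched in base R. This is an illustration under stated assumptions, not the implementation used by rf.significance: a linear model stands in for the Random Forests model so that the example needs no extra packages, and `permute_fraction` and `r2` are hypothetical helpers. The exact p-value formula the function uses is not stated in this documentation; the one below is the common one-sided form with a +1 correction.

```r
# Hypothetical helper: shuffle a fraction `frac` of the response values,
# leaving the remaining values in place (frac = 1 is a full permutation).
permute_fraction <- function(y, frac = 1) {
  idx <- sample(length(y), size = ceiling(frac * length(y)))
  y[idx] <- y[idx[sample.int(length(idx))]]
  y
}

set.seed(42)
aq <- na.omit(airquality)

# Stand-in model (assumption): R-square of a linear model instead of a forest
r2 <- function(y, data) {
  summary(lm(y ~ Solar.R + Wind + Temp, data = data))$r.squared
}

obs_r2  <- r2(aq$Ozone, aq)   # fit of the "true" model
nperm   <- 199
rand_r2 <- replicate(nperm, r2(permute_fraction(aq$Ozone, frac = 1), aq))

# One-sided permutation p-value: share of randomized fits at least as good
# as the observed fit, with the usual +1 correction
p_value <- (sum(rand_r2 >= obs_r2) + 1) / (nperm + 1)
p_value
```

Lowering `frac` leaves part of the response intact, so the randomized fits move closer to the observed fit and the test becomes more conservative.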

Value

A list class object with the following components:

For Regression problems:

  • RandR.square Vector of random R-square values

  • R.square The R-square of the "true" model

  • p.value p-values of randomizations of R-square

  • ks.p.value p-value(s) of the Kolmogorov-Smirnov test

  • nPerm number of permutations

  • rf.type Type of Random Forests

  • rand.frac Randomization fraction

For Classification problems:

  • RandOOB Vector of random out-of-bag (OOB) values

  • RandMaxError Maximum error of randomizations

  • test.OOB OOB error of the "true" model

  • test.MaxError Maximum class OOB error of the "true" model

  • p.value p-value based on McNemar's test

  • oop.p.value p-value based on permutation of OOB error

  • nPerm Number of permutations

  • rf.type Type of Random Forests

  • rand.frac Randomization fraction
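For readers unfamiliar with McNemar's test, the sketch below shows how it compares two paired sets of classification outcomes using `mcnemar.test` from base R's stats package. It is only an illustration: the data are simulated, the variable names are hypothetical, and how rf.significance builds its own contingency table is not described in this documentation.

```r
set.seed(1)
n <- 200

# Simulated (made-up) per-observation outcomes: TRUE = classified correctly
model_correct <- rbinom(n, 1, 0.85) == 1   # stand-in for the "true" model
rand_correct  <- rbinom(n, 1, 0.50) == 1   # stand-in for a randomized model

# Paired 2 x 2 table; fixed factor levels guarantee both rows and columns exist
tab <- table(
  model  = factor(model_correct, levels = c(TRUE, FALSE)),
  random = factor(rand_correct,  levels = c(TRUE, FALSE))
)

# McNemar's test uses only the discordant cells (one correct, the other wrong)
mt <- mcnemar.test(tab)
tab
mt$p.value
```

A small p-value here indicates that the two sets of outcomes disagree asymmetrically, i.e., one classifier is correct where the other is wrong far more often than the reverse.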

Note

Note that previous versions of this function required xdata and "..." arguments, which are no longer necessary. The data and arguments used to fit the original model are now obtained from the model object.

Author(s)

Jeffrey S. Evans <jeffrey_evans@tnc.org>

References

Murphy M.A., J.S. Evans, and A.S. Storfer (2010). Quantifying Bufo boreas connectivity in Yellowstone National Park with landscape genetics. Ecology 91:252-261.

Evans J.S., M.A. Murphy, Z.A. Holden, and S.A. Cushman (2011). Modeling species distribution and change using Random Forests. Chapter 8 in Predictive Modeling in Landscape Ecology, eds. Drew C.A., Huettmann F., Wiersma Y. Springer.

Examples

## Not run: 
library(rfUtilities)
library(randomForest)
library(ranger)

#### Regression
set.seed(1234)
data(airquality)
airquality <- na.omit(airquality)

# randomForest: response is column 1 (Ozone), predictors are columns 2-6
( rf.mdl <- randomForest(x = airquality[,2:6], y = airquality[,1]) )
( rf.perm <- rf.significance(rf.mdl, nperm = 99) )

# ranger
( rf.mdl <- ranger(x = airquality[,2:6], y = airquality[,1]) )
( rf.perm <- rf.significance(rf.mdl, nperm = 99) )

#### Classification
# Binary response: 0 if Ozone < 40, otherwise 1
ydata <- as.factor(ifelse(airquality[,1] < 40, 0, 1))

# ranger
( rf.mdl <- ranger(x = airquality[,2:6], y = ydata) )
( rf.perm <- rf.significance(rf.mdl, nperm = 99) )

# randomForest
( rf.mdl <- randomForest(x = airquality[,2:6], y = ydata) )
( rf.perm <- rf.significance(rf.mdl, nperm = 99) )

# Multiclass response (iris species)
set.seed(1234)
data(iris)
iris$Species <- as.factor(iris$Species)

# randomForest
( rf.mdl <- randomForest(x = iris[,1:4], y = iris[,"Species"]) )
( rf.perm <- rf.significance(rf.mdl, nperm = 99) )

# ranger
( rf.mdl <- ranger(x = iris[,1:4], y = iris[,"Species"]) )
( rf.perm <- rf.significance(rf.mdl, nperm = 99) )

## End(Not run)


jeffreyevans/rfUtilities documentation built on Nov. 12, 2023, 6:52 p.m.