rf.post.hoc: RF Post-Hoc Test


View source: R/rf.post.hoc.R

Description

Test for false positives in RF prediction.

Usage

rf.post.hoc(rf, newdata)

Arguments

rf

Object of class randomForest.

newdata

Data frame to predict with the RF model.

Details

RF post-hoc test: takes an object created with randomForest and a new data set to predict. From the votes matrix of the RF object, only the correct assignments are kept, and for each predicted class a beta distribution is fitted to the probability of assignment (POA) to the correct class using fitdistr() from MASS. If fitdistr() fails to estimate the beta parameters, the non-parametric empirical cumulative distribution function (ecdf) is used instead. Then, for each observation in newdata, the probability that its POA to the winning class (p.assign) belongs to the beta distribution (or ecdf) of that class in the training data set is calculated (p.post.hoc). Use p.post.hoc to test the hypothesis that the predicted POA belongs to the trained POA distribution. A low p.post.hoc (e.g. <= 0.05) indicates a possible misclassification of that observation.
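
The core of the procedure described above can be sketched for a single class as follows. This is a minimal illustration, not the package's implementation: the helper name poa_tail_prob is hypothetical, the training POA values are simulated instead of taken from a real votes matrix, and the use of the lower tail of the fitted distribution as the probability is an assumption.

```r
library(MASS)  # provides fitdistr()

# Hypothetical helper (not part of RFtools): given the POA values of the
# correctly assigned training observations of one class, return the
# lower-tail probability of each new POA value under that distribution.
poa_tail_prob <- function(train_poa, new_poa) {
  # Try to fit a beta distribution; return NULL if the fit fails
  fit <- tryCatch(
    suppressWarnings(
      fitdistr(train_poa, "beta", start = list(shape1 = 2, shape2 = 2))
    ),
    error = function(e) NULL
  )
  if (is.null(fit)) {
    # Fallback: non-parametric empirical cumulative distribution function
    ecdf(train_poa)(new_poa)
  } else {
    pbeta(new_poa, fit$estimate[["shape1"]], fit$estimate[["shape2"]])
  }
}

set.seed(42)
# Simulated POA values for one class (assumption: stands in for the
# vote fractions of correctly assigned training observations)
train_poa <- rbeta(200, shape1 = 8, shape2 = 2)

# POA to the winning class for three new observations
new_poa <- c(0.95, 0.80, 0.35)
p <- poa_tail_prob(train_poa, new_poa)
print(round(p, 4))
```

In this sketch the third observation, with a POA far below the training values, receives a probability below 0.05 and would be flagged as a possible misclassification. Vote fractions of exactly 0 or 1 lie on the boundary of the beta distribution's support, so the beta fit may fail on such data, which is the case the ecdf fallback covers.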

The methods plot, print and summary are available.

Value

An object of class RF.ph: a list with the results of the post-hoc test.

Author(s)

Pedro Martinez Arbizu & Sven Rossel

References

Rossel, S. & P. Martinez Arbizu (2018) Automatic specimen identification of Harpacticoids (Crustacea:Copepoda) using Random Forest and MALDI-TOF mass spectra, including a post hoc test for false positive discovery. Methods in Ecology and Evolution, 9(6):1421-1434.

https://doi.org/10.1111/2041-210X.13000

See Also

add.null.class smooth.data robust.test plot.RFPH

Examples

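# Example with the maldi data set
library(RFtools)       # provides rf.post.hoc() and the maldi data
library(randomForest)
data(maldi)
unique(maldi$species)
# Hold out one species so that it is unknown to the model
maldi_train <- maldi[maldi$species != 'Cletodes limicola', ]
maldi_test <- maldi[maldi$species == 'Cletodes limicola', ]
# Drop the held-out species from the factor levels
maldi_train$species <- factor(maldi_train$species)
# Fit the forest on the training data without its first column
rf <- randomForest(species ~ ., data = maldi_train[-1])
# Run the post-hoc test on the held-out species
ph <- rf.post.hoc(rf, maldi_test)
plot(ph)
plot(ph, 'Tachidius discipes')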

pmartinezarbizu/RFtools documentation built on March 10, 2021, 12:11 p.m.