Test for false positives in RF prediction.
rf.post.hoc(rf, newdata)
rf Object of class randomForest.
newdata Data frame to predict using the RF model.
RF post-hoc test: takes an object created with randomForest and a new data set to predict. From the votes matrix of the RF object, only the correct assignments are kept and, for each predicted class, the beta distribution of the probability of assignment (POA) to the correct class is fitted using fitdistr() from MASS. If fitdistr() fails to estimate the beta parameters, the non-parametric empirical cumulative distribution function (ecdf) is used instead. Then, for each observation in newdata, the probability that the POA to the winner class (p.assign) belongs to the beta distribution (or ecdf) of that class in the training data set is calculated (p.post.hoc). Use p.post.hoc to reject the hypothesis that the predicted POA belongs to the trained POA distribution. A low p.post.hoc (e.g. <= 0.05) indicates a possible misclassification of that observation.
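The procedure described above can be sketched for a single class as follows. This is a minimal illustration, not the package's implementation: the helper name poa_tail_prob, the start values for the beta fit, and the use of the lower-tail probability (pbeta or the ecdf evaluated at the new POA) are all assumptions, and the training POAs are simulated rather than taken from a votes matrix.

```r
library(MASS)  # provides fitdistr()

# Hypothetical helper (not part of the package): probability of observing a
# POA as low as new_poa under the distribution of training POAs for one class.
poa_tail_prob <- function(train_poa, new_poa) {
  # Try a parametric beta fit first; start values are an assumption.
  fit <- tryCatch(
    suppressWarnings(
      fitdistr(train_poa, densfun = "beta",
               start = list(shape1 = 2, shape2 = 2))
    ),
    error = function(e) NULL
  )
  if (is.null(fit)) {
    # Fallback: non-parametric empirical cumulative distribution function
    return(ecdf(train_poa)(new_poa))
  }
  pbeta(new_poa, fit$estimate[["shape1"]], fit$estimate[["shape2"]])
}

set.seed(1)
# Simulated POAs of correctly assigned training observations for one class
train_poa <- rbeta(200, shape1 = 8, shape2 = 2)

# A low and a typical POA for two new observations predicted as this class
p <- poa_tail_prob(train_poa, c(0.30, 0.85))
print(p)
```

Under these assumptions, a new observation whose POA lies far below the training POAs gets a small probability and would be flagged as a possible false positive.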
Methods plot, print and summary are available.
An object of class RF.ph: a list with the results of the post-hoc test.
ecdf The empirical cumulative distribution function of the beta-distributed RF probability of assignment to the correct class.
post.hoc Dataframe with results of post-hoc test.
winner The predicted class.
POA Probability of assignment to the winner class.
p.post.hoc Probability that POA belongs to POA distribution of this class in the trained model.
sig Significance code: 0 ***, 0.001 **, 0.01 *, 0.05 ., > ns
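As a rough sketch of how the post.hoc data frame might be used, the snippet below builds a toy data frame with the columns listed above and keeps the rows at or below the 0.05 threshold mentioned in the details. The values are made up for illustration and the 0.05 cutoff is only the example threshold from the text, not a fixed rule.

```r
# Toy stand-in for the post.hoc data frame; all values are illustrative only.
post_hoc <- data.frame(
  winner     = c("Tachidius discipes", "Tachidius discipes", "Tachidius discipes"),
  POA        = c(0.92, 0.41, 0.88),
  p.post.hoc = c(0.640, 0.003, 0.510)
)

# Flag observations whose POA is unlikely under the trained POA distribution
alpha <- 0.05
flagged <- post_hoc[post_hoc$p.post.hoc <= alpha, ]
print(flagged)
```

With a real result, the same subsetting would presumably be applied to the post.hoc element of the returned list.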
Pedro Martinez Arbizu & Sven Rossel
Rossel, S. & P. Martinez Arbizu (2018) Automatic specimen identification of Harpacticoids (Crustacea:Copepoda) using Random Forest and MALDI-TOF mass spectra, including a post hoc test for false positive discovery. Methods in Ecology and Evolution, 9(6):1421-1434.
https://doi.org/10.1111/2041-210X.13000
add.null.class
smooth.data
robust.test
plot.RFPH
# Example with maldi data
data(maldi)
library(randomForest)
unique(maldi$species)
maldi_train <- maldi[maldi$species != 'Cletodes limicola',]
maldi_test <- maldi[maldi$species == 'Cletodes limicola',]
#exclude Cletodes limicola from factors
maldi_train$species <- factor(maldi_train$species)
rf <- randomForest(species ~ ., data = maldi_train[-1])
ph <- rf.post.hoc(rf,maldi_test)
plot(ph)
plot(ph,'Tachidius discipes')