Performs an isotonic regression calibration of posterior probability to minimize log loss.
probability.calibration(y, p, regularization = FALSE)
y | Binomial response variable used to fit model
p | Estimated probabilities from fit model
regularization | (FALSE/TRUE) Should regularization be performed on the probabilities? (see Note)
Returns a vector of calibrated probabilities.
Isotonic calibration can correct for monotonic distortions.
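As an illustration of what an isotonic calibration does, here is a minimal sketch built on base R's stats::isoreg(). The helper name iso_calibrate and the toy data are hypothetical, and this is not necessarily how probability.calibration is implemented internally.

```r
# Hypothetical helper: isotonic calibration using base R only.
# A sketch of the technique, not the package's source code.
iso_calibrate <- function(y, p) {
  fit  <- stats::isoreg(p, y)     # non-decreasing least-squares fit of y on p
  step <- stats::as.stepfun(fit)  # step function defined by the fitted knots
  step(p)                         # calibrated value for each input probability
}

# Toy data: scores that rank the classes fairly well but are squeezed toward 0.5
y <- c(0, 0, 0, 1, 0, 1, 1, 1)
p <- c(0.40, 0.42, 0.45, 0.48, 0.50, 0.55, 0.58, 0.60)

p_cal <- iso_calibrate(y, p)
print(cbind(p, p_cal))
```

Because the fit is constrained to be non-decreasing in p, the ranking of observations is preserved (up to ties) while the scale of the probabilities is adjusted.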
Regularization defines new minimum and maximum bounds for the probabilities using:
pmax = (n1 + 1) / (n1 + 2), pmin = 1 / (n0 + 2); where n1 = number of prevalence (y = 1) values and n0 = number of null (y = 0) values
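The bounds in this note can be computed directly from the response vector. The sketch below uses a hypothetical helper, calibration_bounds, and then clamps probabilities to the bounds. Clamping is an assumption made for illustration; the note does not state exactly how the function applies the bounds.

```r
# Hypothetical helper: regularization bounds as given in the Note.
calibration_bounds <- function(y) {
  n1 <- sum(y == 1)  # number of prevalence (positive) values
  n0 <- sum(y == 0)  # number of null values
  c(lower = 1 / (n0 + 2), upper = (n1 + 1) / (n1 + 2))
}

y <- c(rep(1, 8), rep(0, 18))  # toy response: 8 positives, 18 nulls
b <- calibration_bounds(y)     # lower = 1/20, upper = 9/10

# Assumption: apply the bounds by clamping. Here pmin()/pmax() are base R's
# element-wise minimum/maximum, not the bounds named in the Note.
p <- c(0, 0.02, 0.5, 0.97, 1)
p_reg <- pmin(pmax(p, b[["lower"]]), b[["upper"]])
print(p_reg)
```

Keeping probabilities away from exactly 0 and 1 avoids infinite log loss when a confident prediction turns out to be wrong.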
Jeffrey S. Evans <jeffrey_evans<at>tnc.org>
Platt, J. (1999) Probabilistic outputs for support vector machines and comparison to regularized likelihood methods. Advances in Large Margin Classifiers (pp. 61-74).
Niculescu-Mizil, A., & R. Caruana (2005) Obtaining calibrated probabilities from boosting. Proc. 21st Conference on Uncertainty in Artificial Intelligence (UAI 2005). AUAI Press.
library(randomForest)
data(iris)
iris$Species <- ifelse( iris$Species == "versicolor", 1, 0 )
# Add some noise
idx1 <- which(iris$Species %in% 1)
idx0 <- which( iris$Species %in% 0)
iris$Species[sample(idx1, 2)] <- 0
iris$Species[sample(idx0, 2)] <- 1
# Specify model
y = iris[,"Species"]
x = iris[,1:4]
set.seed(4364)
( rf.mdl <- randomForest(x=x, y=factor(y)) )
y.hat <- predict(rf.mdl, iris[,1:4], type="prob")[,2]
# Calibrate probabilities
calibrated.y.hat <- probability.calibration(y, y.hat, regularization = TRUE)
# Plot calibrated against original probability estimate
plot(density(y.hat), col="red", xlim=c(0,1), ylab="Density", xlab="probabilities",
     main="Calibrated probabilities" )
lines(density(calibrated.y.hat), col="blue")
legend("topright", legend=c("original","calibrated"),
       lty = c(1,1), col=c("red","blue"))
randomForest 4.6-14
Type rfNews() to see new features/changes/bug fixes.
Call:
randomForest(x = x, y = factor(y))
Type of random forest: classification
Number of trees: 500
No. of variables tried at each split: 2
OOB estimate of error rate: 6.67%
Confusion matrix:
   0  1 class.error
0 95  5        0.05
1  5 45        0.10
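Since the calibration is described as minimizing log loss, one way to check its effect is to compare log loss before and after calibration. The helper below, log_loss, is a hypothetical illustration and is not stated to be part of the package. It is shown on toy data so that it runs on its own.

```r
# Hypothetical helper: mean binomial log loss.
# Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
log_loss <- function(y, p, eps = 1e-15) {
  p <- pmin(pmax(p, eps), 1 - eps)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

y <- c(0, 0, 1, 1)
ll_flat  <- log_loss(y, c(0.5, 0.5, 0.5, 0.5))  # uninformative: equals log(2)
ll_sharp <- log_loss(y, c(0.1, 0.2, 0.8, 0.9))  # informative and correct: lower
print(c(flat = ll_flat, sharp = ll_sharp))
```

With the objects from the example above, the same comparison would be log_loss(y, y.hat) against log_loss(y, calibrated.y.hat).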