outliers-methods: Compute Outlier Scores


Description

Compute outlier scores for each class of examples used to train a random forest. Outliers are defined as examples whose proximities to other examples in the same class are small.
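The definition above can be illustrated with a minimal base R sketch of the outlier measure described by Breiman and Cutler (see References). It is not necessarily how bigrf computes its scores. The helper name `outlier_sketch`, the toy proximity matrix, the exclusion of each example's proximity to itself, and the guards against division by zero are all assumptions made for illustration.

```r
# Hypothetical helper, not part of bigrf. `prox` is an n x n proximity
# matrix and `cls` gives the class of each of the n training examples.
outlier_sketch <- function(prox, cls) {
    stopifnot(is.matrix(prox), nrow(prox) == ncol(prox),
              length(cls) == nrow(prox))
    n <- nrow(prox)
    cls <- factor(cls)
    scores <- numeric(n)
    for (lvl in levels(cls)) {
        idx <- which(cls == lvl)
        p <- prox[idx, idx, drop=FALSE]
        diag(p) <- 0  # assumption: ignore each example's proximity to itself
        # Raw measure: n over the sum of squared within-class proximities.
        sumsq <- pmax(rowSums(p^2), .Machine$double.eps)
        raw <- n / sumsq
        # Standardize within the class: subtract the median and divide by
        # the mean absolute deviation from the median.
        centred <- raw - median(raw)
        dev <- mean(abs(centred))
        scores[idx] <- if (dev > 0) centred / dev else 0
    }
    scores
}

# Toy data: examples 1-3 are close to one another, while example 4 has
# low proximity to the rest of class "a".
cls <- c("a", "a", "a", "a", "b", "b", "b")
prox <- matrix(0.1, 7, 7)
prox[1:3, 1:3] <- 0.9
prox[5:7, 5:7] <- 0.8
diag(prox) <- 1
scores <- outlier_sketch(prox, cls)
```

In this toy example, example 4 receives the highest score because its proximities to the other class "a" examples are small.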

Usage

## S4 method for signature 'bigcforest'
outliers(forest, trace=0L)

Arguments

forest

A random forest of class "bigcforest".

trace

Verbosity level: 0 for no verbose output, 1 to print verbose output. Default: 0.

Value

A numeric vector containing the outlier scores for each training example. Higher scores indicate greater dissimilarity from other training examples in the same class.
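Because the result is a plain numeric vector with one entry per training example, it can be ranked with ordinary base R tools. The sketch below uses made-up scores and class labels (they are not output from bigrf) to show one way of finding the most outlying example in each class and of ordering all examples by score.

```r
# Made-up scores and class labels, for illustration only.
scores <- c(0.2, 7.5, 0.0, 1.1, 12.3, 0.4)
cls <- factor(c("Small", "Small", "Small", "Van", "Van", "Van"))

# Index of the highest-scoring example within each class.
top_by_class <- tapply(seq_along(scores), cls,
    function(i) i[which.max(scores[i])])

# All examples ordered from most to least outlying.
ranked <- order(scores, decreasing=TRUE)
```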

Methods

signature(forest = "bigcforest")

Compute outlier scores for a classification random forest.

References

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.

Breiman, L. & Cutler, A. (n.d.). Random Forests. Retrieved from http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm.

Examples

# Classify cars in the Cars93 data set by type (Compact, Large,
# Midsize, Small, Sporty, or Van).

# Load data.
data(Cars93, package="MASS")
x <- Cars93
y <- Cars93$Type

# Select variables with which to train model.
vars <- c(4:22)

# Run model, grow 30 trees.
forest <- bigrfc(x, y, ntree=30L, varselect=vars, cachepath=NULL)

# Calculate proximity matrix and scaling co-ordinates, and plot
# them.
prox <- proximities(forest, cachepath=NULL)
scale <- scaling(prox)
plot(scale, col=as.integer(y) + 2, pch=as.integer(y) + 2)

# Calculate outlier scores, and circle the top 20% of them in
# red.
outscores <- outliers(forest)
points(scale[outscores > quantile(outscores, probs=0.8), ],
    col=2, pch=1, cex=1.5)

aloysius-lim/bigrf documentation built on May 11, 2019, 11:20 p.m.