rfOutliers: Random forest based outlier detection


View source: R/rfVisualize.R

Description

Detects training cases that differ from all other cases, based on the random forest instance proximity measure.

Usage

rfOutliers(model, dataset)

Arguments

model

a random forest model returned by CoreModel

dataset

a training set used to generate the model

Details

The strangeness of a case is computed from the random forest model via its proximity matrix (see rfProximity). If the strangeness score of a case is greater than 10, the case can be considered an outlier according to Breiman (2001).
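The threshold rule above can be sketched as follows. This is a minimal illustration rather than part of the package: the names `strangeness`, `threshold` and `flagged` are chosen here for the example, and taking the absolute value of the scores is an assumption that mirrors the `plot(abs(outliers))` call in the Examples section.

```r
library(CORElearn)

# Build a small random forest on iris, as in the Examples section
dataset <- iris
md <- CoreModel(Species ~ ., dataset, model="rf", rfNoTrees=30,
                maxThreads=1)

# One strangeness score per training case
strangeness <- rfOutliers(md, dataset)

# Threshold of 10 taken from the Details text; abs() is an assumption
threshold <- 10
flagged <- which(abs(strangeness) > threshold)

# Inspect the flagged cases; the result may be empty for clean data
print(dataset[flagged, ])

destroyModels(md) # clean up
```

Because the forest is built with random sampling, the set of flagged cases may vary between runs.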

Value

For each instance in the dataset, the function returns a numeric score of its strangeness relative to the other cases.

Author(s)

John Adeyanju Alao (as a part of his BSc thesis) and Marko Robnik-Sikonja (thesis supervisor)

References

Leo Breiman: Random Forests. Machine Learning Journal, 45:5-32, 2001

See Also

CoreModel, rfProximity, rfClustering.

Examples

library(CORElearn)

# first build a random forest model using CORElearn
dataset <- iris
md <- CoreModel(Species ~ ., dataset, model="rf", rfNoTrees=30,
                maxThreads=1)

# compute the strangeness score of each training case and plot its magnitude
outliers <- rfOutliers(md, dataset)
plot(abs(outliers))

# for a nicer display, try
plot(md, dataset, rfGraphType="outliers")

destroyModels(md) # clean up

CORElearn documentation built on March 23, 2021, 9:07 a.m.