A random forest based proximity function

Description

Random forest computes similarity between instances with classification of out-of-bag instances. If two out-of-bag cases are classified in the same tree leaf the proximity between them is incremented.

Usage

1
rfProximity(model, outProximity=TRUE)

Arguments

model

a CORElearn model of type random forest.

outProximity

if TRUE, function returns a proximity matrix, else it returns a distance matrix.

Details

A proximity is transformed into distance with expression distance=sqrt(1-proximity).

Value

Function returns an M by M matrix where M is the number of training instances. Returned matrix is used as an input to other function (see rfOutliers and rfClustering).

Author(s)

John Adeyanju Alao (as a part of his BSc thesis) and Marko Robnik-Sikonja (thesis supervisor)

References

Leo Breiman: Random Forests. Machine Learning Journal, 45:5-32, 2001

See Also

CoreModel, rfOutliers, cmdscale, rfClustering.

Examples

1
2
3
4
5
6
7
md <- CoreModel(Species ~ ., iris, model="rf", rfNoTrees=30, maxThreads=1)
pr <- rfProximity(md, outProximity=TRUE)
# visualization
require(lattice)
levelplot(pr)

destroyModels(md) # clean up

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.