iForest: iForest


View source: R/IsolationForest.R

Description

Build an Isolation Forest of completely random trees

Usage

iForest(X, nt = 100, phi = 256, seed = 1234,
  replace_missing = TRUE, sentinel = -9999999999, ncolsample = NULL)

Arguments

X

a matrix or data.frame of numeric or factor values

nt

the number of trees in the ensemble

phi

the number of samples to draw without replacement to construct each tree

seed

random seed to ensure creation of reproducible forests

replace_missing

if TRUE, replaces missing factor levels with "." and missing numeric values with the sentinel argument

sentinel

value to use as stand-in for missing numeric values

ncolsample

the number of features to subsample for each tree. If NULL (the default), no feature subsampling is performed. See Details for more information.
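The effect of replace_missing and sentinel can be illustrated with a minimal base R sketch. The helper name replace_missing_sketch is hypothetical and is not part of the package; it only mirrors the behaviour described in the argument list above and assumes X is a data.frame.

```r
# Hypothetical helper, not the package's implementation: mimics the documented
# behaviour of replace_missing = TRUE on a data.frame.
replace_missing_sketch <- function(X, sentinel = -9999999999) {
  X[] <- lapply(X, function(col) {
    if (is.factor(col)) {
      if (any(is.na(col))) {
        # Add "." as a level, then assign it to the missing entries.
        levels(col) <- union(levels(col), ".")
        col[is.na(col)] <- "."
      }
    } else {
      # Numeric columns: stand-in value for missing entries.
      col[is.na(col)] <- sentinel
    }
    col
  })
  X
}

df <- data.frame(a = c(1, NA), b = factor(c("x", NA)))
out <- replace_missing_sketch(df)
```

A sentinel far outside the observed range of the data keeps the replaced values on one side of nearly every random split.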

Details

An Isolation Forest is an unsupervised anomaly detection algorithm. It builds the requested number of trees, nt, completely at random, each on a subsample of size phi. At each node, a variable is selected at random. If the variable is numeric, a random split point is chosen from its range; if it is a factor, a random sample of its levels is chosen instead.

Records from X are then filtered by the split criterion, and tree building recurses on the left and right subsets of the data. Tree building terminates when the maximum tree depth is reached or when the filtered subset contains one or fewer observations.
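The tree-building procedure described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: build_tree is a hypothetical name, X is assumed to be a data.frame without missing values, constant columns are skipped, and the depth limit of ceiling(log2(phi)) follows the convention in the referenced paper rather than anything stated on this page.

```r
# Hypothetical sketch of one isolation tree; not the isofor internals.
build_tree <- function(X, depth = 0L, max_depth = 8L) {
  # Stop at the depth limit or when one or fewer observations remain.
  if (depth >= max_depth || nrow(X) <= 1L) {
    return(list(leaf = TRUE, size = nrow(X)))
  }
  # Assumption: only columns that still vary are candidates for a split.
  varies <- vapply(X, function(col) length(unique(col)) > 1L, logical(1))
  splittable <- names(X)[varies]
  if (length(splittable) == 0L) {
    return(list(leaf = TRUE, size = nrow(X)))
  }
  var <- splittable[sample.int(length(splittable), 1L)]
  x <- X[[var]]
  if (is.factor(x)) {
    # Factor: send a random, non-empty, proper subset of observed levels left.
    lv <- as.character(unique(x))
    k <- sample.int(length(lv) - 1L, 1L)
    split <- lv[sample.int(length(lv), k)]
    go_left <- x %in% split
  } else {
    # Numeric: random split point drawn from the observed range.
    split <- runif(1, min(x), max(x))
    go_left <- x < split
  }
  list(leaf = FALSE, var = var, split = split,
       left  = build_tree(X[go_left, , drop = FALSE], depth + 1L, max_depth),
       right = build_tree(X[!go_left, , drop = FALSE], depth + 1L, max_depth))
}

set.seed(1234)
phi <- 16
sub <- iris[sample(nrow(iris), phi), ]  # subsample without replacement
tree <- build_tree(sub, max_depth = ceiling(log2(phi)))
```

Observations that end up alone in a shallow leaf were easy to isolate, which is the intuition behind treating them as anomalies.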

If ncolsample is not NULL, the algorithm subsamples ncolsample features to construct each tree. The features are not sampled at random. Instead, each column is scored with a measure appropriate to its class: kurtosis for numeric columns and Shannon entropy for factors. The measure is first applied to all columns, the scores are then sorted in decreasing order, and the first ncolsample columns of this ordering are taken.
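The column ranking described above can be sketched as follows. The function names kurtosis, shannon_entropy and rank_columns are hypothetical, the moment-based kurtosis estimator and the base-2 logarithm are assumptions, and pooling numeric and factor scores into a single ordering is one reading of the text rather than confirmed package behaviour.

```r
# Hypothetical helpers; the exact estimators used by the package may differ.
kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  # Fourth central moment over the squared second central moment.
  mean((x - m)^4) / mean((x - m)^2)^2
}

shannon_entropy <- function(f) {
  counts <- as.numeric(table(f))  # table() drops NA by default
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log2(p))
}

rank_columns <- function(X, ncolsample) {
  # Score every column with the measure that matches its class.
  score <- vapply(X, function(col) {
    if (is.factor(col)) shannon_entropy(col) else kurtosis(col)
  }, numeric(1))
  # Sort in decreasing order and keep the first ncolsample names.
  keep <- seq_len(min(ncolsample, length(score)))
  names(sort(score, decreasing = TRUE))[keep]
}

selected <- rank_columns(iris, 3)
```

Because the ranking is deterministic for a given X, a procedure like this would pick the same columns on every call.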

Value

an iForest object

References

F. T. Liu, K. M. Ting, Z.-H. Zhou, "Isolation-based anomaly detection", ACM Trans. Knowl. Discov. Data, vol. 6, no. 1, pp. 3:1-3:39, Mar. 2012.

Examples

library(isofor)
# Fit a forest of 100 trees, each built on a subsample of 16 rows of iris
mod1 <- iForest(iris, phi = 16, nt = 100)

Zelazny7/isofor documentation built on Aug. 28, 2019, 7:12 p.m.