iForest: iForest
In Zelazny7/isofor: Isolation Forest Anomaly Detection


View source: R/IsolationForest.R

Build an Isolation Forest of completely random trees

iForest(X, nt = 100, phi = 256, seed = 1234, replace_missing = TRUE, sentinel = -9999999999, ncolsample = NULL)

`X`	a matrix or data.frame of numeric or factor values
`nt`	the number of trees in the ensemble
`phi`	the number of samples to draw without replacement to construct each tree
`seed`	random seed to ensure creation of reproducible forests
`replace_missing`	if TRUE, replaces missing factor levels with "." and missing numeric values with the `sentinel` argument
`sentinel`	value to use as stand-in for missing numeric values
`ncolsample`	number of features to subsample for each tree. If NULL (the default), no feature subsampling is done. See details for more information.
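The effect of `replace_missing` and `sentinel` can be illustrated in base R. This is a minimal sketch of the documented behaviour only; `replace_missing_sketch` is a hypothetical helper and is not part of isofor, whose internal implementation may differ.

```r
# Hypothetical helper (not an isofor function): mimics what the docs say
# replace_missing = TRUE does to a data.frame.
replace_missing_sketch <- function(X, sentinel = -9999999999) {
  for (nm in names(X)) {
    col <- X[[nm]]
    if (is.factor(col) && any(is.na(col))) {
      # add "." as an explicit level, then assign it to the missing entries
      col <- factor(as.character(col), levels = union(levels(col), "."))
      col[is.na(col)] <- "."
    } else if (is.numeric(col)) {
      # missing numeric values become the sentinel
      col[is.na(col)] <- sentinel
    }
    X[[nm]] <- col
  }
  X
}

df <- data.frame(x = c(1.5, NA, 3), f = factor(c("a", NA, "b")))
out <- replace_missing_sketch(df)
```

Because the default sentinel lies far below typical data values, a random split on that column will tend to separate the formerly missing records from the rest quickly.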

An Isolation Forest is an unsupervised anomaly detection algorithm. The requested number of trees, nt, is built completely at random, each on a subsample of size phi. At each node a variable is selected at random. If the variable is numeric, a random split point is chosen from within its range. If the variable is a factor, a random sample of its levels is chosen instead.
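The split choice at a single node can be sketched as follows. `random_split` is a hypothetical helper written for illustration, not isofor's internal code; in particular, sending the sampled factor levels to the left branch is an assumption.

```r
# Hypothetical helper (not an isofor function): pick a random split for
# one variable, numeric or factor.
random_split <- function(x) {
  if (is.factor(x)) {
    lv <- levels(droplevels(x))
    # choose a random, non-empty subset of the observed levels
    k <- sample.int(max(length(lv) - 1L, 1L), 1L)
    left_levels <- lv[sample.int(length(lv), k)]
    list(type = "factor", left_levels = left_levels,
         go_left = x %in% left_levels)
  } else {
    # choose a random cut point within the observed range
    cut <- runif(1, min(x), max(x))
    list(type = "numeric", cut = cut, go_left = x < cut)
  }
}

set.seed(1234)
num <- random_split(c(2.5, 7.1, 3.3, 9.8))
fac <- random_split(factor(c("a", "b", "c", "a")))
```

In both cases `go_left` is a logical vector that partitions the records at the node into the two child subsets.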

Records from X are then filtered based on the split criterion, and tree building begins again on the left and right subsets of the data. Tree building terminates when the maximum depth of the tree is reached or when one or fewer observations remain in the filtered subset.
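The recursion and its stopping rules can be sketched in base R. This is an illustration under stated assumptions, not isofor's implementation: `grow_tree` and `collect_leaves` are hypothetical names, only numeric columns are handled, and the height limit of ceiling(log2(phi)) is taken from Liu et al. rather than from this page, which does not say how the maximum depth is set.

```r
# Hypothetical sketch (not isofor code). Assumes a numeric matrix and a
# height limit of ceiling(log2(n)) for a subsample of n rows.
grow_tree <- function(X, depth = 0L, max_depth = ceiling(log2(nrow(X)))) {
  n <- nrow(X)
  # stop at the depth limit or when 1 or fewer observations remain
  if (depth >= max_depth || n <= 1L) {
    return(list(leaf = TRUE, size = n, depth = depth))
  }
  j <- sample.int(ncol(X), 1L)          # random variable
  rng <- range(X[, j])
  if (rng[1] == rng[2]) {
    # simplification: a constant column cannot be split, so stop here
    return(list(leaf = TRUE, size = n, depth = depth))
  }
  cut <- runif(1, rng[1], rng[2])       # random split within the range
  left <- X[, j] < cut
  list(leaf = FALSE, var = j, cut = cut,
       left  = grow_tree(X[left, , drop = FALSE], depth + 1L, max_depth),
       right = grow_tree(X[!left, , drop = FALSE], depth + 1L, max_depth))
}

# Gather one field from every leaf of a tree
collect_leaves <- function(node, field) {
  if (node$leaf) return(node[[field]])
  c(collect_leaves(node$left, field), collect_leaves(node$right, field))
}

set.seed(1234)
X <- matrix(rnorm(1000 * 3), ncol = 3)
phi <- 64
sub <- X[sample.int(nrow(X), phi), , drop = FALSE]  # phi rows, no replacement
tree <- grow_tree(sub)
```

Every record in the subsample ends up in exactly one leaf, so `sum(collect_leaves(tree, "size"))` equals `phi`. Records that land in shallow leaves were isolated with few splits, which is the signal the forest uses to flag anomalies.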

If ncolsample is not NULL, the algorithm subsamples ncolsample features to construct each tree. The features are not sampled at random. Instead, every column is scored with a measure appropriate to its class: kurtosis for numeric fields and Shannon entropy for factors. The columns are then sorted by score in decreasing order, and the first ncolsample columns of this ordering are used.
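The column ranking described above can be sketched in base R. All three functions are hypothetical helpers, not isofor functions. The page does not state which kurtosis estimator is used, so the plain moment-based version is assumed, and kurtosis and entropy scores are pooled into a single ordering, which is how the description reads literally.

```r
# Hypothetical helpers (not isofor functions).
kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2   # moment-based kurtosis (assumed)
}

shannon_entropy <- function(f) {
  p <- table(f)
  p <- p[p > 0] / sum(p)                # observed level proportions
  -sum(p * log2(p))
}

rank_columns <- function(X, ncolsample) {
  # score each column with the measure that matches its class
  score <- vapply(X, function(col) {
    if (is.factor(col)) shannon_entropy(col) else kurtosis(col)
  }, numeric(1))
  # sort in decreasing order and keep the first ncolsample names
  names(sort(score, decreasing = TRUE))[seq_len(min(ncolsample, length(score)))]
}

df <- data.frame(
  spiky = c(rep(0, 98), 10, -10),       # two extreme values: kurtosis 50
  flat  = as.numeric(1:100),            # evenly spread: kurtosis about 1.8
  grp   = factor(rep(c("a", "b"), 50))  # two balanced levels: entropy 1
)
rank_columns(df, 2)  # "spiky" "flat"
```

High kurtosis favours columns with heavy tails, where isolated extreme values are more likely to appear.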

Returns an iForest object.

F. T. Liu, K. M. Ting, Z.-H. Zhou, "Isolation-based anomaly detection", ACM Trans. Knowl. Discov. Data, vol. 6, no. 1, pp. 3:1-3:39, Mar. 2012.