outForest: Multivariate Outlier Detection and Replacement

Provides a random-forest-based implementation of the method described in Chapter 7.1.2 (Regression model based anomaly detection) of Chandola et al. (2009) <doi:10.1145/1541880.1541882>. It works as follows: each numeric variable is regressed onto all other variables by a random forest. If the scaled absolute difference between the observed value and the out-of-bag prediction of the corresponding random forest is suspiciously large, the value is considered an outlier. The package offers different options to replace such outliers, e.g. by realistic values found via predictive mean matching. Once the method is trained on reference data, it can be applied to new data.
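The workflow described above can be sketched as follows. This is a minimal illustration, not the definitive usage: it assumes the package exports outForest(), outliers(), Data() and generateOutliers(), provides a predict() method for fitted objects, and accepts the arguments replace, allow_predictions, seed and verbose. The iris data and the seed values are chosen only for illustration.

```r
library(outForest)

# Assumption: generateOutliers() injects artificial outliers into numeric columns.
iris_out <- generateOutliers(iris, p = 0.02, seed = 345)

# Fit one random forest per numeric variable and flag suspicious values.
# replace = "pmm" asks for replacement by predictive mean matching;
# allow_predictions = TRUE is assumed to keep the forests for later use on new data.
fit <- outForest(iris_out, replace = "pmm", allow_predictions = TRUE,
                 seed = 1, verbose = 0)

# Table of detected outliers (one row per flagged cell).
found <- outliers(fit)
print(head(found))

# Data set with flagged values replaced.
cleaned <- Data(fit)

# Apply the fitted method to new data (here the original iris, for illustration).
fit_new <- predict(fit, newdata = iris)
print(head(outliers(fit_new)))
```

Fitting on reference data and then calling predict() mirrors the last sentence of the description: thresholds and forests come from the reference data, and new observations are only scored against them.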

Package details

Author: Michael Mayer [aut, cre]
Maintainer: Michael Mayer <mayermichael79@gmail.com>
License: GPL (>= 2)
URL: https://github.com/mayer79/outForest
Package repository: CRAN
Installation

Install the latest version of this package by entering the following in R:
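For a package hosted on CRAN, the standard base R call is the one below. It assumes a CRAN mirror is configured in the current R session.

```r
# Install the released version of outForest from CRAN
install.packages("outForest")
```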


outForest documentation built on Jan. 7, 2021, 9:10 a.m.