dif | R Documentation |
The function builds a deep isolation forest that uses fuzzy logic to determine whether a record is anomalous or not.
The function takes a wide-format data.frame object as input and returns it with two appended vectors.
The first vector contains the anomaly scores as numbers between zero and one, and the second vector provides a set of logical values indicating whether the records are outliers (TRUE) or not (FALSE).
dif(dta, nt = 100L, nss = NULL, threshold = 0.95)
dta | A wide-format data.frame.
nt | Number of deep isolation trees to build to form the forest. By default, it is set to 100L.
nss | Number of subsamples used to build a single deep isolation tree. The default is NULL.
threshold | A number between zero and one used as a threshold when identifying outliers from the anomaly scores. By default, this argument is set to 0.95.
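As a minimal sketch of how the arguments above might be overridden, the call below grows a larger forest and lowers the threshold, leaving nss at its default. The values 200L and 0.9 are arbitrary illustrations, not recommendations, and the check on the output shape relies only on the statement that two vectors are appended to the input.

```r
# Sketch only: the argument values are arbitrary illustrations
library(HRTnomaly)
set.seed(2025L)

# Build 200 trees instead of the default 100 and use a threshold of 0.9
res <- dif(iris, nt = 200L, threshold = 0.9)

# The result should be the input plus two appended columns
dim(iris)
dim(res)
```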
The argument dta is provided as an object of class data.frame.
This object is treated as a wide-format data.frame.
The use of the R packages dplyr, purrr, and tidyr is highly recommended to simplify the conversion of datasets between long and wide formats.
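The long-to-wide conversion mentioned above can be sketched with tidyr as follows. The data frame `long` and its column names (id, variable, value) are hypothetical and serve only to illustrate the reshaping; they are not part of this package.

```r
library(tidyr)

# Hypothetical long-format data: one row per (record, variable) pair
long <- data.frame(
  id = rep(1:3, each = 2),
  variable = rep(c("height", "weight"), times = 3),
  value = c(170, 65, 182, 80, 158, 52)
)

# Long to wide: one row per record, one column per variable
wide <- pivot_wider(long, id_cols = id,
                    names_from = variable, values_from = value)
wide <- as.data.frame(wide)  # plain data.frame, as expected for dta

# Wide back to long, for completeness
back <- pivot_longer(wide, cols = c(height, weight),
                     names_to = "variable", values_to = "value")
```

The resulting `wide` object has one row per record and could then be passed as the dta argument.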
The wide-format data.frame provided as input, returned with two extra columns: one for the anomaly scores and one for the outlier flags.
Luca Sartore drwolf85@gmail.com
# Load the package
library(HRTnomaly)
set.seed(2025L)
# Detect outliers in the `iris` dataset
res <- dif(iris)
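A possible way to inspect the result is sketched below. The names of the two appended columns are not stated on this page, so the sketch selects them by position and type; this assumes that the original columns come first and that the outlier flag is the only logical column among the appended ones.

```r
# Sketch only: appended columns are located by position, not by name
library(HRTnomaly)
set.seed(2025L)
res <- dif(iris)

# Columns added after the original ones
extra <- res[, (ncol(iris) + 1L):ncol(res)]
str(extra)

# Assumption: the logical column among them holds the outlier flags
is_flag <- vapply(extra, is.logical, logical(1))
flags <- extra[[which(is_flag)[1]]]

# Count flagged records and list them
table(flags)
res[flags, ]
```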