outqrf: find outliers

View source: R/outqrf.r

outqrf    R Documentation

find outliers

Description

This function finds outliers in a dataset using quantile random forests.

Usage

outqrf(
  data,
  quantiles_type = 1000,
  threshold = 0.025,
  impute = TRUE,
  verbose = 1,
  weight = FALSE,
  ...
)

Arguments

data

a data frame

quantiles_type

the quantile grid to use: 1000 for seq(from = 0.001, to = 0.999, by = 0.001), 400 for seq(from = 0.0025, to = 0.9975, by = 0.0025)

threshold

a threshold for outlier detection

impute

a boolean value indicating whether to impute missing values

verbose

a boolean value indicating whether to print verbose output

weight

a boolean value indicating whether to weight the threshold. If TRUE, the actual threshold will be threshold * r2.

...

additional arguments passed to the ranger function
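
As a rough illustration of the quantiles_type and weight arguments, the sketch below builds the two quantile grids in base R and shows how the threshold changes when weighting is on. The helper effective_threshold is hypothetical and not part of outqrf, and reading r2 as the R-squared of the fitted model is an assumption.

```r
# Quantile grids as listed for quantiles_type
grid_1000 <- seq(from = 0.001, to = 0.999, by = 0.001)
grid_400  <- seq(from = 0.0025, to = 0.9975, by = 0.0025)

# Hypothetical helper (not part of outqrf): the threshold actually applied.
# Assumption: r2 is the R-squared of the fitted model.
effective_threshold <- function(threshold = 0.025, weight = FALSE, r2 = 1) {
  if (weight) threshold * r2 else threshold
}

unweighted <- effective_threshold(0.025, weight = FALSE)
weighted   <- effective_threshold(0.025, weight = TRUE, r2 = 0.8)
```

With weight = TRUE, a model with a lower R-squared ends up with a smaller threshold than the one passed in.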

Value

An object of class "outqrf", which is a list with the following elements:

  • Data: Original data set in unchanged row order

  • outliers: Compact representation of outliers. Each row corresponds to an outlier and contains the following columns:

    • row: Row number of the outlier

    • col: Variable name of the outlier

    • observed: Observed value of the outlier

    • predicted: Predicted value of the outlier

    • rank: Rank of the outlier

  • outMatrix: Predicted value at different quantiles for each observation

  • r.squared: R-squared value of the quantile random forest model

  • oob.error: Out-of-bag error of the quantile random forest model

  • rmse: RMSE of the quantile random forest model

  • threshold: Threshold for outlier detection
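
The outliers element described above is a table with one row per flagged cell, so it can be summarised with ordinary base R. The sketch below uses a hypothetical stand-in data frame with invented values rather than real outqrf output; the rank column is left out because its scale is not described here.

```r
# Hypothetical stand-in for qrf$outliers; all values are invented for illustration
outliers <- data.frame(
  row = c(12L, 57L, 57L, 103L),
  col = c("Sepal.Length", "Petal.Width", "Sepal.Width", "Sepal.Length"),
  observed = c(9.8, 4.1, 0.3, 1.2),
  predicted = c(5.0, 1.3, 2.9, 6.4),
  stringsAsFactors = FALSE
)

# Number of flagged cells per variable
per_variable <- table(outliers$col)

# Rows that contain more than one flagged cell
row_counts <- table(outliers$row)
multi_rows <- as.integer(names(row_counts)[row_counts > 1])

# Absolute gap between observed and predicted values
outliers$gap <- abs(outliers$observed - outliers$predicted)
```

The same calls should work on the real qrf$outliers table, provided it carries the columns listed above.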

Examples

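library(outqrf)
# Inject artificial outliers into iris (p = 0.05)
iris_with_outliers <- generateOutliers(iris, p = 0.05)
# Fit the quantile random forests and flag outliers
qrf <- outqrf(iris_with_outliers)
qrf$outliers
# Compare the detected outliers with the known injected ones
evaluateOutliers(iris, iris_with_outliers, qrf$outliers)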

outqrf documentation built on Sept. 11, 2024, 8:47 p.m.