outselect

The goal of outselect (outlier detection method selection) is to select suitable outlier detection methods for a given dataset using meta-features. The package also provides functions to reproduce some of the instance space results in our papers [@normalizationoutliers] and [@outliersinstance].

This package is still under development; this repository contains the development version of the R package outselect.

Installation

You can install outselect from GitHub with:

# install.packages("devtools")
devtools::install_github("sevvandi/outselect")

Features

Details on features can be found here.
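
As a rough illustration of what a meta-feature is, the sketch below computes a few simple dataset-level summaries in base R. The helper `simple_meta_features` and the toy data are hypothetical and are not part of outselect; the meta-features the package actually computes are different and are described in the papers cited above.

```r
# Hypothetical helper: a few simple dataset-level summaries.
# Illustrative only; these are NOT the meta-features computed by outselect.
simple_meta_features <- function(df) {
  # Keep numeric columns only
  num <- df[vapply(df, is.numeric, logical(1))]
  c(
    n_obs   = nrow(num),                          # number of observations
    n_attr  = ncol(num),                          # number of numeric attributes
    mean_sd = mean(vapply(num, sd, numeric(1)))   # average column standard deviation
  )
}

# Small made-up data frame for illustration
toy <- data.frame(x = c(1, 2, 3, 4), y = c(2, 4, 6, 8))
mf <- simple_meta_features(toy)
mf
```

Each dataset is reduced to one such numeric vector, and it is vectors of this kind that a method-selection model takes as input.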

Min-Max Normalization

These examples are related to the work in [@outliersinstance] and use Min-Max normalization and the outlier detection methods described in [@campos2016evaluation]. For all examples we use the dataset Arrhythmia_withoutdupl_05_v05, which is described in [@campos2016evaluation].
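
For readers unfamiliar with Min-Max normalization, here is a minimal base R sketch, assuming the usual definition that rescales each attribute to the range [0, 1]. The helper `min_max_normalize` and the toy data are hypothetical and are not taken from outselect, which may handle edge cases differently.

```r
# Hypothetical helper (not from outselect): rescale a numeric vector to [0, 1].
min_max_normalize <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    # A constant column has no spread; map it to 0 to avoid dividing by zero.
    return(rep(0, length(x)))
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

# Small made-up data frame for illustration only
toy <- data.frame(a = c(1, 2, 3, 10), b = c(5, 5, 5, 5))

# Normalize every column independently
toy_mm <- as.data.frame(lapply(toy, min_max_normalize))
toy_mm
```

Because the minimum and maximum define the scale, a single extreme value compresses the remaining observations into a narrow part of [0, 1].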

Example 1

This example shows how to compute the meta-features and predict which outlier detection methods are suitable for the dataset Arrhythmia_withoutdupl_05_v05.

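library(outselect)
# Load the example dataset
data(Arrhythmia_withoutdupl_05_v05)
dat <- Arrhythmia_withoutdupl_05_v05
# Compute the meta-features of the dataset
feat <- ComputeMetaFeaturesMM(dat)
# Train the prediction models
fit <- TrainModels(1, 1, 1)
# Predict the performance of the outlier detection methods on this dataset
out <- PredictPerformance(feat, fit)
out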

Example 2

This example shows how to plot the instance Arrhythmia_withoutdupl_05_v05 on the Min-Max instance space.

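library(outselect)
# Load the example dataset
data(Arrhythmia_withoutdupl_05_v05)
dat <- Arrhythmia_withoutdupl_05_v05
# Compute the meta-features of the dataset
feat <- ComputeMetaFeaturesMM(dat)
# Construct the instance space
svmout <- InstSpace(d = 1)
# Plot the new instance on the instance space
PlotNewInstance(svmout, feat, vis = TRUE)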

Min-Max and Median-IQR normalization methods

These examples are related to the work in [@normalizationoutliers]. We use Min-Max and Median-IQR normalization methods for feature computation. For the instance space we use the following normalization and outlier detection method combinations:

  1. Ensemble Median-IQR
  2. LOF Min-Max
  3. KNN Median-IQR
  4. FAST ABOD Min-Max
  5. iForest Median-IQR
  6. KDEOS Median-IQR
  7. KDEOS Min-Max
  8. LDF Min-Max

Again, for all examples we use the dataset Arrhythmia_withoutdupl_05_v05, which is described in [@campos2016evaluation].
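
For comparison with Min-Max, the following base R sketch shows Median-IQR normalization, assuming the common definition that subtracts the median and divides by the interquartile range. The helper `median_iqr_normalize` and the toy vector are hypothetical and are not part of outselect.

```r
# Hypothetical helper (not from outselect): centre on the median and
# scale by the interquartile range.
median_iqr_normalize <- function(x) {
  spread <- IQR(x, na.rm = TRUE)
  if (spread == 0) {
    # No spread in the middle half of the data; avoid dividing by zero.
    return(rep(0, length(x)))
  }
  (x - median(x, na.rm = TRUE)) / spread
}

# Made-up vector with one extreme value
x <- c(1, 2, 3, 4, 100)
x_miqr <- median_iqr_normalize(x)
x_miqr
```

Unlike Min-Max, the median and IQR are barely affected by the extreme value, so the bulk of the data keeps its scale and the extreme value stays far from it. This difference is one reason the choice of normalization can change which observations a detection method flags.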

Example 3

This example shows how to compute the meta-features and predict which combinations of outlier detection method and normalization method are suitable for the dataset Arrhythmia_withoutdupl_05_v05.

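library(outselect)
# Load the example dataset
data(Arrhythmia_withoutdupl_05_v05)
dat <- Arrhythmia_withoutdupl_05_v05
# Compute the meta-features of the dataset
feat <- ComputeMetaFeaturesAll(dat)
# Train the prediction models
fit <- TrainModels(d = 2, 1, 1)
# Predict the performance of each combination on this dataset
out <- PredictPerformance(feat, fit)
out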

Example 4

This example plots the same dataset as an instance in the instance space of outlier detection and normalization method combinations.

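library(outselect)
# Load the example dataset
data(Arrhythmia_withoutdupl_05_v05)
dat <- Arrhythmia_withoutdupl_05_v05
# Compute the meta-features of the dataset
feat <- ComputeMetaFeaturesAll(dat)
# Construct the instance space
svmout <- InstSpace(d = 2)
# Plot the new instance on the instance space
PlotNewInstance(svmout, feat, vis = TRUE)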

References



sevvandi/outselect documentation built on June 1, 2019, 3:58 a.m.