knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(OutliersLearn);
The OutliersLearn R package allows users to learn how outlier detection algorithms work.
Most of the following examples use the same dataset, which is declared as inputData:
inputData = t(matrix(c(3,2,3.5,12,4.7,4.1,5.2,4.9,7.1,6.1,6.2,5.2,14,5.3),2,7,dimnames=list(c("r","d"))));
inputData = data.frame(inputData);
print(inputData);
As can be seen, this is a two-dimensional dataset (a data.frame) with 7 rows. It can be visualized graphically like this:
plot(inputData);
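If it helps to relate points to rows, a labelled version of the same plot can be produced with plain base R; the column names r and d come from the dimnames set above, and the labelling is only an optional illustration:

plot(inputData$r, inputData$d, xlab = "r", ylab = "d", main = "inputData"); #Same scatter plot with axis labels
text(inputData$r, inputData$d, labels = rownames(inputData), pos = 3); #Annotate each point with its row number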
With that being said, the following section explains how to execute the auxiliary functions.
This section shows how to call the auxiliary functions of the OutliersLearn R package. These include:
- euclidean_distance()
- mahalanobis_distance()
- manhattan_dist()
- mean_outliersLearn()
- sd_outliersLearn()
- quantile_outliersLearn()
- transform_to_vector()
First, the distance functions:
- Euclidean distance (euclidean_distance())

point1 = inputData[1,];
point2 = inputData[4,];
distance = euclidean_distance(point1, point2);
print(distance);
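As a quick sanity check, the same value can be computed directly with base R; if euclidean_distance() implements the standard Euclidean distance, both numbers should match:

manual_distance = sqrt(sum((unlist(point1) - unlist(point2))^2)); #Standard Euclidean formula applied to the two rows
print(manual_distance);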
- Mahalanobis distance (mahalanobis_distance())

inputDataMatrix = as.matrix(inputData); #Required conversion for this function
sampleMeans = c();
#Calculate the mean for each column
for(i in 1:ncol(inputDataMatrix)){
  column = inputDataMatrix[,i];
  calculatedMean = sum(column)/length(column);
  sampleMeans = c(sampleMeans, calculatedMean);
}
#Calculate the covariance matrix
covariance_matrix = cov(inputDataMatrix);
distance = mahalanobis_distance(inputDataMatrix[3,], sampleMeans, covariance_matrix);
print(distance)
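Base R also provides a mahalanobis() function, which returns the squared Mahalanobis distance; it can serve as an illustrative cross-check. Whether mahalanobis_distance() returns the squared distance or its square root is not assumed here, so both values are printed:

squared_md = mahalanobis(inputDataMatrix[3,], colMeans(inputDataMatrix), covariance_matrix); #Squared distance from base R
print(squared_md);
print(sqrt(squared_md)); #Its square root, in case the package reports the non-squared distance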
- Manhattan distance (manhattan_dist())

distance = manhattan_dist(c(1,2), c(3,4));
print(distance);
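Again, a manual base R computation can verify the result, assuming manhattan_dist() implements the usual sum of absolute differences:

manual_manhattan = sum(abs(c(1,2) - c(3,4))); #Sum of absolute coordinate differences
print(manual_manhattan);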
The statistical functions can be used like this:
- Mean (mean_outliersLearn())

mean = mean_outliersLearn(inputData[,1]);
print(mean);
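Assuming mean_outliersLearn() computes the arithmetic mean, the value can be compared against base R:

print(base::mean(inputData[,1])); #Arithmetic mean from base R, for comparison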
- Standard deviation (sd_outliersLearn())

sd = sd_outliersLearn(inputData[,1], mean);
print(sd);
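Note that base R's sd() divides by n-1 (sample standard deviation); if the package uses the population formula (dividing by n), the two values will differ slightly. The comparison below is only illustrative:

print(stats::sd(inputData[,1])); #Sample standard deviation (n-1 denominator)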
- Quantile (quantile_outliersLearn())

q = quantile_outliersLearn(c(12,2,3,4,1,13), 0.60);
print(q);
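Base R's quantile() supports several interpolation types (type = 7 by default), so its result may not match quantile_outliersLearn() exactly; the call below is only meant as a reference point:

print(stats::quantile(c(12,2,3,4,1,13), 0.60)); #Base R 60% quantile with the default interpolation type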
Finally, the data-transforming function:
- Transform to vector (transform_to_vector())

numeric_data = c(1, 2, 3)
character_data = c("a", "b", "c")
logical_data = c(TRUE, FALSE, TRUE)
factor_data = factor(c("A", "B", "A"))
integer_data = as.integer(c(1, 2, 3))
complex_data = complex(real = c(1, 2, 3), imaginary = c(4, 5, 6))
list_data = list(1, "apple", TRUE)
data_frame_data = data.frame(x = c(1, 2, 3), y = c("a", "b", "c"))

transformed_numeric = transform_to_vector(numeric_data); print(transformed_numeric);
transformed_character = transform_to_vector(character_data); print(transformed_character);
transformed_logical = transform_to_vector(logical_data); print(transformed_logical);
transformed_factor = transform_to_vector(factor_data); print(transformed_factor);
transformed_integer = transform_to_vector(integer_data); print(transformed_integer);
transformed_complex = transform_to_vector(complex_data); print(transformed_complex);
transformed_list = transform_to_vector(list_data); print(transformed_list);
transformed_data_frame = transform_to_vector(data_frame_data); print(transformed_data_frame);
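To see exactly what each call returns, the class of the transformed objects can be inspected; this is plain base R and makes no assumption about the internals of transform_to_vector():

print(class(transformed_numeric)); #Class of the transformed numeric input
print(class(transformed_data_frame)); #Class of the transformed data.frame input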
Now that the auxiliary functions are understood, the following section details the main algorithms implemented for outlier detection.
The main outlier detection methods implemented in the OutliersLearn package are:
- boxandwhiskers()
- DBSCAN_method()
- knn()
- lof()
- mahalanobis_method()
- z_score_method()
This section is dedicated to showing how to use these algorithm implementations.
- Box and whiskers (boxandwhiskers())

With the tutorial mode deactivated and d=2:
boxandwhiskers(inputData,2,FALSE)
With the tutorial mode activated and d=2:
boxandwhiskers(inputData,2,TRUE)
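Independently of the package, base R's boxplot.stats() reports the values that lie beyond the 1.5*IQR whiskers of each column, which can help interpret the output above (no assumption is made here about how the d argument is used internally):

print(boxplot.stats(inputData$r)$out); #Values beyond the whiskers in column r
print(boxplot.stats(inputData$d)$out); #Values beyond the whiskers in column d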
- DBSCAN (DBSCAN_method())

With the tutorial mode deactivated:
eps = 4; min_pts = 3; DBSCAN_method(inputData, eps, min_pts, FALSE);
With the tutorial mode activated:
eps = 4; min_pts = 3; DBSCAN_method(inputData, eps, min_pts, TRUE);
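To get a feel for the parameters, the same call can be repeated with a smaller eps; with a tighter neighbourhood radius, points need closer neighbours to form a cluster, so more of them are typically reported as noise. This is just a parameter variation of the call above:

DBSCAN_method(inputData, 2, 3, FALSE); #Same data and min_pts = 3, smaller eps = 2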
- K-Nearest Neighbors (knn())

With the tutorial mode deactivated, K=2 and d=3:
knn(inputData,3,2,FALSE)
With the tutorial mode activated, K=2 and d=3:
knn(inputData,3,2,TRUE)
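Keeping the argument order used above (data, d, K, tutorial mode), the effect of the neighbourhood size can be explored by increasing K; this is only a parameter variation, not a different usage pattern:

knn(inputData, 3, 3, FALSE); #Same d=3, larger K=3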
- Local Outlier Factor (lof())

With the tutorial mode deactivated, K=3 and the threshold set to 0.5:
lof(inputData, 3, 0.5, FALSE);
With the tutorial mode activated and same input parameters:
lof(inputData, 3, 0.5, TRUE);
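Since LOF scores close to 1 indicate a point whose local density is similar to that of its neighbours, raising the threshold should flag fewer points. The call below simply varies the threshold used above:

lof(inputData, 3, 1.5, FALSE); #Same K=3, higher threshold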
- Mahalanobis method (mahalanobis_method())

With the tutorial mode deactivated and alpha set to 0.7:
mahalanobis_method(inputData, 0.7, FALSE);
With the tutorial mode activated and same value of alpha:
mahalanobis_method(inputData, 0.7, TRUE);
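For reference, the squared Mahalanobis distance of every row can also be obtained with base R's mahalanobis(); how mahalanobis_method() derives its cutoff from alpha is not assumed here, but a chi-squared quantile is one common choice to compare against:

squared_md = mahalanobis(as.matrix(inputData), colMeans(inputData), cov(inputData)); #Squared distance of each row to the sample mean
print(squared_md);
print(qchisq(0.7, df = ncol(inputData))); #One possible chi-squared cutoff for alpha = 0.7 (illustrative only)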
- Z-score (z_score_method())

With the tutorial mode deactivated and d set to 2:
z_score_method(inputData,2,FALSE);
With the tutorial mode activated and same value of d:
z_score_method(inputData,2,TRUE);
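A manual z-score computation with base R gives an idea of which rows stand out. It is assumed here (not guaranteed) that d acts as the |z| cutoff, and scale() uses the sample standard deviation, so the flagged rows may not coincide exactly with the package output:

z_scores = scale(as.matrix(inputData)); #Center and scale each column
print(which(apply(abs(z_scores) > 2, 1, any))); #Rows with |z| above 2 in at least one column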