Detects and replaces possible outliers in a data set.
R6::R6Class object.
Performs Grubbs' test to detect outliers in the normalized and Z-score transformed data set. Replaces missing values with substitutes using classical and AI-powered substitution algorithms. For this purpose, outliers are handled as imputation sites.
outliersParameter
Returns the instance variable outliersParameter. (tibble::tibble)
outliers
Returns the instance variable outliers. (tibble::tibble)
one_hot_df
Returns the positions of missings in one_hot encoding (tibble::tibble)
outliersStatistics
Returns the instance variable outliersStatistics. (tibble::tibble)
outliersAgentAlphabet
Returns the instance variable outliersAgentAlphabet. (character)
outliersAgent
Returns the instance variable outliersAgent. (character)
setOutliersAgent
Sets the instance variable outliersAgent. (character)
featureData
Returns the instance variable featureData. (numeric)
alpha
Returns the instance variable alpha. (numeric)
setAlpha
Sets the instance variable alpha. (numeric)
epsilon
Returns the instance variable epsilon. (numeric)
setEpsilon
Sets the instance variable epsilon. (numeric)
minSamples
Returns the instance variable minSamples. (integer)
setMinSamples
Sets the instance variable minSamples. (integer)
gamma
Returns the instance variable gamma. (numeric)
setGamma
Sets the instance variable gamma. (numeric)
nu
Returns the instance variable nu. (numeric)
setNu
Sets the instance variable nu. (numeric)
k
Returns the instance variable k. (integer)
setK
Sets the instance variable k. (integer)
cutoff
Returns the instance variable cutoff. (numeric)
setCutoff
Sets the instance variable cutoff. (numeric)
seed
Returns the instance variable seed. (integer)
setSeed
Sets the instance variable seed. (integer)
new()
Creates and returns a new pgu.outliers object.
pgu.outliers$new( data_df = "tbl_df", alpha = 0.05, epsilon = 0.1, minSamples = 4, gamma = 0.05, nu = 0.1, k = 4, cutoff = 0.99, seed = 42 )
data_df
The data to be cleaned. (tibble::tibble)
alpha
Initial definition of the instance variable alpha. (numeric)
epsilon
Initial definition of the instance variable epsilon. (numeric)
minSamples
Initial definition of the instance variable minSamples. (integer)
gamma
Initial definition of the instance variable gamma. (numeric)
nu
Initial definition of the instance variable nu. (numeric)
k
Initial definition of the instance variable k. (integer)
cutoff
Initial definition of the instance variable cutoff. (numeric)
seed
Initial definition of the instance variable seed. (integer)
A new pgu.outliers object. (pguIMP::pgu.outliers)
finalize()
Clears the heap and indicates that the instance of pgu.outliers has been removed from the heap.
pgu.outliers$finalize()
print()
Prints the instance variables of a pgu.outliers object.
pgu.outliers$print()
string
resetOutliers()
Resets the instance variables and performs Grubbs' test to detect outliers in the normalized and Z-score transformed data set. Progress is indicated by the progress object passed to the function.
pgu.outliers$resetOutliers(data_df = "tbl_df")
data_df
Dataframe to be analyzed. (tibble::tibble)
filterFeatures()
Filters attributes from the given dataframe that are known to the class.
pgu.outliers$filterFeatures(data_df = "tbl_df")
data_df
Dataframe to be filtered. (tibble::tibble)
A filtered dataframe. (tibble::tibble)
checkFeatureValidity()
Checks if the feature consists of a sufficient number of instances.
pgu.outliers$checkFeatureValidity(data_df = "tbl_df", feature = "character")
data_df
Dataframe to be analyzed. (tibble::tibble)
feature
The attribute to be analyzed. (character)
detectOutliersParameter()
Determines the outliers parameter by analyzing the tibble data_df and the instance variable outliers. Results are stored in the instance variable outliersParameter.
pgu.outliers$detectOutliersParameter(data_df = "tbl_df")
data_df
Dataframe to be analyzed. (tibble::tibble)
outliersFeatureList()
Characterizes each row of the data frame as either "complete" or by the name of the attribute that has been identified as an outlier within the row. If entries of multiple attributes were identified as outliers in the same row, the row is characterized as "multiple".
pgu.outliers$outliersFeatureList(data_df = "tbl_df")
data_df
The data frame to be analyzed. (tibble::tibble)
Vector of row characteristics. (character)
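The row labelling described above can be sketched in base R as follows. This is only an illustration, not the package's implementation: the helper name row_labels and the columns row and feature of the outlier table are hypothetical, and the layout of the real instance variable outliers may differ.

```r
# Hypothetical helper: label each row as "complete", by the single
# affected attribute, or as "multiple".
row_labels <- function(n_rows, outliers_df) {
  labels <- rep("complete", n_rows)
  for (r in unique(outliers_df$row)) {
    # Attributes flagged as outliers in row r.
    hits <- unique(outliers_df$feature[outliers_df$row == r])
    labels[r] <- if (length(hits) > 1) "multiple" else hits
  }
  labels
}

# Assumed outlier table: row index plus attribute name.
outliers_df <- data.frame(
  row = c(2L, 3L, 3L),
  feature = c("a", "a", "b"),
  stringsAsFactors = FALSE
)
labels <- row_labels(4, outliers_df)
```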
featureOutlier()
Returns the detected outliers of a given attribute.
pgu.outliers$featureOutlier(feature = "character")
feature
The attribute to be analyzed. (character)
The attribute's outliers (tibble::tibble)
one_hot()
Gathers statistical information about missing values in one-hot format. The result is stored in the instance variable one_hot_df.
pgu.outliers$one_hot(data_df = "tbl_df")
data_df
The data frame to be analyzed. (tibble::tibble)
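As a rough illustration of what a one-hot encoding of missing positions can look like, the base R sketch below marks every missing entry with 1 and every present entry with 0. The helper name one_hot_missings is hypothetical, and the exact layout of one_hot_df in the package is not documented here and may differ.

```r
# Hypothetical helper: 1 where a value is missing, 0 otherwise.
one_hot_missings <- function(data_df) {
  # is.na() on a data frame yields a logical matrix of the same shape;
  # ifelse() keeps that shape and the column names.
  as.data.frame(ifelse(is.na(data_df), 1L, 0L))
}

data_df <- data.frame(a = c(1.0, NA, 3.0), b = c(NA, 0.2, 0.3))
encoded <- one_hot_missings(data_df)
```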
detectOutliers()
Chooses a method for the identification of anomalies based on the instance variable outliersAgent. Detects anomalies in a data frame by one-dimensional analysis of each feature.
pgu.outliers$detectOutliers(data_df = "tbl_df", progress = "Progress")
data_df
Data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
detectByGrubbs()
Identifies anomalies in the data frame based on Grubbs' test. Iterates over the whole data frame. Calls the object's public function grubbs_numeric until no more anomalies are identified. The threshold for anomaly detection is defined in the instance variable alpha. Displays the progress if shiny is loaded.
pgu.outliers$detectByGrubbs(data_df = "tbl_df", progress = "Progress")
data_df
Data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
grubbs_numeric()
Performs Grubbs' test to detect a single outlier in the provided attribute's data. If an outlier is found, it is added to the instance variable outliers. The threshold for anomaly detection is defined in the instance variable alpha. The function indicates a find by returning TRUE.
pgu.outliers$grubbs_numeric(data_df = "tbl_df", feature = "character")
data_df
The data frame to be analyzed. (tibble::tibble)
feature
The attribute within the data frame to be analyzed.
Feedback if an outlier was found. (logical)
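The single-outlier test, and the repeat-until-nothing-is-found loop described for detectByGrubbs, can be sketched in base R as follows. This is a minimal sketch of the textbook two-sided Grubbs' test, not the package's code: grubbs_step and detect_all are hypothetical names, and the input is assumed to be a numeric vector without missing values.

```r
# One two-sided Grubbs' test step (hypothetical helper).
grubbs_step <- function(x, alpha = 0.05) {
  n <- length(x)
  # The test needs at least three values and a non-zero spread.
  if (n < 3 || stats::sd(x) == 0) {
    return(list(found = FALSE, index = NA_integer_))
  }
  deviation <- abs(x - mean(x))
  idx <- which.max(deviation)          # most extreme value
  g <- deviation[idx] / stats::sd(x)   # Grubbs' statistic
  # Critical value derived from the t distribution with n - 2 df.
  t_crit <- stats::qt(1 - alpha / (2 * n), df = n - 2)
  g_crit <- (n - 1) / sqrt(n) * sqrt(t_crit^2 / (n - 2 + t_crit^2))
  list(found = g > g_crit, index = idx)
}

# Repeat the test on the remaining values until no outlier is found.
detect_all <- function(x, alpha = 0.05) {
  candidates <- seq_along(x)           # positions still under test
  outliers <- integer(0)
  repeat {
    step <- grubbs_step(x[candidates], alpha)
    if (!step$found) break
    outliers <- c(outliers, candidates[step$index])
    candidates <- candidates[-step$index]
  }
  sort(outliers)
}

values <- c(9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 25.0)
outlier_positions <- detect_all(values)
```

Removing each detected value before testing again matters because a strong outlier inflates the standard deviation and can mask weaker ones.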
detectByDbscan()
Identifies anomalies in the data frame based on DBSCAN. Iterates over the whole data frame. Calls the object's public function dbscan_numeric until all features are analyzed. The cluster hyperparameters are defined in the instance variables epsilon and minSamples. The results of the dbscan_numeric routine are added to the instance variable outliers. Displays the progress if shiny is loaded.
pgu.outliers$detectByDbscan(data_df = "tbl_df", progress = "Progress")
data_df
Data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
dbscan_numeric()
Identifies anomalies in a single feature of a data frame based on DBSCAN. The cluster hyperparameters are defined in the instance variables epsilon and minSamples. Displays the progress if shiny is loaded.
pgu.outliers$dbscan_numeric(data_df = "tbl_df", feature = "character")
data_df
Data frame to be analyzed. (tibble::tibble)
feature
Feature to be analyzed. (character)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
A data frame comprising the information about detected anomalies of the feature. (tibble::tibble)
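To illustrate how DBSCAN flags anomalies in a single feature, the base R sketch below marks as noise every value that is neither a core point nor within epsilon of one. It is a simplified one-dimensional illustration and not the package's implementation: the helper name dbscan_noise_1d is hypothetical, and it assumes that min_samples counts the point itself, which is a convention that varies between DBSCAN implementations.

```r
# Hypothetical helper: positions of DBSCAN noise points in a numeric vector.
dbscan_noise_1d <- function(x, epsilon = 0.1, min_samples = 4) {
  dist_mat <- abs(outer(x, x, "-"))       # pairwise distances
  neighbors <- dist_mat <= epsilon        # includes the point itself
  # Core points have at least min_samples neighbors within epsilon.
  is_core <- rowSums(neighbors) >= min_samples
  # A point belongs to a cluster if a core point lies within epsilon.
  reachable <- rowSums(neighbors[, is_core, drop = FALSE]) > 0
  which(!reachable)
}

x <- c(0.00, 0.02, 0.04, 0.05, 0.07, 0.09, 1.50)
noise <- dbscan_noise_1d(x, epsilon = 0.1, min_samples = 4)
```

In this sketch a smaller epsilon or a larger min_samples makes the detection stricter, so more values end up as noise.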
detectBySvm()
Identifies anomalies in the data frame based on one-class SVM. Iterates over the whole data frame. Calls the object's public function svm_numeric until all features are analyzed. The hyperparameters are defined in the instance variables gamma and nu. The results of the svm_numeric routine are added to the instance variable outliers. Displays the progress if shiny is loaded.
pgu.outliers$detectBySvm(data_df = "tbl_df", progress = "Process")
data_df
Data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
svm_numeric()
Identifies anomalies in a single feature of a data frame based on one-class SVM. The hyperparameters are defined in the instance variables gamma and nu. Displays the progress if shiny is loaded.
pgu.outliers$svm_numeric(data_df = "tbl_df", feature = "character")
data_df
Data frame to be analyzed. (tibble::tibble)
feature
Feature to be analyzed. (character)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
A data frame comprising the information about detected anomalies of the feature. (tibble::tibble)
detectByKnn()
Identifies anomalies in the data frame based on knnO. Iterates over the whole data frame. Calls the object's public function knn_numeric until all features are analyzed. The hyperparameters are defined in the instance variables alpha and minSamples. The results of the knn_numeric routine are added to the instance variable outliers. Displays the progress if shiny is loaded.
pgu.outliers$detectByKnn(data_df = "tbl_df", progress = "Process")
data_df
Data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
knn_numeric()
Identifies anomalies in a single feature of a data frame based on knnO. The hyperparameters are defined in the instance variables alpha and minSamples. Displays the progress if shiny is loaded.
pgu.outliers$knn_numeric(data_df = "tbl_df", feature = "character")
data_df
Data frame to be analyzed. (tibble::tibble)
feature
Feature to be analyzed. (character)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
A data frame comprising the information about detected anomalies of the feature. (tibble::tibble)
setImputationSites()
Replaces the detected anomalies of a user-provided data frame with NA for further imputation routines.
pgu.outliers$setImputationSites(data_df = "tbl_df")
data_df
Data frame to be mutated. (tibble::tibble)
A data frame with anomalies replaced by NA. (tibble::tibble)
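Treating outliers as imputation sites amounts to overwriting each flagged cell with NA so that a later imputation step fills it in. The base R sketch below shows the idea only. The helper name set_imputation_sites and the columns row and feature of the outlier table are hypothetical, and the layout of the package's own outliers variable may differ.

```r
# Hypothetical helper: overwrite every flagged cell with NA.
set_imputation_sites <- function(data_df, outliers_df) {
  for (i in seq_len(nrow(outliers_df))) {
    data_df[outliers_df$row[i], outliers_df$feature[i]] <- NA
  }
  data_df
}

data_df <- data.frame(a = c(1.0, 2.0, 50.0), b = c(0.1, 9.9, 0.3))
# Assumed outlier table: row index plus attribute name.
outliers_df <- data.frame(
  row = c(3L, 2L),
  feature = c("a", "b"),
  stringsAsFactors = FALSE
)
cleaned <- set_imputation_sites(data_df, outliers_df)
```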
calcOutliersStatistics()
Calculates statistics on the previously performed outlier detection analysis and stores the results in the instance variable outliersStatistics.
pgu.outliers$calcOutliersStatistics(data_df = "tbl_df")
data_df
The data frame to be analyzed. (tibble::tibble)
outlierTable()
Creates a datatable in which substituted outliers are highlighted by a colored background.
pgu.outliers$outlierTable(data_df = "tbl_df")
data_df
The data frame to be analyzed. (tibble::tibble)
A colored datatable (DT::datatable)
plotOutliersDistribution()
Displays the occurrence of outlier candidates per attribute as a bar plot.
pgu.outliers$plotOutliersDistribution()
A bar plot. (ggplot2::ggplot)
featureBarPlot()
Displays the distribution of an attribute's values as a histogram.
pgu.outliers$featureBarPlot(data_df = "tbl_df", feature = "character")
data_df
Dataframe to be analyzed. (tibble::tibble)
feature
Attribute to be shown. (character)
A histogram. (ggplot2::ggplot)
featureBoxPlotWithSubset()
Displays the distribution of an attribute's values as a box plot.
pgu.outliers$featureBoxPlotWithSubset( data_df = "tbl_df", feature = "character" )
data_df
Dataframe to be analyzed. (tibble::tibble)
feature
Attribute to be shown. (character)
A box plot. (ggplot2::ggplot)
featurePlot()
Displays the distribution of an attribute's values as a composition of a box plot and a histogram.
pgu.outliers$featurePlot(data_df = "tbl_df", feature = "character")
data_df
Dataframe to be analyzed. (tibble::tibble)
feature
Attribute to be shown. (character)
A composite plot. (ggplot2::ggplot)
clone()
The objects of this class are cloneable with this method.
pgu.outliers$clone(deep = FALSE)
deep
Whether to make a deep clone.
Sebastian Malkusch, malkusch@med.uni-frankfurt.de