Detects and replaces possible outliers in a data set.
R6::R6Class object.
Performs Grubbs' test for outliers to detect outliers in the normalized and z-score transformed data set. Replaces missing values with substitutes using classical and AI-powered substitution algorithms. For this purpose, outliers are handled as imputation sites.
outliersParameter: Returns the instance variable outliersParameter. (tibble::tibble)
outliers: Returns the instance variable outliers. (tibble::tibble)
one_hot_df: Returns the positions of missing values in one-hot encoding. (tibble::tibble)
outliersStatistics: Returns the instance variable outliersStatistics. (tibble::tibble)
outliersAgentAlphabet: Returns the instance variable outliersAgentAlphabet. (character)
outliersAgent: Returns the instance variable outliersAgent. (character)
setOutliersAgent: Sets the instance variable outliersAgent. (character)
featureData: Returns the instance variable featureData. (numeric)
alpha: Returns the instance variable alpha. (numeric)
setAlpha: Sets the instance variable alpha. (numeric)
epsilon: Returns the instance variable epsilon. (numeric)
setEpsilon: Sets the instance variable epsilon. (numeric)
minSamples: Returns the instance variable minSamples. (integer)
setMinSamples: Sets the instance variable minSamples. (integer)
gamma: Returns the instance variable gamma. (numeric)
setGamma: Sets the instance variable gamma. (numeric)
nu: Returns the instance variable nu. (numeric)
setNu: Sets the instance variable nu. (numeric)
k: Returns the instance variable k. (integer)
setK: Sets the instance variable k. (integer)
cutoff: Returns the instance variable cutoff. (numeric)
setCutoff: Sets the instance variable cutoff. (numeric)
seed: Returns the instance variable seed. (integer)
setSeed: Sets the instance variable seed. (integer)
new(): Creates and returns a new pgu.outliers object.
pgu.outliers$new( data_df = "tbl_df", alpha = 0.05, epsilon = 0.1, minSamples = 4, gamma = 0.05, nu = 0.1, k = 4, cutoff = 0.99, seed = 42 )
data_df: The data to be cleaned. (tibble::tibble)
alpha: Initial definition of the instance variable alpha. (numeric)
epsilon: Initial definition of the instance variable epsilon. (numeric)
minSamples: Initial definition of the instance variable minSamples. (integer)
gamma: Initial definition of the instance variable gamma. (numeric)
nu: Initial definition of the instance variable nu. (numeric)
k: Initial definition of the instance variable k. (integer)
cutoff: Initial definition of the instance variable cutoff. (numeric)
seed: Initial definition of the instance variable seed. (integer)
A new pgu.outliers object.
(pguIMP::pgu.outliers)
finalize(): Clears the heap and indicates that the instance of pgu.outliers has been removed from the heap.
pgu.outliers$finalize()
print(): Prints the instance variables of a pgu.outliers object.
pgu.outliers$print()
string
resetOutliers(): Resets the instance variables and performs Grubbs' test for outliers to detect outliers in the normalized and z-score transformed data set. Progress is indicated by the progress object passed to the function.
pgu.outliers$resetOutliers(data_df = "tbl_df")
data_df: Data frame to be analyzed. (tibble::tibble)
filterFeatures(): Filters the attributes of the given data frame that are known to the class.
pgu.outliers$filterFeatures(data_df = "tbl_df")
data_df: Data frame to be filtered. (tibble::tibble)
A filtered data frame. (tibble::tibble)
checkFeatureValidity(): Checks if the feature consists of a sufficient number of instances.
pgu.outliers$checkFeatureValidity(data_df = "tbl_df", feature = "character")
data_df: Data frame to be analyzed. (tibble::tibble)
feature: The attribute to be analyzed. (character)
detectOutliersParameter(): Determines the outliers parameter by analyzing the tibble data_df and the instance variable outliers. The results are stored in the instance variable outliersParameter.
pgu.outliers$detectOutliersParameter(data_df = "tbl_df")
data_df: Data frame to be analyzed. (tibble::tibble)
outliersFeatureList(): Characterizes each row of the data frame as either complete
or indicates which attribute has been identified as an outlier within the row.
If multiple attributes' row entries were identified as outliers, the row is characterized as multiple.
pgu.outliers$outliersFeatureList(data_df = "tbl_df")
data_df: The data frame to be analyzed. (tibble::tibble)
Vector of row characteristics. (character)
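The row characterization described above can be sketched in base R. This is only an illustration under stated assumptions: the helper name characterize_rows and the input layout (a logical table with one column per attribute, TRUE where a value was flagged) are hypothetical and are not taken from pguIMP, which may store this information differently.

```r
# Hypothetical helper, not part of pguIMP.
# flags: logical matrix or data frame, TRUE where a value was flagged as an outlier.
characterize_rows <- function(flags) {
  flags <- as.matrix(flags)
  n_flagged <- rowSums(flags)

  # Rows without any flagged value are labeled "complete".
  labels <- rep("complete", nrow(flags))

  # Rows with exactly one flagged value get the name of that attribute.
  single <- which(n_flagged == 1)
  labels[single] <- vapply(
    single,
    function(i) colnames(flags)[which(flags[i, ])],
    character(1),
    USE.NAMES = FALSE
  )

  # Rows with more than one flagged value are labeled "multiple".
  labels[n_flagged > 1] <- "multiple"
  labels
}

flags <- data.frame(
  a = c(FALSE, TRUE, TRUE, FALSE),
  b = c(FALSE, FALSE, TRUE, TRUE)
)
row_labels <- characterize_rows(flags)
```

Here the second row is labeled with the attribute name a because only that column is flagged, while the third row, with both columns flagged, is labeled multiple.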
featureOutlier(): Returns the detected outliers of a given attribute.
pgu.outliers$featureOutlier(feature = "character")
feature: The attribute to be analyzed. (character)
The attribute's outliers. (tibble::tibble)
one_hot(): Gathers statistical information about missing values in one-hot format. The result is stored in the instance variable one_hot_df.
pgu.outliers$one_hot(data_df = "tbl_df")
data_df: The data frame to be analyzed. (tibble::tibble)
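A one-hot encoding of missing values could look like the following base R sketch. The helper name one_hot_missing is hypothetical, and the convention that 1 marks a missing value and 0 an observed one is an assumption; the page does not specify the exact layout of one_hot_df.

```r
# Hypothetical helper, not part of pguIMP.
# Returns a data frame of the same shape as data_df:
# 1 where the value is missing (NA), 0 where it is observed.
one_hot_missing <- function(data_df) {
  as.data.frame(lapply(data_df, function(column) as.integer(is.na(column))))
}

df <- data.frame(x = c(1.2, NA, 3.4), y = c(NA, NA, 0.5))
encoded <- one_hot_missing(df)
```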
detectOutliers(): Chooses a method for the identification of anomalies based on the instance variable outliersAgent.
Detects anomalies in a data frame by one-dimensional analysis of each feature.
pgu.outliers$detectOutliers(data_df = "tbl_df", progress = "Progress")
data_df: Data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
detectByGrubbs(): Identifies anomalies in the data frame based on Grubbs' test.
Iterates over the whole data frame. Calls the object's public function
grubbs_numeric until no more anomalies are identified.
The threshold for anomaly detection is defined in the instance variable alpha.
Displays the progress if shiny is loaded.
pgu.outliers$detectByGrubbs(data_df = "tbl_df", progress = "Progress")
data_df: Data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
grubbs_numeric(): Performs Grubbs' test for anomalies to detect a single outlier in the provided attribute's data.
If an outlier is found, it is added to the instance variable outliers.
The threshold for anomaly detection is defined in the instance variable alpha.
The function indicates a find by returning positive feedback.
pgu.outliers$grubbs_numeric(data_df = "tbl_df", feature = "character")
data_df: The data frame to be analyzed. (tibble::tibble)
feature: The attribute within the data frame to be analyzed. (character)
Feedback if an outlier was found. (logical)
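To illustrate the interplay of a single test step and the repeated calls made by detectByGrubbs, the following base R sketch applies the textbook two-sided Grubbs' test to one numeric vector. The helper names grubbs_step and detect_by_grubbs are hypothetical, and the code is not pguIMP's implementation; in particular, it compares the test statistic against a critical value derived from the t distribution, which may differ from how the package evaluates the test.

```r
# Hypothetical helper, not part of pguIMP: one two-sided Grubbs' test step.
# Returns whether the most extreme value is an outlier at level alpha,
# together with its position in x.
grubbs_step <- function(x, alpha = 0.05) {
  observed <- x[!is.na(x)]
  n <- length(observed)
  # The test needs at least three values and a non-zero spread.
  if (n < 3 || stats::sd(observed) == 0) {
    return(list(found = FALSE, index = NA_integer_))
  }
  deviation <- abs(x - mean(observed))  # NA entries stay NA
  g <- max(deviation, na.rm = TRUE) / stats::sd(observed)
  # Critical value of the two-sided Grubbs' test.
  t_crit <- stats::qt(alpha / (2 * n), df = n - 2, lower.tail = FALSE)
  g_crit <- (n - 1) / sqrt(n) * sqrt(t_crit^2 / (n - 2 + t_crit^2))
  if (g > g_crit) {
    list(found = TRUE, index = which.max(deviation))  # which.max() skips NA
  } else {
    list(found = FALSE, index = NA_integer_)
  }
}

# Hypothetical helper: repeat the single step until no further outlier is found.
# Detected values are masked with NA so the next step ignores them.
detect_by_grubbs <- function(x, alpha = 0.05) {
  outlier_index <- integer(0)
  repeat {
    step <- grubbs_step(x, alpha)
    if (!step$found) break
    outlier_index <- c(outlier_index, step$index)
    x[step$index] <- NA
  }
  outlier_index
}

values <- c(9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 25.0)
detected <- detect_by_grubbs(values, alpha = 0.05)
```

In this example only the last value (25.0) is flagged; after it is masked, the remaining values no longer exceed the critical value and the loop stops.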
detectByDbscan(): Identifies anomalies in the data frame based on DBSCAN.
Iterates over the whole data frame. Calls the object's public function
dbscan_numeric until all features are analyzed.
The clustering hyperparameters are defined in the instance variables epsilon and minSamples.
The results of the dbscan_numeric routine are added to the instance variable outliers.
Displays the progress if shiny is loaded.
pgu.outliers$detectByDbscan(data_df = "tbl_df", progress = "Progress")
data_df: Data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
dbscan_numeric(): Identifies anomalies in a single feature of a data frame based on DBSCAN.
The clustering hyperparameters are defined in the instance variables epsilon and minSamples.
Displays the progress if shiny is loaded.
pgu.outliers$dbscan_numeric(data_df = "tbl_df", feature = "character")
data_df: Data frame to be analyzed. (tibble::tibble)
feature: Feature to be analyzed. (character)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
A data frame comprising the information about detected anomalies of the feature. (tibble::tibble)
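How DBSCAN marks values of a single feature as anomalies can be sketched in base R without any clustering package. The helper name dbscan_noise_1d is hypothetical and the code is not pguIMP's implementation. It assumes the common convention that a point counts towards its own neighborhood, and it expects a numeric vector without missing values. A value is reported as noise if it is neither a core point nor within epsilon of a core point.

```r
# Hypothetical helper, not part of pguIMP: DBSCAN noise points of one numeric feature.
# x must not contain NA. Returns the positions of the noise points in x.
dbscan_noise_1d <- function(x, epsilon = 0.1, min_samples = 4) {
  # Pairwise absolute distances between all values of the feature.
  distance <- abs(outer(x, x, "-"))
  within_eps <- distance <= epsilon  # each point is its own neighbor

  # Core points have at least min_samples neighbors within epsilon.
  is_core <- rowSums(within_eps) >= min_samples

  # Border points are not core themselves but lie within epsilon of a core point.
  near_core <- rowSums(within_eps[, is_core, drop = FALSE]) > 0

  # Everything else is noise, i.e. an anomaly candidate.
  which(!is_core & !near_core)
}

feature_values <- c(10, 10.5, 11, 11.5, 12, 50)
noise <- dbscan_noise_1d(feature_values, epsilon = 1.2, min_samples = 4)
```

In this example the values 10 and 12 are border points of the cluster formed around 10.5, 11 and 11.5, so only the isolated value 50 is reported as noise.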
detectBySvm(): Identifies anomalies in the data frame based on a one-class SVM.
Iterates over the whole data frame. Calls the object's public function
svm_numeric until all features are analyzed.
The hyperparameters are defined in the instance variables gamma and nu.
The results of the svm_numeric routine are added to the instance variable outliers.
Displays the progress if shiny is loaded.
pgu.outliers$detectBySvm(data_df = "tbl_df", progress = "Process")
data_df: Data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
svm_numeric(): Identifies anomalies in a single feature of a data frame based on a one-class SVM.
The hyperparameters are defined in the instance variables gamma and nu.
Displays the progress if shiny is loaded.
pgu.outliers$svm_numeric(data_df = "tbl_df", feature = "character")
data_df: Data frame to be analyzed. (tibble::tibble)
feature: Feature to be analyzed. (character)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
A data frame comprising the information about detected anomalies of the feature. (tibble::tibble)
detectByKnn(): Identifies anomalies in the data frame based on knnO.
Iterates over the whole data frame. Calls the object's public function
knn_numeric until all features are analyzed.
The hyperparameters are defined in the instance variables alpha and minSamples.
The results of the knn_numeric routine are added to the instance variable outliers.
Displays the progress if shiny is loaded.
pgu.outliers$detectByKnn(data_df = "tbl_df", progress = "Process")
data_df: Data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
knn_numeric(): Identifies anomalies in a single feature of a data frame based on knnO.
The hyperparameters are defined in the instance variables alpha and minSamples.
Displays the progress if shiny is loaded.
pgu.outliers$knn_numeric(data_df = "tbl_df", feature = "character")
data_df: Data frame to be analyzed. (tibble::tibble)
feature: Feature to be analyzed. (character)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
A data frame comprising the information about detected anomalies of the feature. (tibble::tibble)
setImputationSites(): Replaces the detected anomalies of a user-provided data frame with NA for further imputation routines.
pgu.outliers$setImputationSites(data_df = "tbl_df")
data_df: Data frame to be mutated. (tibble::tibble)
A data frame with anomalies replaced by NA. (tibble::tibble)
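Turning detected anomalies into imputation sites amounts to overwriting the flagged cells with NA. The base R sketch below illustrates the idea. The helper name set_imputation_sites and the layout of the outliers table (a column feature with the attribute name and a column index with the row number) are assumptions made for this example; the structure of the instance variable outliers in pguIMP is not described on this page.

```r
# Hypothetical helper, not part of pguIMP.
# outliers: data frame with the assumed columns `feature` (character)
# and `index` (row number of the flagged value).
set_imputation_sites <- function(data_df, outliers) {
  for (i in seq_len(nrow(outliers))) {
    # Overwrite the flagged cell with NA so that it is imputed later.
    data_df[[outliers$feature[i]]][outliers$index[i]] <- NA
  }
  data_df
}

df <- data.frame(a = c(1, 2, 100), b = c(5, -40, 6))
outliers <- data.frame(
  feature = c("a", "b"),
  index = c(3L, 2L),
  stringsAsFactors = FALSE
)
cleaned <- set_imputation_sites(df, outliers)
```

All other values are left untouched, so a subsequent imputation routine only has to fill the cells that are now NA.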
calcOutliersStatistics(): Calculates the statistics of the previously performed outlier detection analysis
and stores the results in the instance variable outliersStatistics.
pgu.outliers$calcOutliersStatistics(data_df = "tbl_df")
data_df: The data frame to be analyzed. (tibble::tibble)
outlierTable(): Creates a datatable in which substituted outliers are highlighted by a colored background.
pgu.outliers$outlierTable(data_df = "tbl_df")
data_df: The data frame to be analyzed. (tibble::tibble)
A colored datatable. (DT::datatable)
plotOutliersDistribution(): Displays the occurrence of outlier candidates per attribute as a bar plot.
pgu.outliers$plotOutliersDistribution()
A bar plot. (ggplot2::ggplot)
featureBarPlot(): Displays the distribution of an attribute's values as a histogram.
pgu.outliers$featureBarPlot(data_df = "tbl_df", feature = "character")
data_df: Data frame to be analyzed. (tibble::tibble)
feature: Attribute to be shown. (character)
A histogram. (ggplot2::ggplot)
featureBoxPlotWithSubset(): Displays the distribution of an attribute's values as a box plot.
pgu.outliers$featureBoxPlotWithSubset( data_df = "tbl_df", feature = "character" )
data_df: Data frame to be analyzed. (tibble::tibble)
feature: Attribute to be shown. (character)
A box plot. (ggplot2::ggplot)
featurePlot(): Displays the distribution of an attribute's values as a composition of a box plot and a histogram.
pgu.outliers$featurePlot(data_df = "tbl_df", feature = "character")
data_df: Data frame to be analyzed. (tibble::tibble)
feature: Attribute to be shown. (character)
A composite plot. (ggplot2::ggplot)
clone(): The objects of this class are cloneable with this method.
pgu.outliers$clone(deep = FALSE)
deep: Whether to make a deep clone.
Sebastian Malkusch, malkusch@med.uni-frankfurt.de