Analyses and substitutes imputation sites in a data set.
R6::R6Class object.
Analyses imputation sites in a data set. Replaces imputation sites by missing values and substitutes NAs using classical and ML-powered substitution algorithms. This object is used by the Shiny-based GUI and is not intended for use in individual R scripts!
imputationStatistics: Returns the instance variable imputationStatistics. (tibble::tibble)
imputationSites: Returns the instance variable imputationSites. (tibble::tibble)
one_hot_df: Returns the positions of missing values in one-hot encoding. (tibble::tibble)
imputationSiteDistribution: Returns the instance variable imputationSiteDistribution. (matrix)
imputationAgentAlphabet: Returns the instance variable imputationAgentAlphabet. (character)
imputationAgent: Returns the instance variable imputationAgent. (character)
setImputationAgent: Sets the instance variable imputationAgent. (character)
nNeighbors: Returns the instance variable nNeighbors. (integer)
setNNeighbors: Sets the instance variable nNeighbors. (integer)
flux_df: Returns the instance variable flux_df. (tibble::tibble)
outflux_thr: Returns the instance variable outflux_thr. (numeric)
setOutflux_thr: Sets the instance variable outflux_thr. (numeric)
pred_frac: Returns the instance variable pred_frac. (numeric)
setPred_frac: Sets the instance variable pred_frac. (numeric)
pred_mat: Returns the instance variable pred_mat. (matrix)
exclude_vec: Returns the instance variable exclude_vec. (character)
seed: Returns the instance variable seed. (numeric)
setSeed: Sets the instance variable seed. (numeric)
iterations: Returns the instance variable iterations. (numeric)
setIterations: Sets the instance variable iterations. (numeric)
amv: Returns the instance variable amv. (numeric)
success: Returns the instance variable success. (logical)
new(): Creates and returns a new pgu.imputation object.
pgu.imputation$new( seed = 42, iterations = 4, imputationAgent = "none", nNeighbors = 3, pred_frac = 1, outflux_thr = 0.5 )
seed: Initially sets the instance variable seed. Default is 42. (integer)
iterations: Initially sets the instance variable iterations. Default is 4. (integer)
imputationAgent: Initially sets the instance variable imputationAgent. Default is "none". Options are: "none", "median", "mean", "expValue", "monteCarlo", "knn", "pmm", "cart", "randomForest", "M5P". (string)
nNeighbors: Initially sets the instance variable nNeighbors. Default is 3. (integer)
pred_frac: Initially sets the instance variable pred_frac. Default is 1. (numeric)
outflux_thr: Initially sets the instance variable outflux_thr. Default is 0.5. (numeric)
A new pgu.imputation object.
(pguIMP::pgu.imputation)
finalize(): Clears the heap and indicates that the instance of pgu.imputation has been removed from the heap.
pgu.imputation$finalize()
print(): Prints the instance variables of a pgu.imputation object.
pgu.imputation$print()
string
gatherImputationSites(): Gathers imputation sites from pguIMP's missings and outliers classes.
pgu.imputation$gatherImputationSites( missings_df = "tbl_df", outliers_df = "tbl_df" )
missings_df: Data frame comprising information about the imputation sites of pguIMP's missings class. (tibble::tibble)
outliers_df: Data frame comprising information about the imputation sites of pguIMP's outliers class. (tibble::tibble)
gatherImputationSiteStatistics(): Gathers statistical information about imputation sites.
The information is stored in the class's instance variable imputationStatistics.
pgu.imputation$gatherImputationSiteStatistics(data_df = "tbl_df")
data_df: The data frame to be analyzed. (tibble::tibble)
gatherImputationSiteDistribution(): Gathers the distribution of imputation sites within the data frame. The information is stored in the class's instance variable imputationSiteDistribution.
pgu.imputation$gatherImputationSiteDistribution(data_df = "tbl_df")
data_df: The data frame to be analyzed. (tibble::tibble)
A data frame (tibble::tibble)
insertImputationSites(): Takes a data frame, replaces the imputation sites indicated by the instance variable imputationSites with NA, and returns the mutated data frame.
pgu.imputation$insertImputationSites(data_df = "tbl_df")
data_df: The data frame to be analyzed. (tibble::tibble)
A mutated version of data_df. (tibble::tibble)
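The masking step described for insertImputationSites() can be illustrated with a short base R sketch. This is not the pguIMP implementation: the helper insert_sites and the layout of sites_df (one row per site, with the columns row and feature) are assumptions made for this example, and a plain data.frame stands in for the tibble.

```r
# Hypothetical helper, not part of pguIMP: sets every listed cell to NA.
# Assumption: sites_df holds one row per imputation site, with the row index
# in column `row` and the attribute name in column `feature`.
insert_sites <- function(data_df, sites_df) {
  for (i in seq_len(nrow(sites_df))) {
    data_df[sites_df$row[i], sites_df$feature[i]] <- NA
  }
  data_df
}

data_df <- data.frame(a = c(1, 2, 300), b = c(4, 5, 6))
sites_df <- data.frame(row = 3L, feature = "a", stringsAsFactors = FALSE)

# The suspicious value 300 in column a becomes NA; all other cells are kept.
masked_df <- insert_sites(data_df, sites_df)
```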
one_hot(): Gathers information about missing values in one-hot format. The result is stored in the instance variable one_hot_df.
pgu.imputation$one_hot(data_df = "tbl_df")
data_df: The data frame to be analyzed. (tibble::tibble)
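A one-hot encoding of missing positions can be sketched in base R as follows. The helper name one_hot_missings is hypothetical, and the exact column layout of the package's one_hot_df is not documented here, so this only shows the general idea: 1 marks a missing cell, 0 an observed one.

```r
# Hypothetical helper, not part of pguIMP: encodes each cell as 1 (missing)
# or 0 (observed), keeping the original column names.
one_hot_missings <- function(data_df) {
  as.data.frame(lapply(data_df, function(x) as.integer(is.na(x))))
}

data_df <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))
one_hot_df <- one_hot_missings(data_df)
```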
analyzeImputationSites(): Takes a data frame and analyses the imputation sites.
pgu.imputation$analyzeImputationSites(data_df = "tbl_df")
data_df: The data frame to be analyzed. (tibble::tibble)
imputationSiteIdxByFeature(): Returns the positions of an attribute's imputation sites within a data frame.
pgu.imputation$imputationSiteIdxByFeature(featureName = "character")
featureName: The attribute's name. (character)
The positions of the imputation sites. (numeric)
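Looking up the positions for a single attribute amounts to filtering the stored sites by name. The sketch below assumes, purely for illustration, that the sites are kept in a data frame with the columns row and feature. The actual structure of imputationSites is not specified in this documentation, and site_idx_by_feature is a hypothetical helper.

```r
# Hypothetical helper, not part of pguIMP: returns the row positions recorded
# for one attribute. The columns `row` and `feature` are assumptions.
site_idx_by_feature <- function(sites_df, feature_name) {
  sites_df$row[sites_df$feature == feature_name]
}

sites_df <- data.frame(
  row = c(2L, 5L, 3L),
  feature = c("a", "a", "b"),
  stringsAsFactors = FALSE
)

idx_a <- site_idx_by_feature(sites_df, "a")
idx_none <- site_idx_by_feature(sites_df, "c")
```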
nanFeatureList(): Characterizes each row of the data frame as either complete or by the attribute that is missing within the row.
If several attributes are missing within a row, the row is characterized as multiple.
pgu.imputation$nanFeatureList(data_df = "tbl_df")
data_df: The data frame to be analyzed. (tibble::tibble)
Vector of row characteristics. (character)
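The row characterization described above can be sketched in base R. The helper nan_feature_list is hypothetical, and the labels "complete" and "multiple" are taken from the description; the exact strings returned by the package may differ.

```r
# Hypothetical helper, not part of pguIMP: labels each row as "complete",
# with the name of its single missing attribute, or as "multiple".
nan_feature_list <- function(data_df) {
  na_mat <- is.na(data_df)  # logical matrix, one column per attribute
  vapply(seq_len(nrow(na_mat)), function(i) {
    missing_features <- colnames(na_mat)[na_mat[i, ]]
    if (length(missing_features) == 0) {
      "complete"
    } else if (length(missing_features) == 1) {
      missing_features
    } else {
      "multiple"
    }
  }, character(1))
}

data_df <- data.frame(a = c(1, NA, NA), b = c(4, 5, NA))
row_labels <- nan_feature_list(data_df)
```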
average_number_of_predictors(): Calculates the average number of predictors for a given data frame and given minpuc and mincor values using the mice::quickpred routine.
pgu.imputation$average_number_of_predictors( data_df = "tbl_df", minpuc = 0, mincor = 0.1 )
data_df: The data frame to be analyzed. (tibble::tibble)
minpuc: Specifies the minimum threshold for the proportion of usable cases. (numeric)
mincor: Specifies the minimum threshold against which the absolute correlation in the data frame is compared. (numeric)
Average number of predictors. (numeric)
detectPredictors(): Identifies possible predictors for each feature. The analysis results are written to the instance variable pred_mat. Intermediate results are an influx/outflux data frame, which is written to the instance variable flux_df, and a list of features excluded from the search for possible predictors, which is written to the instance variable exclude_vec.
pgu.imputation$detectPredictors(data_df = "tbl_df")
data_df: The data frame to be analyzed. (tibble::tibble)
handleImputationSites(): Chooses a cleaning method based on the instance variable imputationAgent and handles the imputation sites in the data frame.
Returns a cleaned data set.
Displays the progress if shiny is loaded.
pgu.imputation$handleImputationSites(data_df = "tbl_df", progress = "Progress")
data_df: The data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned data frame. (tibble::tibble)
imputeByMedian(): Substitutes imputation sites by the median of the respective attribute. Returns the cleaned data frame. Displays the progress if shiny is loaded.
pgu.imputation$imputeByMedian(data_df = "tbl_df", progress = "Progress")
data_df: The data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned data frame. (tibble::tibble)
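Median substitution can be sketched in a few lines of base R. The helper impute_by_median is hypothetical and ignores progress reporting. It only illustrates the substitution rule named above, with NA marking the imputation sites.

```r
# Hypothetical helper, not part of pguIMP: replaces NA in every numeric
# column with the median of that column's observed values.
impute_by_median <- function(data_df) {
  for (feature in names(data_df)) {
    x <- data_df[[feature]]
    if (is.numeric(x) && anyNA(x)) {
      x[is.na(x)] <- stats::median(x, na.rm = TRUE)
      data_df[[feature]] <- x
    }
  }
  data_df
}

data_df <- data.frame(a = c(1, NA, 3, 10), b = c(2, 4, NA, 8))
cleaned_df <- impute_by_median(data_df)
```

Replacing stats::median with mean gives the corresponding sketch for substitution by the arithmetic mean.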
imputeByMean(): Substitutes imputation sites by the arithmetic mean of the respective attribute. Returns the cleaned data frame. Displays the progress if shiny is loaded.
pgu.imputation$imputeByMean(data_df = "tbl_df", progress = "Progress")
data_df: The data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned data frame. (tibble::tibble)
imputeByExpectationValue(): Substitutes imputation sites by the expectation value of the respective attribute. Returns the cleaned data frame. Displays the progress if shiny is loaded.
pgu.imputation$imputeByExpectationValue( data_df = "tbl_df", progress = "Progress" )
data_df: The data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned data frame. (tibble::tibble)
imputeByMC(): Substitutes imputation sites by values generated by a Monte Carlo simulation.
The procedure runs several times, as defined by the instance variable iterations.
The run with the best result is identified and used for substitution.
Returns the cleaned data frame.
Displays the progress if shiny is loaded.
pgu.imputation$imputeByMC(data_df = "tbl_df", progress = "Progress")
data_df: The data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned data frame. (tibble::tibble)
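The repeated-run scheme can be sketched for a single numeric attribute as follows. Several details are assumptions made for this example and are not confirmed by the documentation: the values are drawn from a normal distribution fitted to the observed data, and the "best" run is taken to be the one whose imputed data keep the overall mean closest to the observed mean. The helper impute_by_mc is hypothetical and needs at least two observed values.

```r
# Hypothetical helper, not part of pguIMP: Monte Carlo substitution for one
# numeric vector. Note that set.seed() changes the global RNG state.
impute_by_mc <- function(x, iterations = 4, seed = 42) {
  missing_idx <- which(is.na(x))
  if (length(missing_idx) == 0) {
    return(x)
  }
  observed <- x[-missing_idx]
  mu <- mean(observed)
  sigma <- stats::sd(observed)

  set.seed(seed)
  best_draw <- NULL
  best_score <- Inf
  for (i in seq_len(iterations)) {
    # Assumption: candidates follow a normal distribution N(mu, sigma).
    draw <- stats::rnorm(length(missing_idx), mean = mu, sd = sigma)
    # Assumed quality criterion: shift of the overall mean after imputation.
    score <- abs(mean(c(observed, draw)) - mu)
    if (score < best_score) {
      best_score <- score
      best_draw <- draw
    }
  }
  x[missing_idx] <- best_draw
  x
}

x <- c(1.2, NA, 0.8, 1.1, NA, 0.9)
imputed <- impute_by_mc(x, iterations = 4, seed = 42)
```

Fixing the seed makes repeated calls reproducible, which matches the role of the seed instance variable described in the constructor.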
imputeByKnn(): Substitutes imputation sites by predictions of a KNN analysis of the whole data frame. Returns the cleaned data frame. Displays the progress if shiny is loaded.
pgu.imputation$imputeByKnn(data_df = "tbl_df", progress = "Progress")
data_df: The data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned data frame. (tibble::tibble)
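To make the idea of nearest-neighbour substitution concrete, here is a minimal base R sketch for purely numeric data. It is not the routine used by pguIMP, whose KNN backend is not named in this documentation. The helper impute_by_knn is hypothetical: it measures the distance on the remaining attributes and averages the attribute over the n_neighbors closest rows that have an observed value.

```r
# Hypothetical helper, not part of pguIMP: naive KNN substitution for a
# numeric data frame. Distances use the root mean squared difference over
# the other attributes that both rows have observed.
impute_by_knn <- function(data_df, n_neighbors = 3) {
  mat <- as.matrix(data_df)
  cleaned <- mat
  for (j in seq_len(ncol(mat))) {
    missing_rows <- which(is.na(mat[, j]))
    donor_rows <- which(!is.na(mat[, j]))
    for (i in missing_rows) {
      # Squared differences between each donor row and row i, excluding column j.
      diffs <- sweep(mat[donor_rows, -j, drop = FALSE], 2, mat[i, -j])^2
      dists <- sqrt(rowMeans(diffs, na.rm = TRUE))
      k <- min(n_neighbors, length(donor_rows))
      nearest <- donor_rows[order(dists)[seq_len(k)]]
      # Substitute the average of the attribute over the nearest donors.
      cleaned[i, j] <- mean(mat[nearest, j])
    }
  }
  as.data.frame(cleaned)
}

data_df <- data.frame(a = c(1, 2, 3, 10), b = c(1.1, 2.1, NA, 10.2))
cleaned_df <- impute_by_knn(data_df, n_neighbors = 2)
```

In this example the third row is closest to rows 2 and 1 in attribute a, so its missing b value is replaced by the mean of 2.1 and 1.1.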
imputeByMice(): Substitutes imputation sites by values generated by different methods of the mice package.
The procedure runs several times, as defined by the instance variable iterations.
The run with the best result is identified and used for substitution.
Returns the cleaned data frame.
Displays the progress if shiny is loaded.
pgu.imputation$imputeByMice(data_df, progress = "Progress")
data_df: The data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned data frame. (tibble::tibble)
imputeByM5P(): Substitutes imputation sites by predictions of an M5P tree trained on the whole data frame. Returns the cleaned data frame. Displays the progress if shiny is loaded.
pgu.imputation$imputeByM5P(data_df = "tbl_df", progress = "Progress")
data_df: The data frame to be analyzed. (tibble::tibble)
progress: If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned data frame. (tibble::tibble)
imputationSiteHeatMap(): Displays the distribution of missing values in the form of a heatmap.
pgu.imputation$imputationSiteHeatMap()
A heatmap plot. (ggplot2::ggplot)
featureBarPlot(): Displays the distribution of an attribute's values as a histogram.
pgu.imputation$featureBarPlot(data_df = "tbl_df", feature = "character")
data_df: Data frame to be analyzed. (tibble::tibble)
feature: Attribute to be shown. (character)
A histogram. (ggplot2::ggplot)
featureBoxPlotWithSubset(): Displays the distribution of an attribute's values as a box plot.
pgu.imputation$featureBoxPlotWithSubset( data_df = "tbl_df", feature = "character" )
data_df: Data frame to be analyzed. (tibble::tibble)
feature: Attribute to be shown. (character)
A box plot. (ggplot2::ggplot)
featurePlot(): Displays the distribution of an attribute's values as a composition of a box plot and a histogram.
pgu.imputation$featurePlot(data_df = "tbl_df", feature = "character")
data_df: Data frame to be analyzed. (tibble::tibble)
feature: Attribute to be shown. (character)
A composite plot. (ggplot2::ggplot)
fluxPlot(): Displays an influx/outflux plot.
pgu.imputation$fluxPlot()
A composite plot. (ggplot2::ggplot)
clone(): The objects of this class are cloneable with this method.
pgu.imputation$clone(deep = FALSE)
deep: Whether to make a deep clone.
Sebastian Malkusch, malkusch@med.uni-frankfurt.de