Analyses and substitutes imputation sites in a data set.
R6::R6Class object.
Analyses imputation sites in a data set. Replaces imputation sites with missing values and substitutes NAs using classical and ML-powered substitution algorithms. This object is used by the Shiny-based GUI and is not intended for use in individual R scripts.
imputationStatistics
Returns the instance variable imputationStatistics. (tibble::tibble)
imputationSites
Returns the instance variable imputationSites. (tibble::tibble)
one_hot_df
Returns the positions of missing values in one-hot encoding. (tibble::tibble)
imputationSiteDistribution
Returns the instance variable imputationSiteDistribution. (matrix)
imputationAgentAlphabet
Returns the instance variable imputationAgentAlphabet. (character)
imputationAgent
Returns the instance variable imputationAgent. (character)
setImputationAgent
Sets the instance variable imputationAgent. (character)
nNeighbors
Returns the instance variable nNeighbors. (integer)
setNNeighbors
Sets the instance variable nNeighbors. (integer)
flux_df
Returns the instance variable flux_df. (tibble::tibble)
outflux_thr
Returns the instance variable outflux_thr. (numeric)
setOutflux_thr
Sets the instance variable outflux_thr. (numeric)
pred_frac
Returns the instance variable pred_frac. (numeric)
setPred_frac
Sets the instance variable pred_frac. (numeric)
pred_mat
Returns the instance variable pred_mat. (matrix)
exclude_vec
Returns the instance variable exclude_vec. (character)
seed
Returns the instance variable seed. (numeric)
setSeed
Sets the instance variable seed. (numeric)
iterations
Returns the instance variable iterations. (numeric)
setIterations
Sets the instance variable iterations. (numeric)
amv
Returns the instance variable amv. (numeric)
success
Returns the instance variable success. (logical)
new()
Creates and returns a new pgu.imputation object.
pgu.imputation$new( seed = 42, iterations = 4, imputationAgent = "none", nNeighbors = 3, pred_frac = 1, outflux_thr = 0.5 )
seed
Initially sets the instance variable seed. Default is 42. (integer)
iterations
Initially sets the instance variable iterations. Default is 4. (integer)
imputationAgent
Initially sets the instance variable imputationAgent. Default is "none". Options are: "none", "median", "mean", "expValue", "monteCarlo", "knn", "pmm", "cart", "randomForest", "M5P". (string)
nNeighbors
Initially sets the instance variable nNeighbors. (integer)
pred_frac
Initially sets the instance variable pred_frac. (numeric)
outflux_thr
Initially sets the instance variable outflux_thr. (numeric)
A new pgu.imputation object. (pguIMP::pgu.imputation)
finalize()
Clears the heap and indicates that the instance of pgu.imputation has been removed from the heap.
pgu.imputation$finalize()
print()
Prints the instance variables of a pgu.imputation object.
pgu.imputation$print()
string
gatherImputationSites()
Gathers imputation sites from pguIMP's missings and outliers classes.
pgu.imputation$gatherImputationSites( missings_df = "tbl_df", outliers_df = "tbl_df" )
missings_df
Dataframe comprising information about the imputation sites of pguIMP's missings class. (tibble::tibble)
outliers_df
Dataframe comprising information about the imputation sites of pguIMP's outliers class. (tibble::tibble)
gatherImputationSiteStatistics()
Gathers statistical information about imputation sites. The information is stored in the class's instance variable imputationStatistics.
pgu.imputation$gatherImputationSiteStatistics(data_df = "tbl_df")
data_df
The data frame to be analyzed. (tibble::tibble)
gatherImputationSiteDistribution()
Gathers the distribution of imputation sites within the data frame. The information is stored in the class's instance variable imputationSiteDistribution.
pgu.imputation$gatherImputationSiteDistribution(data_df = "tbl_df")
data_df
The data frame to be analyzed. (tibble::tibble)
A data frame (tibble::tibble)
insertImputationSites()
Takes a dataframe, replaces the imputation sites indicated by the instance variable imputationSites with NA, and returns the mutated dataframe.
pgu.imputation$insertImputationSites(data_df = "tbl_df")
data_df
The data frame to be analyzed. (tibble::tibble)
A mutated version of data_df. (tibble::tibble)
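The replacement step can be illustrated in base R. This is a minimal sketch and not the package's implementation: the helper `insert_sites()` and the layout of `sites_df` (one row per imputation site, with the columns `row` and `feature`) are assumptions made for illustration only.

```r
# Hypothetical helper: set every listed imputation site to NA.
# The layout of `sites_df` (columns `row` and `feature`) is an assumption.
insert_sites <- function(data_df, sites_df) {
  for (i in seq_len(nrow(sites_df))) {
    data_df[sites_df$row[i], sites_df$feature[i]] <- NA
  }
  data_df
}

data_df <- data.frame(a = c(1, 2, 300), b = c(4, 5, 6))
sites_df <- data.frame(row = c(3L, 1L), feature = c("a", "b"),
                       stringsAsFactors = FALSE)

# Entry (3, "a") and entry (1, "b") become NA; all other values are kept.
mutated_df <- insert_sites(data_df, sites_df)
```

The input data frame is not modified in place; the function returns a mutated copy, which mirrors the documented behaviour of returning a mutated version of data_df.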
one_hot()
Gathers statistical information about missing values in one-hot format. The result is stored in the instance variable one_hot_df.
pgu.imputation$one_hot(data_df = "tbl_df")
data_df
The data frame to be analyzed. (tibble::tibble)
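A one-hot encoding of missing positions can be sketched in base R as follows. The convention used here (1 for a missing entry, 0 for an observed one) and the resulting column layout are assumptions; the actual content of one_hot_df may differ.

```r
# Assumed convention: 1 marks a missing entry, 0 marks an observed entry.
data_df <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))

# Encode each column as an integer indicator of missingness.
one_hot_df <- as.data.frame(
  lapply(data_df, function(values) as.integer(is.na(values)))
)
```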
analyzeImputationSites()
Takes a dataframe and analyses the imputation sites.
pgu.imputation$analyzeImputationSites(data_df = "tbl_df")
data_df
The data frame to be analyzed. (tibble::tibble)
imputationSiteIdxByFeature()
Returns the position of an attribute's imputation sites within a data frame.
pgu.imputation$imputationSiteIdxByFeature(featureName = "character")
featureName
The attribute's name. (character)
The position of the imputation sites. (numeric)
nanFeatureList()
Characterizes each row of the data frame as either complete or indicates which attribute is missing within the row. If the entries of multiple attributes are missing in a row, the row is characterized by multiple.
pgu.imputation$nanFeatureList(data_df = "tbl_df")
data_df
The data frame to be analyzed. (tibble::tibble)
Vector of row characteristics. (character)
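The row characterization described above can be sketched in base R. The helper name `nan_feature_list()` and the exact label strings "complete" and "multiple" are assumptions derived from the description, not a copy of the package code.

```r
# Hypothetical helper: label each row by its pattern of missing values.
nan_feature_list <- function(data_df) {
  missing_mat <- is.na(as.matrix(data_df))
  vapply(seq_len(nrow(missing_mat)), function(i) {
    row_missing <- missing_mat[i, ]
    n_missing <- sum(row_missing)
    if (n_missing == 0) {
      "complete"                          # no missing entry in this row
    } else if (n_missing == 1) {
      colnames(missing_mat)[row_missing]  # name of the single missing attribute
    } else {
      "multiple"                          # more than one attribute is missing
    }
  }, character(1))
}

data_df <- data.frame(a = c(1, NA, NA, 4), b = c(1, 2, NA, NA))
row_labels <- nan_feature_list(data_df)
```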
average_number_of_predictors()
Calculates the average number of predictors for a given dataframe and minpuc and mincor variables using the mice::quickpred routine.
pgu.imputation$average_number_of_predictors( data_df = "tbl_df", minpuc = 0, mincor = 0.1 )
data_df
The dataframe to be analyzed (tibble::tibble)
minpuc
Specifies the minimum threshold for the proportion of usable cases. (numeric)
mincor
Specifies the minimum threshold against which the absolute correlation in the dataframe is compared. (numeric)
The average number of predictors. (numeric)
detectPredictors()
Identifies possible predictors for each feature. The analysis results are written to the instance variable pred_mat. Two intermediate results are also stored: an influx/outflux dataframe, written to the instance variable flux_df, and a list of features excluded from the search for possible predictors, written to the instance variable exclude_vec.
pgu.imputation$detectPredictors(data_df = "tbl_df")
data_df
The dataframe to be analyzed. (tibble::tibble)
handleImputationSites()
Chooses a cleaning method based on the instance variable imputationAgent and handles the imputation sites in the dataframe. Returns a cleaned data set. Displays the progress if shiny is loaded.
pgu.imputation$handleImputationSites(data_df = "tbl_df", progress = "Progress")
data_df
The data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored within this instance of the shiny Progress class. (shiny::Progress)
Cleaned dataframe. (tibble::tibble)
imputeByMedian()
Substitutes imputation sites with the median of the respective attribute. Returns the cleaned dataframe. Displays the progress if shiny is loaded.
pgu.imputation$imputeByMedian(data_df = "tbl_df", progress = "Progress")
data_df
The data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned dataframe. (tibble::tibble)
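Median substitution can be illustrated with a short base R sketch. The helper `impute_by_median()` is hypothetical and omits the progress reporting; it only assumes that each NA in a numeric column is replaced by the median of that column's observed values.

```r
# Hypothetical helper: replace NAs in numeric columns by the column median.
impute_by_median <- function(data_df) {
  for (feature in names(data_df)) {
    values <- data_df[[feature]]
    if (is.numeric(values) && anyNA(values)) {
      # The median is computed from the observed (non-missing) values only.
      values[is.na(values)] <- median(values, na.rm = TRUE)
      data_df[[feature]] <- values
    }
  }
  data_df
}

data_df <- data.frame(a = c(1, NA, 3, 10), b = c(2, 4, NA, NA))
cleaned_df <- impute_by_median(data_df)
```

Swapping `median()` for `mean()` in the same loop gives the corresponding sketch for substitution by the arithmetic mean.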
imputeByMean()
Substitutes imputation sites with the arithmetic mean of the respective attribute. Returns the cleaned dataframe. Displays the progress if shiny is loaded.
pgu.imputation$imputeByMean(data_df = "tbl_df", progress = "Progress")
data_df
The data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned dataframe. (tibble::tibble)
imputeByExpectationValue()
Substitutes imputation sites with the expectation value of the respective attribute. Returns the cleaned dataframe. Displays the progress if shiny is loaded.
pgu.imputation$imputeByExpectationValue( data_df = "tbl_df", progress = "Progress" )
data_df
The data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned dataframe. (tibble::tibble)
imputeByMC()
Substitutes imputation sites with values generated by a Monte Carlo simulation. The procedure runs several times, as defined by the instance variable iterations. The run with the best result is identified and used for substitution. Returns the cleaned dataframe. Displays the progress if shiny is loaded.
pgu.imputation$imputeByMC(data_df = "tbl_df", progress = "Progress")
data_df
The data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned dataframe. (tibble::tibble)
imputeByKnn()
Substitutes imputation sites with predictions of a KNN analysis of the whole dataframe. Returns the cleaned dataframe. Displays the progress if shiny is loaded.
pgu.imputation$imputeByKnn(data_df = "tbl_df", progress = "Progress")
data_df
The data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned dataframe. (tibble::tibble)
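The idea behind nearest-neighbour substitution can be sketched in base R. This is an illustration under stated assumptions and not the routine the package calls: the helper `impute_by_knn()` is hypothetical, it handles numeric columns only, uses unscaled Euclidean distance over the observed columns, takes donors from complete rows only, and averages the donors' values.

```r
# Hypothetical helper: fill NAs with the mean of the k nearest complete rows.
impute_by_knn <- function(data_df, n_neighbors = 3) {
  data_mat <- as.matrix(data_df)
  is_complete <- stats::complete.cases(data_mat)
  complete_mat <- data_mat[is_complete, , drop = FALSE]
  for (i in which(!is_complete)) {
    missing_cols <- is.na(data_mat[i, ])
    # Distance to every complete row, using only the columns observed in row i.
    diffs <- sweep(complete_mat[, !missing_cols, drop = FALSE], 2,
                   data_mat[i, !missing_cols])
    distances <- sqrt(rowSums(diffs^2))
    k <- min(n_neighbors, nrow(complete_mat))
    nearest <- order(distances)[seq_len(k)]
    # Replace each missing entry by the mean over the nearest donors.
    data_mat[i, missing_cols] <-
      colMeans(complete_mat[nearest, missing_cols, drop = FALSE])
  }
  as.data.frame(data_mat)
}

data_df <- data.frame(x = c(1, 2, 3, 10, 2), y = c(1, 2, 3, 10, NA))
cleaned_df <- impute_by_knn(data_df, n_neighbors = 3)
```

In this example the three rows closest to x = 2 are the first three, so the missing y is filled with the mean of their y values. The default of three neighbours mirrors the default of nNeighbors in the constructor.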
imputeByMice()
Substitutes imputation sites with values generated by different methods of the mice package. The procedure runs several times, as defined by the instance variable iterations. The run with the best result is identified and used for substitution. Returns the cleaned dataframe. Displays the progress if shiny is loaded.
pgu.imputation$imputeByMice(data_df, progress = "Progress")
data_df
The data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned dataframe. (tibble::tibble)
imputeByM5P()
Substitutes imputation sites with predictions of an M5P tree trained on the whole dataframe. Returns the cleaned dataframe. Displays the progress if shiny is loaded.
pgu.imputation$imputeByM5P(data_df = "tbl_df", progress = "Progress")
data_df
The data frame to be analyzed. (tibble::tibble)
progress
If shiny is loaded, the analysis' progress is stored in this instance of the shiny Progress class. (shiny::Progress)
Cleaned dataframe. (tibble::tibble)
imputationSiteHeatMap()
Displays the distribution of missing values in the form of a heatmap.
pgu.imputation$imputationSiteHeatMap()
A heatmap plot. (ggplot2::ggplot)
featureBarPlot()
Displays the distribution of an attribute's values as a histogram.
pgu.imputation$featureBarPlot(data_df = "tbl_df", feature = "character")
data_df
dataframe to be analyzed. (tibble::tibble)
feature
attribute to be shown. (character)
A histogram. (ggplot2::ggplot)
featureBoxPlotWithSubset()
Displays the distribution of an attribute's values as a box plot.
pgu.imputation$featureBoxPlotWithSubset( data_df = "tbl_df", feature = "character" )
data_df
dataframe to be analyzed. (tibble::tibble)
feature
attribute to be shown. (character)
A box plot. (ggplot2::ggplot)
featurePlot()
Displays the distribution of an attribute's values as a composition of a box plot and a histogram.
pgu.imputation$featurePlot(data_df = "tbl_df", feature = "character")
data_df
dataframe to be analyzed. (tibble::tibble)
feature
attribute to be shown. (character)
A composite plot. (ggplot2::ggplot)
fluxPlot()
Displays an influx/outflux plot.
pgu.imputation$fluxPlot()
A composite plot. (ggplot2::ggplot)
clone()
The objects of this class are cloneable with this method.
pgu.imputation$clone(deep = FALSE)
deep
Whether to make a deep clone.
Sebastian Malkusch, malkusch@med.uni-frankfurt.de