View source: R/impute_iterative.R
impute_iterative | R Documentation |
Iterative imputation of a data set
impute_iterative(
  ds,
  model_spec_parsnip = linear_reg(),
  model_fun_unsupervised = NULL,
  predict_fun_unsupervised = NULL,
  max_iter = 10,
  stop_fun = NULL,
  initial_imputation_fun = NULL,
  cols_used_for_imputation = "only_complete",
  cols_order = seq_len(ncol(ds)),
  rows_used_for_imputation = "only_complete",
  rows_order = seq_len(nrow(ds)),
  update_model = "every_iteration",
  update_ds_model = "every_iteration",
  stop_fun_args = NULL,
  M = is.na(ds),
  model_arg = NULL,
  warn_incomplete_imputation = TRUE,
  ...
)
ds |
The data set to be imputed. Must be a data frame with column names. |
model_spec_parsnip |
The model type used for supervised imputation (see
( |
model_fun_unsupervised |
An unsupervised model function (see
|
predict_fun_unsupervised |
A predict function for unsupervised
imputation (see |
max_iter |
Maximum number of iterations |
stop_fun |
A stopping function (see details below) or |
initial_imputation_fun |
This function will do the initial imputation of
the missing values. If |
cols_used_for_imputation |
Which columns should be used to impute other columns? Possible choices: "only_complete", "already_imputed", "all" |
cols_order |
Ordering of the columns for imputation. This can be a
vector with indices or an |
rows_used_for_imputation |
Which rows should be used to impute other rows? Possible choices: "only_complete", "partly_complete", "complete_in_k", "already_imputed", "all_except_i", "all" |
rows_order |
Ordering of the rows for imputation. This can be a vector
with indices or an |
update_model |
How often should the model for imputation be updated? |
update_ds_model |
How often should the data set for the inner model be updated? |
stop_fun_args |
Further arguments passed on to |
M |
Missing data indicator matrix |
model_arg |
Further arguments for |
warn_incomplete_imputation |
Should a warning be given, if the
returned data set still contains |
... |
Further arguments passed on to |
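As a small illustration of the defaults shown in the usage, the following base R sketch evaluates is.na(ds), seq_len(ncol(ds)) and seq_len(nrow(ds)) for a toy data frame. The data frame ds_toy is made up for this illustration, and the code does not call impute_iterative() itself.

```r
# Toy data frame (hypothetical, for illustration only)
ds_toy <- data.frame(X = c(1, NA, 3), Y = c(NA, 5, 6))

# Default for M: logical matrix that is TRUE where a value is missing
M <- is.na(ds_toy)

# Defaults for cols_order and rows_order: natural column and row order
cols_order <- seq_len(ncol(ds_toy))
rows_order <- seq_len(nrow(ds_toy))

print(M)
print(cols_order)
print(rows_order)
```

Passing a different index vector for cols_order or rows_order would change the order in which columns or rows are visited, as described in the argument table.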
This function imputes a data set iteratively. Internally, either impute_supervised() or impute_unsupervised() is used, depending on the values of model_spec_parsnip, model_fun_unsupervised and predict_fun_unsupervised. To use a supervised inner method, model_spec_parsnip must be specified and model_fun_unsupervised and predict_fun_unsupervised must both be NULL. For an unsupervised inner method, model_fun_unsupervised and predict_fun_unsupervised must both be specified and model_spec_parsnip must be NULL. Some arguments of this function are only meaningful for impute_supervised() or impute_unsupervised().
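The selection rule for the inner method can be sketched in base R as follows. The helper choose_inner_method() is hypothetical and not part of the package; it only mirrors the rule stated in the text. Raising an error for any other combination is an assumption of this sketch, not documented behaviour.

```r
# Hypothetical helper (not part of the package): reports which inner method
# the three arguments would select according to the documented rule.
choose_inner_method <- function(model_spec_parsnip = NULL,
                                model_fun_unsupervised = NULL,
                                predict_fun_unsupervised = NULL) {
  has_spec <- !is.null(model_spec_parsnip)
  has_model_fun <- !is.null(model_fun_unsupervised)
  has_predict_fun <- !is.null(predict_fun_unsupervised)

  if (has_spec && !has_model_fun && !has_predict_fun) {
    "supervised"
  } else if (!has_spec && has_model_fun && has_predict_fun) {
    "unsupervised"
  } else {
    # Assumption of this sketch: all other combinations are rejected
    stop("Specify either model_spec_parsnip or both unsupervised functions.")
  }
}

# Placeholder objects stand in for a real model specification and functions
choose_inner_method(model_spec_parsnip = "placeholder_spec")
choose_inner_method(
  model_fun_unsupervised = function(...) NULL,
  predict_fun_unsupervised = function(...) NULL
)
```

Because model_spec_parsnip defaults to linear_reg() in the usage, an unsupervised call would need model_spec_parsnip = NULL set explicitly.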
An imputed data set (or a return value of stop_fun).
The stop_fun should take the following arguments, in this order:
1. ds (the data set imputed in the current iteration)
2. ds_old (the data set imputed in the last iteration)
3. a list (with named elements M, nr_iterations, max_iter)
4. stop_fun_args
5. res_stop_fun (the return value of stop_fun from the last iteration; initial value for the first iteration: list(stop_iter = FALSE))
To allow another iteration, the stop_fun must return a list that contains the named element stop_iter = FALSE. The simple return value list(stop_iter = FALSE) allows the iteration to continue. However, the list can include further information, which is handed over to stop_fun in the next iteration. For example, the return value list(stop_iter = FALSE, last_eps = 0.3) would also lead to another iteration. If stop_fun does not return a list, or the list does not contain stop_iter = FALSE, the iteration is stopped and the return value of stop_fun is returned as the result of impute_iterative(). Therefore, this return value should normally include the imputed data set ds or ds_old.
An example of a stop_fun is stop_ds_difference().
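A custom stop function following this contract might look like the sketch below. The function stop_on_small_change(), the argument name info_list, the element eps in stop_fun_args and the element last_change are all assumptions made for this illustration. The function is called by hand here on made-up data to show both outcomes; it is not taken from the package.

```r
# Hypothetical stop function; argument order follows the documented contract
stop_on_small_change <- function(ds, ds_old, info_list, stop_fun_args,
                                 res_stop_fun) {
  M <- info_list$M
  # Largest absolute change among the originally missing entries
  change <- max(abs(as.matrix(ds)[M] - as.matrix(ds_old)[M]))

  if (change < stop_fun_args$eps) {
    # No element stop_iter = FALSE: the iteration stops and ds is the result
    return(ds)
  }
  # Continue iterating and hand the current change to the next call
  list(stop_iter = FALSE, last_change = change)
}

# Made-up data sets from two consecutive iterations
ds_old <- data.frame(X = c(1, 2.0, 3), Y = c(4.0, 5, 6))
ds_new <- data.frame(X = c(1, 2.4, 3), Y = c(4.1, 5, 6))
M <- matrix(c(FALSE, TRUE, FALSE, TRUE, FALSE, FALSE), nrow = 3)
info <- list(M = M, nr_iterations = 1, max_iter = 10)

# Change is above eps: a list with stop_iter = FALSE is returned
res_continue <- stop_on_small_change(
  ds_new, ds_old, info, list(eps = 0.1), list(stop_iter = FALSE)
)

# Change is below eps: the imputed data set itself is returned
res_stop <- stop_on_small_change(
  ds_new, ds_old, info, list(eps = 0.5), res_continue
)
```

Following the pattern used with stop_ds_difference() in the examples, such a function would presumably be passed as stop_fun = stop_on_small_change together with stop_fun_args = list(eps = 0.1).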
impute_supervised() and impute_unsupervised() as the workhorses for the imputation.
stop_ds_difference() as an example of a stop function.
set.seed(123)

# simple example
ds_mis <- missMethods::delete_MCAR(
  data.frame(X = rnorm(20), Y = rnorm(20)), 0.2, 1
)
impute_iterative(ds_mis, max_iter = 2)

# using pre-imputation
ds_mis <- missMethods::delete_MCAR(
  data.frame(X = rnorm(20), Y = rnorm(20)), 0.2
)
impute_iterative(
  ds_mis,
  max_iter = 2,
  initial_imputation_fun = missMethods::impute_mean
)

# example using stop_ds_difference() as stop_fun
ds_mis <- missMethods::delete_MCAR(
  data.frame(X = rnorm(20), Y = rnorm(20)), 0.2
)
ds_imp <- impute_iterative(
  ds_mis,
  initial_imputation_fun = missMethods::impute_mean,
  stop_fun = stop_ds_difference,
  stop_fun_args = list(eps = 0.5)
)
attr(ds_imp, "nr_iterations")