imputationcycle: Cycling


View source: R/hmi_cycle.R

Description

Performs one imputation cycle on the given data. The function cycles through the variables sequentially and, for each variable, imputes the values that are NA in the original data set. It determines the type of each variable and calls the suitable imputation function.
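The cycling idea can be illustrated with a minimal base-R sketch. It is not the hmi implementation: toy_cycle and its two imputation routines (normal draws for numeric variables, resampling for everything else) are hypothetical stand-ins for the type-specific functions the package calls.

```r
# Toy stand-in for one imputation cycle; NOT the hmi implementation.
toy_cycle <- function(data_before, NA_locator) {
  data_after <- data_before
  for (v in colnames(NA_locator)) {
    miss <- NA_locator[, v]
    if (!any(miss)) next  # nothing to impute in this variable
    observed <- data_after[[v]][!miss]
    if (is.numeric(observed)) {
      # Hypothetical "continuous" routine: draw from a normal distribution
      draws <- rnorm(sum(miss), mean = mean(observed), sd = sd(observed))
    } else {
      # Hypothetical "categorical" routine: resample the observed categories
      draws <- sample(as.character(observed), sum(miss), replace = TRUE)
    }
    # Only the positions flagged as missing are overwritten
    data_after[[v]][miss] <- draws
  }
  data_after
}

set.seed(1)
original_data <- data.frame(
  x = c(1.2, NA, 3.4, 4.1, 2.8),
  g = factor(c("a", "b", NA, "a", "b"))
)
NA_locator <- is.na(original_data)
imputed <- toy_cycle(original_data, NA_locator)
```

Observed values pass through unchanged; only the cells marked TRUE in NA_locator receive new values.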

Usage

imputationcycle(
  data_before,
  original_data,
  NA_locator,
  fe,
  interaction_names,
  list_of_types,
  nitt,
  burnin,
  thin,
  pvalue = 0.2,
  mn,
  k = Inf,
  spike = NULL,
  rounding_degrees = NULL,
  rounding_covariates
)

Arguments

data_before

The n x p data.frame with the variables to impute. It was prepared for imputation in the wrapper function. The preparation includes adding intercept variables or interactions, or joining small clusters.

original_data

The original data.frame the user passed to hmi.

NA_locator

An n x p matrix locating the missing values in the original dataset. The elements are TRUE if the original data are missing and FALSE if they are observed.
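A matrix of this shape can be built with base R's is.na. The sketch below uses hypothetical toy data and only shows the structure of such an object. How the wrapper function actually constructs NA_locator is not stated here.

```r
# Hypothetical toy data; not taken from the hmi package.
original_data <- data.frame(
  x = c(1.2, NA, 3.4, 4.1),
  g = factor(c("a", "b", NA, "a"))
)

# is.na() on a data.frame returns a logical matrix with one column per variable
NA_locator <- is.na(original_data)

dim(NA_locator)       # n rows, p columns
NA_locator[2, "x"]    # TRUE: x is missing in row 2
colSums(NA_locator)   # number of missing values per variable
```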

fe

A list with the decomposed elements of the model_formula.

interaction_names

A list with the names of the variables that have been generated as interaction variables.

list_of_types

A list where each element has the name of a variable in the data.frame. Each element has to contain a single character string denoting the type of the variable. See get_type for details about the variable types. With the function list_of_types_maker, the user can get the framework for this object. In most scenarios this should not be necessary. One example where it might be necessary is when only two observations of a continuous variable are left, because in this case get_type interprets the variable as binary. Specifying the types is never wrong.

nitt

An integer defining the number of MCMC iterations (see MCMCglmm).

burnin

A numeric value between 0 and 1 for the desired proportion of Gibbs samples that shall be regarded as burnin.

thin

An integer to set the thinning interval. If thin = 1, every iteration of the Gibbs sampling chain will be kept. For highly autocorrelated chains that are examined over only a few iterations (say fewer than 1000), geweke.diag might fail to detect convergence. In such cases it is essential to look at a chain free from autocorrelation.
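How nitt, burnin and thin interact can be worked through with plain arithmetic. The sketch below assumes that the burnin proportion is converted to a number of iterations by rounding burnin * nitt and that every thin-th iteration after the burn-in is kept. Both are assumptions for illustration, not a statement of what hmi or MCMCglmm do internally.

```r
nitt   <- 2000   # total MCMC iterations
burnin <- 0.25   # proportion of iterations treated as burn-in
thin   <- 10     # keep every 10th iteration after the burn-in

# Assumed conversion from a proportion to a number of iterations
burnin_iterations <- round(burnin * nitt)

# Indices of the iterations that would be kept under these assumptions
kept <- seq(from = burnin_iterations + 1, to = nitt, by = thin)

length(kept)  # number of draws left for inference and convergence checks
```

A larger thin reduces autocorrelation between kept draws but also leaves fewer of them, so nitt may need to grow accordingly.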

pvalue

A numeric value between 0 and 1 denoting the threshold that the p-value of a variable in the imputation model should not exceed. Variables whose p-values exceed it are excluded from the imputation model.
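The effect of such a threshold can be illustrated with an ordinary linear model. The sketch below is an assumption-laden illustration using lm on simulated data. The names dat, x1 and x2 are hypothetical, and the selection step hmi performs internally may differ.

```r
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)               # unrelated to y by construction
y  <- 2 * x1 + rnorm(n)
dat <- data.frame(y, x1, x2)

pvalue <- 0.2                # same default as in the usage above

fit   <- lm(y ~ x1 + x2, data = dat)
pvals <- summary(fit)$coefficients[-1, "Pr(>|t|)"]  # drop the intercept row

# Keep only covariates whose p-value does not exceed the threshold
kept <- names(pvals)[pvals <= pvalue]
reduced_formula <- reformulate(kept, response = "y")
```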

mn

An integer defining the minimum number of individuals per cluster.

k

An integer defining the allowed maximum of levels in a factor covariate.

spike

A numeric value saying which value in the semi-continuous data might be the spike, or a list with such values and names identical to the variables with spikes (see list_of_spikes_maker for details).

rounding_degrees

A numeric vector with the presumed rounding degrees, or a list with rounding degrees where each list element has the name of a rounded continuous variable. Such a list can be generated using list_of_rounding_degrees_maker(data).

rounding_covariates

A list for each rounded continuous variable with a character vector containing the covariate names from the original rounding formula. The transformation takes place in the wrapper function.

Value

A data.frame in which the values that are missing in the original dataset are imputed.


hmi documentation built on Oct. 23, 2020, 7:31 p.m.