hmi: hmi: Hierarchical Multilevel Imputation.
In hmi: Hierarchical Multiple Imputation

Description Usage Arguments Value References Examples

The user has to pass his data to the function. Optionally he passes his analysis model formula so that hmi runs the imputation model in line with his analysis model formula.
And of course he can specify some parameters for the imputation routine (like the number of imputations and iterations and the number of iterations within the Gibbs sampling).

hmi(
  data,
  model_formula,
  family,
  additional_variables,
  list_of_types,
  m = 5,
  maxit,
  nitt = 22000,
  burnin = 2000,
  pvalue = 1,
  mn = 1,
  k,
  spike,
  heap,
  rounding_degrees,
  rounding_formula = ~.,
  pool_with_mice = TRUE
)

`data`	A `data.frame` with all variables appearing in `model_formula`.
`model_formula`	A `formula` used for the analysis model. Currently the package is designed to handle formula used in `lm`, `glm`, `lmer` and `glmer`. The formula is also used for the default pooling.
`family`	To improve the default pooling, a family object supported by `glm` (resp. `glmer`) an be given. See `family` for details.
`additional_variables`	A character with names of variables (separated by "+", like "x8+x9") that should be included in the imputation model as fixed effects variables, but not in the analysis model. An alternative would be to include such variable names into the `model_formula` and run a reduced analysis model with `hmi_pool` or the functions provide by `mice`.
`list_of_types`	a list where each list element has the name of a variable in the data.frame. The elements have to contain a single character denoting the type of the variable. See `get_type` for details about the variable types. With the function `list_of_types_maker`, the user can get the framework for this object. In most scenarios this is should not be necessary. One example where it might be necessary is when only two observations of a continuous variable are left - because in this case `get_type` interpret is variable to be binary. Wrong is it in no case.
`m`	An integer defining the number of imputations that should be made.
`maxit`	An integer defining the number of times the imputation cycle (imputing x_1\|x_{-1} then x_2\|x_{-2}, ... and finally x_p\|x_{-p}) shall be repeated. The task of checking convergence is left to the user, by evaluating the chainMean and chainVar!
`nitt`	An integer defining number of MCMC iterations (see `MCMCglmm`).
`burnin`	An integer for the desired number of Gibbs samples that shall be regarded as burnin.
`pvalue`	A numeric between 0 and 1 denoting the threshold of p-values a variable in the imputation model should not exceed. If they do, they are excluded from the imputation model.
`mn`	An integer defining the minimum number of individuals per cluster.
`k`	An integer defining the allowed maximum of levels in a factor covariate.
`spike`	A numeric value saying which value in the semi-continuous data might be the spike. Or a list with with such values and names identical to the variables with spikes (see `list_of_spikes_maker` for details.) In versions earlier to 0.9.0 it was called `heap`.
`heap`	Use spike instead. `heap` is only included due to backwards compatibility and will be removed with version 1.0.0
`rounding_degrees`	A numeric vector with the presumed rounding degrees of rounded variables. Or a list with rounding degrees, where each list element has the name of a rounded continuous variable. Such a list can be generated using `list_of_rounding_degrees_maker(data)`. Note: it is presupposed that the rounding degrees include 1 meaning that there is a positive probability that e.g. 3500 was only rounded to the nearest integer (and not rounded to the nearest multiple of 100 or 500).
`rounding_formula`	A formula with the model formula for the latent rounding tendency G. Or a list with model formulas for G, where each list element has the name of a rounded continuous variable. Such a list can be generated using `list_of_rounding_formulas_maker(data)`
`pool_with_mice`	A Boolean indicating whether the user wants to pool the `m` data sets by mice using his `model_formula`. The default value is `FALSE` because this tampers the `mids` object as it adds an argument `pooling` not found in "normal" `mids` objects generated by `mice`.

The function returns a mids object. See mice for further information.

Matthias Speidel, Joerg Drechsler and Shahab Jolani (2020): "The R Package hmi: A Convenient Tool for Hierarchical Multiple Imputation and Beyond", Journal of Statistical Software, Vol. 95, No. 9, p. 1–48, http://dx.doi.org/10.18637/jss.v095.i09

## Not run: 
 data(Gcsemv, package = "hmi")

 model_formula <- written ~ 1 + gender + coursework + (1 + gender|school)

 set.seed(123)
 dat_imputed <- hmi(data = Gcsemv, model_formula = model_formula, m = 2, maxit = 2)
 #See ?hmi_pool for how to pool results.

## End(Not run)