miceExt: Extension Package to mice

This package extends the mice package by adding functionality for multivariate predictive mean matching on imputed data, as well as new functionality for predictive mean matching on factor variables.

The mice package, which was implemented and published by Stef van Buuren and Karin Groothuis-Oudshoorn in 2001 and has been further developed ever since, is one of the most extensive and most commonly used implementations of multiple imputation in R. Despite its many years of refinement, however, there are still some missing-data problems that mice does not handle very well, and this package addresses two of them.
First, mice does not provide any option to impute multiple columns at once. This can result in nonsensical imputations when there are causal relationships between the corresponding attributes, e.g. a 15-year-old person who has a driver's license.
Further, mice still struggles with imputing categorical data, as many internally used imputation methods either are not suited for this kind of data at all or do not necessarily converge to the optimal solution.
Overall, miceExt provides three functions, namely

mice.post.matching(),
mice.binarize(),
mice.factorize(),

of which the first post-processes the results of the mice() algorithm by performing multivariate predictive mean matching on a user-defined set of column tuples. The resulting imputations are always equal to already-observed values, which eliminates the chance of getting unrealistic output values.
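The idea of matching a whole tuple of columns at once can be illustrated with a small base R sketch. It is not the package's implementation: the toy data, the predicted means and the helper match_tuple() are all hypothetical, the distance is a plain unscaled Euclidean distance, and the single nearest donor is always chosen instead of drawing from several candidates.

```r
# Hypothetical donor rows in which both columns are observed
obs <- data.frame(age = c(15, 17, 30, 45), license = c(0, 0, 1, 1))

# Hypothetical predicted means from some imputation model for the donor rows ...
pred_obs <- cbind(age = c(16, 18, 29, 44), license = c(0.1, 0.2, 0.8, 0.9))
# ... and for two rows in which both columns are missing
pred_mis <- cbind(age = c(15.5, 40), license = c(0.6, 0.7))

# For every incomplete row, find the donor whose predicted tuple is closest
# and copy that donor's observed values for all columns of the tuple jointly.
match_tuple <- function(pred_mis, pred_obs, obs) {
  donor <- apply(pred_mis, 1, function(p) {
    d <- sqrt(colSums((t(pred_obs) - p)^2))  # distance to each donor row
    which.min(d)
  })
  obs[donor, , drop = FALSE]
}

imputed <- match_tuple(pred_mis, pred_obs, obs)
print(imputed)  # donor rows 1 and 4: (15, 0) and (45, 1)
```

Because each imputed tuple is copied from one observed row, a combination such as a 15-year-old with a license can only appear if it already exists in the observed data.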
The latter two functions tackle the second issue by extending the functionality of mice.post.matching(). The function mice.binarize() transforms the categorical attributes of a given data frame into a binary dummy representation, which yields an exclusively numerical data set that mice can handle well. Inconsistencies within the imputed dummy columns can then be resolved by mice.post.matching(). Finally, mice.factorize() transforms the imputed binary data back into the corresponding original categories, resulting in a proper imputation of the given categorical data.
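The round trip between categories and dummy columns can be sketched in base R as follows. This only illustrates the concept and does not call mice.binarize() or mice.factorize(); the toy factor and the hand-filled row standing in for an imputation step are assumptions made for the example.

```r
# Toy categorical column with one missing entry (row 3)
x <- factor(c("red", "blue", NA, "green", "red"))
lv <- levels(x)  # "blue" "green" "red"

# Binarize: one 0/1 dummy column per level; the missing row is NA in every column
dummies <- sapply(lv, function(l) as.numeric(x == l))

# Stand-in for the imputation and matching steps: assume row 3 received
# a consistent dummy pattern (exactly one 1) for the level "green"
dummies[3, ] <- c(0, 1, 0)

# Factorize: map each row back to the level whose dummy column equals 1
restored <- factor(lv[max.col(dummies, ties.method = "first")], levels = lv)
print(restored)
```

The back-transformation is only well defined when every row contains exactly one 1, which is why inconsistent dummy rows need to be repaired before this step.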

Tobias Schumacher, Philipp Gaffert, Stef van Buuren, Karin Groothuis-Oudshoorn

mice.post.matching, mice.binarize, mice.factorize, mice

miceExt documentation built on March 18, 2018, 1:18 p.m.