View source: R/mice.factorize.R
Description
This function acts as the counterpart to mice.binarize: it retransforms imputations of binarized data that mice has been run on and that have subsequently been post-processed via mice.post.matching. The post-processing is usually necessary because mice is very likely to impute multiple ones among the dummy columns belonging to a single factor entry. The resulting mice::mids object is not suited for further mice.mids() iterations or for use with plot, but works well as input to with().
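To make the idea of retransforming concrete, here is a minimal base R sketch of how a set of dummy columns can be collapsed back into a single factor. It is not the implementation used by mice.factorize; the column names, the level names and the data are hypothetical and chosen only for illustration.

```r
# Hypothetical dummy columns for a three-level factor 'gen'.
# Names and values are made up for illustration only.
lvls <- c("G1", "G2", "G3")
dummies <- data.frame(
  gen.G1 = c(1, 0, 0, 0),
  gen.G2 = c(0, 1, 0, 1),
  gen.G3 = c(0, 0, 1, 0)
)

# A row can only be mapped back to one level if exactly one dummy is set.
stopifnot(all(rowSums(dummies) == 1))

# For each row, pick the level whose dummy column equals one.
gen <- factor(lvls[max.col(as.matrix(dummies), ties.method = "first")],
              levels = lvls)
print(gen)
```

A row such as c(1, 1, 0) would fail the check above, which illustrates why imputations with multiple ones per dummy group have to be post-processed before they can be retransformed.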
Usage
mice.factorize(obj, par_list)
Arguments
obj |
par_list | List that has been returned in a previous call of
Value
A mice::mids object in which data and imputations have been retransformed from their respective binarized versions in the input obj. As this isn't a proper result of a mice iteration and many of the attributes of obj cannot be transformed well, only the slots data, nmis, where and imp, which are needed in with(), are not NULL. In particular, it would not work as input for mice.mids().
Author(s)
Tobias Schumacher, Philipp Gaffert
See Also
mice.binarize, mice.post.matching, mice
Examples
## Not run:
#------------------------------------------------------------------------------
# Example that illustrates the combined functionalities of mice.binarize(),
# mice.factorize() and mice.post.matching() on the data set 'boys_data', which
# contains the column blocks ('hgt','bmi') and ('hc','gen','phb') that have
# identical missing value patterns, and out of which the columns 'gen' and
# 'phb' are factors. We are going to impute both tuples blockwise, while
# binarizing the factor columns first. Note that we never need to specify any
# blocks or columns to binarize, as these are all determined automatically
#------------------------------------------------------------------------------
# By default, mice.binarize() expands all factor columns that contain NAs,
# so the columns 'gen' and 'phb' are automatically binarized
boys_bin <- mice.binarize(boys_data)
# Run mice on binarized data, note that we need to use boys_bin$data to grab
# the actual binarized data and that we use the output predictor matrix
# boys_bin$pred_matrix which is recommended for obtaining better imputation
# models
mids_boys <- mice(boys_bin$data, predictorMatrix = boys_bin$pred_matrix)
# It is very likely that mice imputed multiple ones among one set of dummy
# variables, so we need to post-process. As recommended, we also use the output
# weights from mice.binarize(), which yield a more balanced weighting on the
# column tuple ('hc','gen','phb') within the matching. As in previous examples,
# both tuples are automatically discovered and imputed on
post_boys <- mice.post.matching(mids_boys, weights = boys_bin$weights)
# Now we can safely retransform to the original data, with non-binarized
# imputations
res_boys <- mice.factorize(post_boys$midsobj, boys_bin$par_list)
# Analyze the distribution of imputed variables, e.g. of the column 'gen',
# using the mice version of with()
with(res_boys, table(gen))
#------------------------------------------------------------------------------
# Similar example to the previous one, which also works on 'boys_data' and
# illustrates some more advanced functionalities of all three functions in
# miceExt: This time we only want to post-process the column block
# ('gen','phb'), while weighting the first of these columns twice as much as the
# second. Within the matching, we want to avoid matrix computations by using the
# Euclidean distance to determine the donor pool, and we want to draw from three
# donors only.
#------------------------------------------------------------------------------
# Binarize first. We specify blocks in list format with a single block, so we
# can omit an enclosing list. Similarly, we also specify weights in list format.
# Both blocks and weights will be expanded and can be accessed from the output
# to use them in mice.post.matching() later
boys_bin <- mice.binarize(boys_data,
                          blocks = c("gen", "phb"),
                          weights = c(2, 1))
# Run mice on binarized data, again use the output predictor matrix from
# mice.binarize()
mids_boys <- mice(boys_bin$data, predictorMatrix = boys_bin$pred_matrix)
# Post-process the binarized columns. We use blocks and weights from the previous
# output, and set 'distmetric' and 'donors' as announced in the example
# description
post_boys <- mice.post.matching(mids_boys,
                                blocks = boys_bin$blocks,
                                weights = boys_bin$weights,
                                distmetric = "euclidian",
                                donors = 3L)
# Finally, we can retransform to the original format
res_boys <- mice.factorize(post_boys$midsobj, boys_bin$par_list)
## End(Not run)