Imputations are generated using nonparametric Bayesian joint models (specifically, the hierarchically coupled mixture model with local dependence described in Murray and Reiter (2015); see citation("MixedDataImpute") or http://arxiv.org/abs/1410.0438).
hcmm_impute(X, Y, kz, kx, ky, hyperpar = NULL, num.impute, num.burnin,
  num.skip, thin.trace = -1, status = 50)
X
A data frame of categorical variables (as factors)
Y
A matrix or data frame of continuous variables
kz
Number of top-level clusters
kx
Number of X-model clusters
ky
Number of Y-model clusters
hyperpar
A list of hyperparameter values (see
num.impute
Number of imputations
num.burnin
Number of MCMC burn-in iterations
num.skip
Number of MCMC iterations between saved imputations
thin.trace
If negative, only save the num.impute datasets. If positive,
save summaries of the model state at every
status
Interval at which to print status messages
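The interaction of num.burnin, num.skip and num.impute can be sketched as a simple iteration schedule. This is only an illustration under an assumption not stated in the documentation: that the first imputation is saved num.skip iterations after burn-in ends, and each later one num.skip iterations after the previous. The helper imputation_schedule is hypothetical and not part of MixedDataImpute; the sampler itself may differ by an offset.

```r
# Hypothetical helper (not part of MixedDataImpute).
# Assumption: imputations are saved at burn-in + num.skip, burn-in + 2 * num.skip, ...
imputation_schedule <- function(num.impute, num.burnin, num.skip) {
  stopifnot(num.impute >= 1, num.burnin >= 0, num.skip >= 1)
  num.burnin + num.skip * seq_len(num.impute)
}

sched <- imputation_schedule(num.impute = 5, num.burnin = 10000, num.skip = 1000)
sched       # iterations at which a completed dataset is assumed to be saved
max(sched)  # total MCMC iterations under this assumption
```

Under this assumption, total run time grows roughly with num.burnin + num.impute * num.skip, which may help when choosing settings for a first trial run.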
A list with three elements:
imputations
A list of length num.impute. Each element is an imputed dataset.
trace
MCMC output (currently the component sizes for the three mixture indices)
model
An interface to the C++ object containing the current state
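A few sanity checks on the imputations element might look like the sketch below. Because running the sampler is expensive, it uses a small hand-built stand-in list (fake_imputations, a hypothetical name) rather than real output, and it assumes each imputed dataset is a data frame with no remaining missing values and a common set of columns.

```r
# Stand-in for imp$imputations: a list of completed data frames.
# Hand-built for illustration; a real run would use the output of hcmm_impute.
set.seed(1)
fake_imputations <- lapply(1:3, function(i) {
  data.frame(
    y = rnorm(5),
    x = factor(sample(c("a", "b"), 5, replace = TRUE), levels = c("a", "b"))
  )
})

# TRUE for any completed dataset that still contains missing values
any_missing <- vapply(fake_imputations, function(dat) any(is.na(dat)), logical(1))

# TRUE for each dataset whose dimensions and column names match the first one
same_shape <- vapply(fake_imputations, function(dat) {
  identical(dim(dat), dim(fake_imputations[[1]])) &&
    identical(names(dat), names(fake_imputations[[1]]))
}, logical(1))

stopifnot(!any(any_missing), all(same_shape))
```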
## Not run:
library(MixedDataImpute)
library(mice) # For the functions implementing combining rules
data(sipp08)
set.seed(1)
n = 1000
s = sample(1:nrow(sipp08), n)
Y = sipp08[s,1:2]
Y[,1] = log(Y[,1]+1)
X = sipp08[s,-c(1:2,9)] # Also removes occ code, which has ~23 levels
# MCAR with probability 0.2, for illustration purposes (not matching the paper)
Y[runif(n)<0.2,1] = NA
Y[runif(n)<0.2,2] = NA
for(j in 1:ncol(X)) X[runif(n)<0.2,j] = NA
kz = 15
ky = 60
kx = 90
num.impute = 5
num.burnin = 10000
num.skip = 1000
thin.trace = 10
imp = hcmm_impute(X, Y, kz=kz, kx=kx, ky=ky,
num.impute=num.impute, num.burnin=num.burnin,
num.skip=num.skip, thin.trace=thin.trace)
# Example of getting MI estimates for a regression, using the
# pooling functions in mice
form = total_earnings~age+I(age^2) + sex*I(own_kid!=0)
fits = lapply(imp$imputations, function(dat) lm(form, data=dat))
pooled_ests = pool(as.mira(fits))
summary(pooled_ests)
# original, complete data estimates for comparison
# (same log(x + 1) transform as applied to Y[,1] above)
comdat = sipp08[s,]
comdat[,1] = log(comdat[,1]+1)
summary(lm(form, data=comdat))
# true population values for comparison
pop = sipp08
pop[,1] = log(pop[,1]+1)
summary(lm(form, data=pop))
## End(Not run)
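The combining rules applied by the pooling functions in the example can be sketched in base R for a single scalar estimand. This is a minimal illustration of Rubin's rules, not the code used by mice; the function name rubin_pool and the input numbers are made up, and mice may report different degrees of freedom because it can apply a small-sample adjustment.

```r
# Rubin's rules for one scalar estimand (illustrative sketch).
# q: point estimates from the m completed datasets
# u: their squared standard errors (within-imputation variances)
rubin_pool <- function(q, u) {
  m <- length(q)
  stopifnot(m >= 2, length(u) == m)
  qbar <- mean(q)                    # pooled point estimate
  ubar <- mean(u)                    # average within-imputation variance
  b <- var(q)                        # between-imputation variance
  total <- ubar + (1 + 1 / m) * b    # total variance
  # Classic Rubin (1987) degrees of freedom (Inf when b is zero)
  df <- (m - 1) * (1 + ubar / ((1 + 1 / m) * b))^2
  c(estimate = qbar, se = sqrt(total), df = df)
}

# Made-up numbers for illustration only
q <- c(1.0, 1.2, 0.9, 1.1, 0.8)
u <- rep(0.04, 5)
pooled <- rubin_pool(q, u)
pooled
```

With the fits list from the example, q and u for one coefficient could be extracted with sapply(fits, function(f) coef(f)["age"]) and sapply(fits, function(f) vcov(f)["age", "age"]).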