gcImp: Gaussian Copula Imputations


Description

gcImp generates multiple imputations using Peter Hoff's sbgcop package. Remember to remove any ID variables from the data matrix input! After generating the imputations, use gc.as.mids to convert them into the "mids" format used by the mice package. All subsequent analysis can then be carried out with mice.
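As a minimal sketch of the preparation step, the snippet below drops an identifier column before building the data matrix. The data frame and the column name "id" are hypothetical and only serve as an illustration; adapt them to your own data.

```r
# Hypothetical data frame; the column name "id" is an assumption for illustration.
df <- data.frame(id = 1:5,
                 x  = c(0.2, NA, 1.1, -0.4, 0.7),
                 y  = c(NA, 1.5, 0.3, 0.9, NA))

# Drop the identifier column and convert the remaining columns to a matrix.
dt <- as.matrix(df[, setdiff(names(df), "id"), drop = FALSE])

# dt now holds only the variables to be imputed and could be passed to gcImp(dt).
```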

Usage

gcImp(dt, m = 20, burn = 300, nsamp = 1000, ...)

Arguments

dt

A data matrix with missing values. Remember to remove any ID variables!

m

A positive integer indicating the number of imputations. Default value is 20. Note that m must be smaller than (nsamp - burn); otherwise there will not be enough samples to generate the imputations.

burn

A positive integer indicating the length of the burn-in. Default value is 300.

nsamp

A positive integer indicating the number of samples from the MCMC chain. Default value is 1000.

...

Other arguments to pass to sbgcop.mcmc.
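The constraint between m, burn and nsamp can be checked before starting a long MCMC run. The helper below, check_gcimp_args, is a hypothetical function written for this illustration; it is not part of the gcImp package, and the package may perform its own checks differently.

```r
# Hypothetical helper (not part of gcImp): validate the sampler settings.
check_gcimp_args <- function(m = 20, burn = 300, nsamp = 1000) {
  is_pos_int <- function(x) {
    is.numeric(x) && length(x) == 1 && x > 0 && x == round(x)
  }
  stopifnot(is_pos_int(m), is_pos_int(burn), is_pos_int(nsamp))
  # The documentation requires m < nsamp - burn.
  if (m >= nsamp - burn) {
    stop("m must be smaller than nsamp - burn")
  }
  invisible(TRUE)
}

# The defaults satisfy the constraint.
check_gcimp_args()

# Too many imputations for the retained samples: the check fails.
ok <- tryCatch(check_gcimp_args(m = 800), error = function(e) FALSE)
```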

Value

An object of class 'gcImp' containing the following components:

dt

The original data set.

m

The number of imputations.

resp

The response indicators: a binary matrix of the same dimensions as dt taking a value 1 if the corresponding outcome is observed and 0 if it is missing.

nmis

The number of missing values per variable.

imp

A list of length "m" containing the imputed values.

method

The method used to generate the imputations: "Gaussian Copula."

sbgcop.out

The output from sbgcop.mcmc.
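To make the resp and nmis components concrete, the sketch below derives objects matching their descriptions from a small matrix in base R. It follows the definitions given above and is not necessarily how gcImp computes them internally; the matrix values are made up for illustration.

```r
# Small illustrative matrix with missing values (values are made up).
dt <- cbind(V1 = c(0.5, NA, 1.2, -0.3),
            V2 = c(NA, NA, 0.8, 1.1))

# Response indicators: 1 where the value is observed, 0 where it is missing.
resp <- (!is.na(dt)) * 1

# Number of missing values per variable.
nmis <- colSums(is.na(dt))
```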

Examples

N <- 200
rho <- matrix(0.3, 2, 2)
diag(rho) <- 1
# Compute the Choleski decomposition of rho. 
rho.chol <- chol(rho)
# Generate correlated bivariate normal samples
samples.mvn <- matrix(rnorm(N * 2), ncol = 2) %*% rho.chol
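# Delete some of the values: each entry is missing with probability p
p <- rep(0.2, N)
# Alternative: let the missingness probability depend on the first column
# p <- 1/(1 + exp(0.2 - samples.mvn[, 1]))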
R <- sapply(1:N, function(jj) sample(c(NA, 1), size = 1,
                                     prob = c(p[jj], 1 - p[jj])))
# Generate the observed data
samples.mvn[, 2] <- samples.mvn[, 2] * R
out <- gcImp(samples.mvn)
print(out)
# Check the trace plot of the latent correlation 
plot(out$sbgcop.out$C.psamp[1,2,], type = "l")
# Check the trace plot of the mean of the imputed data
plot(colMeans(out$sbgcop.out$Y.imput[,2,]), type = "l")
## Not run: 
# Further checks can be performed using the mcmcplots package
library(mcmcplots)
mcmcplots::mcmcplot(colMeans(out$sbgcop.out$Y.imput[,2,]))
mcmcplots::mcmcplot(out$sbgcop.out$C.psamp[1,2,])

## End(Not run) 
# Convert to class "mids" and use mice for analysis
imp <- gc.as.mids(out)
# Stack the imputations and run a linear regression
stacked <- mice::complete(imp, "long")
fit <- lm(V1 ~ V2, data = stacked)
coef(fit)
# Fit separate regressions and combine the output using Rubin's rules
fit <- with(imp, lm(V1 ~ V2))
est <- mice::pool(fit)
summary(est)
# For versions of mice > 2.6 we can also plot using the lattice package
## Not run: 
lattice::densityplot(imp)
lattice::bwplot(imp)
## End(Not run)
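The pooling step above relies on Rubin's rules. As a rough base R sketch of the scalar case, the snippet below combines per-imputation estimates and their squared standard errors. The numbers are hypothetical and were not produced by gcImp; mice::pool additionally computes degrees of freedom and related diagnostics that are omitted here.

```r
# Hypothetical per-imputation results for one coefficient (m = 3 imputations).
q_hat <- c(0.30, 0.34, 0.32)     # point estimates
u_hat <- c(0.010, 0.012, 0.011)  # squared standard errors

m     <- length(q_hat)
q_bar <- mean(q_hat)             # pooled point estimate
w_bar <- mean(u_hat)             # within-imputation variance
b     <- var(q_hat)              # between-imputation variance

# Total variance under Rubin's rules.
t_var <- w_bar + (1 + 1 / m) * b

c(estimate = q_bar, se = sqrt(t_var))
```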

bojinov/gcImp documentation built on May 29, 2019, 10:35 a.m.