knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
options(scipen = 100)

mice.mcerror

Codecov test coverage R-CMD-check Lifecycle: experimental

The {mice.mcerror} package is an add-on package to {mice} that can be used to calculate Monte Carlo error estimates for statistics computed using multiply imputed data.

Note that the Monte Carlo error estimate of an MI statistic is defined as the standard error of the mean of the pseudo-values for that statistic, computed by omitting one imputation at a time (i.e., using the jackknife).

Warning: this package is still experimental, so please test it out and find where it breaks!

Installation

You can install the development version of {mice.mcerror} from GitHub with:

# install.packages("devtools")
devtools::install_github("ellessenne/mice.mcerror")

Workflow

We replicate the following example from Stata:

. use http://www.stata-press.com/data/r14/mheart1s20

. mi estimate, dots mcerror: logit attack smokes age bmi hsgrad female

Imputations (20):
  .........10.........20 done

Multiple-imputation estimates                   Imputations       =         20
Logistic regression                             Number of obs     =        154
                                                Average RVI       =     0.0312
                                                Largest FMI       =     0.1355
DF adjustment:   Large sample                   DF:     min       =   1,060.38
                                                        avg       = 223,362.56
                                                        max       = 493,335.88
Model F test:       Equal FMI                   F(   5,71379.3)   =       3.59
Within VCE type:          OIM                   Prob > F          =     0.0030

------------------------------------------------------------------------------
      attack |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      smokes |   1.198595   .3578195     3.35   0.001     .4972789    1.899911
             |   .0068541   .0008562     0.01   0.000     .0056572    .0082212
             |
         age |   .0360159   .0154399     2.33   0.020     .0057541    .0662776
             |   .0002654   .0000351     0.01   0.001     .0002319    .0003108
             |
         bmi |   .1039416   .0476136     2.18   0.029      .010514    .1973692
             |   .0038014   .0008904     0.09   0.006     .0039928    .0044049
             |
      hsgrad |   .1578992   .4049257     0.39   0.697    -.6357464    .9515449
             |   .0091517   .0010209     0.02   0.016     .0086215    .0100602
             |
      female |  -.1067433   .4164735    -0.26   0.798    -.9230191    .7095326
             |   .0077566   .0009279     0.02   0.015      .006985    .0088408
             |
       _cons |  -5.478143   1.685075    -3.25   0.001    -8.782394   -2.173892
             |   .1079841   .0248274     0.07   0.000     .1310618    .1050817
------------------------------------------------------------------------------
Note: Values displayed beneath estimates are Monte Carlo error estimates.

Using the {mice.mcerror} it is possible to replicate this. First, let's fit all models as you would do using {mice}:

library(mice)
library(mice.mcerror)

data("mheart1s20.mice")
fit <- with(
  data = mheart1s20.mice,
  expr = glm(
    formula = attack ~ smokes + age + bmi + hsgrad + female,
    family = binomial(link = "logit")
  )
)

The pooled estimates can be obtained by using the summary() and pool() functions:

summary(pool(fit), conf.int = TRUE)

Analogously, Monte Carlo errors can be computed using the summary() and mcerror() functions:

summary(mcerror(fit, conf.int = TRUE))

Please note that with mcerror() we need to use the argument conf.int = TRUE in the inner function, as we need to compute confidence intervals for each jackknife replicate.

Calling just pool() and mcerror() returns a larger set of imputation statistics:

pool(fit)
mcerror(fit, conf.int = TRUE)

Compare this with results from Stata (displayed above) and see that they should be the same!

Additional statistics can be obtained in Stata as follows:

. mi estimate, vartable mcerror nocitable

Multiple-imputation estimates                   Imputations       =         20
Logistic regression

Variance information
------------------------------------------------------------------------------
             |        Imputation variance                             Relative
             |    Within   Between     Total       RVI       FMI    efficiency
-------------+----------------------------------------------------------------
      smokes |   .127048    .00094   .128035   .007765   .007711       .999615
             |   .000559   .000211   .000613   .001744    .00172        .00009
             |                                                  
         age |   .000237   1.4e-06   .000238   .006245    .00621        .99969
             |   8.6e-07   4.6e-07   1.1e-06   .002054   .002033       .000107
             |                                                  
         bmi |   .001964   .000289   .002267   .154545   .135487       .993271
             |   .000026   .000077   .000085    .04134   .031986        .00166
             |                                                  
      hsgrad |   .162206   .001675   .163965   .010843   .010739       .999463
             |   .000521   .000552   .000827   .003579   .003516       .000185
             |                                                  
      female |   .172187   .001203    .17345   .007338    .00729       .999636
             |   .000614   .000297   .000773   .001811   .001788       .000094
             |                                                  
       _cons |    2.5946   .233211   2.83948   .094377   .086953       .995671
             |   .029651   .070081   .083436   .028332   .024216       .001263
------------------------------------------------------------------------------
Note: Values displayed beneath estimates are Monte Carlo error estimates.


ellessenne/mice.mcerror documentation built on May 5, 2022, 12:34 a.m.