pool.mi: Pooling Multiple Imputation Results
In miWQS: Multiple Imputation Using Weighted Quantile Sum Regression

Combines multiple parameter estimates (as used in MI) across the K imputed datasets using the formulas of Rubin (1987, 1996). It calculates a pooled mean, standard error, missing-data statistics, confidence intervals, and p-values.

pool.mi(
  to.pool,
  n = 999999,
  method = c("smallsample", "rubin"),
  alpha = 0.05,
  prt = TRUE,
  verbose = FALSE
)

`to.pool`	An array of dimension p x 2 x K, where p is the number of parameters to be pooled, 2 refers to the estimate (mean) and standard deviation of each parameter, and K is the number of imputation draws. The rownames of to.pool are kept in the results.
`n`	A number providing the sample size, which is used in calculating the degrees of freedom. If nothing is specified, a large sample is assumed. Has no effect if `K` = 1.
`method`	A string to indicate the method to calculate the degrees of freedom, df.t. If method = "smallsample" (the default) then the Barnard-Rubin adjustment for small degrees of freedom is used. Otherwise, the method from Rubin (1987) is used.
`alpha`	Type I error used to form the confidence interval. Default: 0.05.
`prt`	Logical; whether to print the standard output. If TRUE, selected parts of a pool.mi object are printed to the screen in a readable format.
`verbose`	Logical; if TRUE, prints more information. Useful to check for any errors in the code. Default: FALSE.

Stage 3 of Multiple Imputation: We assume that each complete-data estimate is normally distributed. Given incomplete data, however, we assume a t-distribution, which forms the basis for confidence intervals and hypothesis tests.

The input is an array with p rows, one for each parameter to be combined. An estimate and its within-imputation standard error form the two columns of the array; these can easily be taken from the first two columns of the coefficients element of the summary of a glm/lm object. The last dimension is the number of imputations, K. See dataset wqs.pool.test as an example.
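As a minimal sketch of how such an array might be assembled, the code below fits the same lm model to K stand-in datasets and stacks the first two columns of each coefficient table. The simulated data, the model formula, and the object names (imputed, fits, coef.list) are hypothetical and only illustrate the layout; they are not taken from the package.

```r
# Simulated stand-ins for K imputed datasets (hypothetical data, for illustration only).
set.seed(1)
K <- 3
imputed <- lapply(seq_len(K), function(k) {
  x <- rnorm(50)
  data.frame(x = x, y = 1 + 2 * x + rnorm(50))
})

# Fit the same model to each dataset.
fits <- lapply(imputed, function(d) lm(y ~ x, data = d))

# Keep the first two columns (estimate and standard error) of each coefficient table.
coef.list <- lapply(fits, function(f) coef(summary(f))[, 1:2])

# Stack the K matrices (each p x 2) into a p x 2 x K array; rownames are preserved.
p <- nrow(coef.list[[1]])
to.pool <- array(
  unlist(coef.list),
  dim = c(p, 2, K),
  dimnames = list(rownames(coef.list[[1]]), c("Estimate", "Std.Error"), NULL)
)
dim(to.pool)
```

An array built this way has the shape described above and could then be passed on, for example as pool.mi(to.pool, n = 50).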

Uses Rubin's rules to calculate the statistics of an imputed dataset including: the pooled mean, total standard error, a relative increase in variance, the fraction of missing information, a (1-alpha)% confidence interval, and a p-value.
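For reference, Rubin's rules for a single parameter can be written as follows. The notation is an assumption made here for illustration: \hat{Q}_k is the estimate and U_k the squared standard error from imputation k, and \nu is the large-sample degrees of freedom of Rubin (1987), without the Barnard-Rubin adjustment.

```latex
\begin{aligned}
\bar{Q} &= \frac{1}{K}\sum_{k=1}^{K}\hat{Q}_k, &
\bar{U} &= \frac{1}{K}\sum_{k=1}^{K}U_k, \\
B &= \frac{1}{K-1}\sum_{k=1}^{K}\bigl(\hat{Q}_k-\bar{Q}\bigr)^{2}, &
T &= \bar{U} + \left(1+\frac{1}{K}\right)B, \\
r &= \frac{(1+1/K)\,B}{\bar{U}}, &
\lambda &= \frac{(1+1/K)\,B}{T}, \\
\nu &= (K-1)\left(1+\frac{1}{r}\right)^{2}, &
\text{CI} &= \bar{Q} \pm t_{\nu,\,1-\alpha/2}\,\sqrt{T}.
\end{aligned}
```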

A data-frame is returned with the following columns:

pooled.mean: The pooled univariate estimate, Qbar, formula (3.1.2) Rubin (1987).
pooled.total.se: The total standard error of the pooled estimate, formula (3.1.5) Rubin (1987).
pooled.total.var: The total variance of the pooled estimate, formula (3.1.5) Rubin (1987).
se.within: The within-imputation standard error, i.e. the square root of the mean of the variances (the pooled within-imputation variance), formula (3.1.3) Rubin (1987).
se.between: The between-imputation standard error, square root of formula (3.1.4) Rubin (1987).
relative.inc.var(r): The relative increase in variance due to nonresponse, formula (3.1.7) Rubin (1987).
proportion.var.missing(lambda): The proportion of variation due to nonresponse, formula (2.24) Van Buuren (2012).
frac.miss.info: The fraction missing information due to nonresponse, formula (3.1.10) Rubin (1987).
df.t: The degrees of freedom for the reference t-distribution, formula (3.1.6) Rubin (1987), or the method of Barnard-Rubin (1999) if method = "smallsample" (the default).
CI: The (1-alpha)% confidence interval (CI) for each pooled estimate.
p.value: The p-value used to test significance.
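The quantities listed above can be reproduced by hand for one parameter. The sketch below uses base R only; the estimates and standard errors are made-up numbers, and the complete-data degrees of freedom df.com = n - 1 is an assumption that may differ from what pool.mi uses internally, so the results need not match the function's output exactly.

```r
# Hypothetical estimates and standard errors of one parameter from K = 5 imputations.
est <- c(0.52, 0.47, 0.55, 0.50, 0.49)
se  <- c(0.10, 0.11, 0.10, 0.12, 0.11)
K <- length(est)
n <- 1000
alpha <- 0.05

pooled.mean <- mean(est)                                # Qbar
var.within  <- mean(se^2)                               # Ubar
var.between <- var(est)                                 # B
total.var   <- var.within + (1 + 1 / K) * var.between   # T
r      <- (1 + 1 / K) * var.between / var.within        # relative increase in variance
lambda <- (1 + 1 / K) * var.between / total.var         # proportion of variance due to nonresponse

# Degrees of freedom from Rubin (1987).
df.rubin <- (K - 1) * (1 + 1 / r)^2

# Barnard-Rubin (1999) small-sample adjustment; df.com = n - 1 is an assumption.
df.com   <- n - 1
df.obs   <- (df.com + 1) / (df.com + 3) * df.com * (1 - lambda)
df.small <- df.rubin * df.obs / (df.rubin + df.obs)

# Fraction of missing information, using the adjusted degrees of freedom.
frac.miss.info <- (r + 2 / (df.small + 3)) / (r + 1)

# (1 - alpha) confidence interval and two-sided p-value from the reference t-distribution.
t.crit  <- qt(1 - alpha / 2, df = df.small)
CI      <- pooled.mean + c(-1, 1) * t.crit * sqrt(total.var)
p.value <- 2 * pt(-abs(pooled.mean / sqrt(total.var)), df = df.small)

data.frame(pooled.mean, pooled.total.se = sqrt(total.var), r, lambda,
           frac.miss.info, df.t = df.small, CI.1 = CI[1], CI.2 = CI[2], p.value)
```

The adjusted degrees of freedom are always smaller than Rubin's original value, which widens the confidence interval when the sample is small.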

Adapted from pool.scalar (version R 3.4) in the mice package to handle multiple parameters at once in an array and combine them. Similar to mi.inference in the norm package, which lacks the small-sample adjustment.

Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: Wiley.

Rubin, D. B. (1996). Multiple Imputation After 18+ Years. Journal of the American Statistical Association, 91(434), 473–489. https://doi.org/10.2307/2291635

Barnard, J., & Rubin, D. B. (1999). Small-Sample Degrees of Freedom with Multiple Imputation. Biometrika, 86(4), 948–955.

#### Example 1: Sample Dataset 87, using 10% BDL Scenario
data(wqs.pool.test)
# Example of the `to.pool` argument
head(wqs.pool.test)

# Pool WQS results and sort the 14 weights in decreasing order;
# rows 15 and 16 stay at the end.
wqs.results.pooled <- pool.mi(wqs.pool.test, n = 1000)
weight.dec <- c(order(wqs.results.pooled$pooled.mean[1:14], decreasing = TRUE), 15:16)
wqs.results.pooled <- wqs.results.pooled[weight.dec, ]
wqs.results.pooled


# When there is 1 estimate (p = 1)
a <- pool.mi(wqs.pool.test[1, , , drop = FALSE], n = 1000)
a
# wqs.results.pooled["dieldrin", ]

# For single imputation (K = 1):
b <- pool.mi(wqs.pool.test[, , 1, drop = FALSE], n = 1000)
b

# Odds ratio and 95% CI using the CLT.
odds.ratio <- exp(wqs.results.pooled[15:16, c("pooled.mean", "CI.1", "CI.2")])
## makeJournalTables :: format.CI(odds.ratio, trim = TRUE, digits = 2, nsmall = 2)
odds.ratio

#  The mice package is suggested for the examples, but not needed for the function.