pool.scalar | R Documentation |
Pools univariate estimates of m repeated complete-data analyses
pool.scalar(Q, U, n = Inf, k = 1, rule = c("rubin1987", "reiter2003"))
pool.scalar.syn(Q, U, n = Inf, k = 1, rule = "reiter2003")
Q |
A vector of univariate estimates of m repeated complete-data analyses.
U |
A vector containing the corresponding m variances of the univariate estimates.
n |
A number providing the sample size. If nothing is specified, an infinite sample (n = Inf) is assumed.
k |
A number indicating the number of parameters to be estimated. By default, k = 1.
rule |
A string indicating the pooling rule. Currently supported are "rubin1987" and "reiter2003".
The function averages the univariate estimates of the complete-data model and computes the total variance over the repeated analyses. It also computes the relative increase in variance due to missing data or data synthesis, and the fraction of missing information.
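The pooling steps described above can be sketched in base R. This is a minimal illustration of Rubin's (1987) rules, not the code used inside mice; the helper name `pool_rubin_sketch` is hypothetical, and the input values are the ones used in the examples further down.

```r
# Hand-rolled sketch of Rubin's (1987) pooling rules in base R.
# `pool_rubin_sketch` is a hypothetical helper, not part of mice.
pool_rubin_sketch <- function(Q, U) {
  m <- length(Q)                 # number of imputations
  qbar <- mean(Q)                # pooled estimate
  ubar <- mean(U)                # within-imputation variance
  b <- var(Q)                    # between-imputation variance (m - 1 denominator)
  t <- ubar + (1 + 1 / m) * b    # total variance
  r <- (1 + 1 / m) * b / ubar    # relative increase in variance
  list(m = m, qbar = qbar, ubar = ubar, b = b, t = t, r = r)
}

res <- pool_rubin_sketch(
  Q = c(-1.5457, -1.428),
  U = c(0.9723^2, 1.041^2)
)
str(res)
```

The sketch leaves out the degrees of freedom and the fraction of missing information, which also depend on `n` and `k`.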
Returns a list with the following components:
m
:Number of imputations.
qhat
:The m univariate estimates of repeated complete-data analyses.
u
:The corresponding m variances of the univariate estimates.
qbar
:The pooled univariate estimate, formula (3.1.2) Rubin (1987).
ubar
:The mean of the variances (i.e. the pooled within-imputation variance), formula (3.1.3) Rubin (1987).
b
:The between-imputation variance, formula (3.1.4) Rubin (1987).
t
:The total variance of the pooled estimate, formula (3.1.5) Rubin (1987).
r
:The relative increase in variance due to nonresponse, formula (3.1.7) Rubin (1987).
df
:The degrees of freedom of the t reference distribution, computed by the method of Barnard and Rubin (1999).
fmi
:The fraction of missing information due to nonresponse, formula (3.1.10) Rubin (1987). (Not defined for synthetic data.)
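The `df` and `fmi` components can be illustrated with a short base-R sketch. It assumes the standard Barnard and Rubin (1999) small-sample adjustment with complete-data degrees of freedom `n - k`, and formula (3.1.10) of Rubin (1987) for the fraction of missing information. The helper `df_fmi_sketch` is hypothetical, and the sketch omits any safeguards the package itself may apply (for example when the between-imputation variance is zero).

```r
# Sketch of the Barnard-Rubin (1999) degrees of freedom and the
# fraction of missing information under the "rubin1987" rule.
# `df_fmi_sketch` is a hypothetical helper, not exported by mice.
df_fmi_sketch <- function(Q, U, n = Inf, k = 1) {
  m <- length(Q)
  ubar <- mean(U)
  b <- var(Q)
  t <- ubar + (1 + 1 / m) * b
  r <- (1 + 1 / m) * b / ubar
  lambda <- (1 + 1 / m) * b / t    # share of total variance due to missingness
  df_old <- (m - 1) / lambda^2     # large-sample degrees of freedom
  dfcom <- n - k                   # complete-data degrees of freedom
  if (is.infinite(dfcom)) {
    df <- df_old
  } else {
    # observed-data degrees of freedom, then the combined value
    df_obs <- (dfcom + 1) / (dfcom + 3) * dfcom * (1 - lambda)
    df <- df_old * df_obs / (df_old + df_obs)
  }
  fmi <- (r + 2 / (df + 3)) / (r + 1)
  list(df = df, fmi = fmi, lambda = lambda)
}

small <- df_fmi_sketch(c(-1.5457, -1.428), c(0.9723^2, 1.041^2), n = 25, k = 2)
large <- df_fmi_sketch(c(-1.5457, -1.428), c(0.9723^2, 1.041^2))
```

With a finite `n`, the adjusted degrees of freedom cannot exceed the complete-data value `n - k`; with `n = Inf` the sketch falls back to the large-sample formula.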
Karin Groothuis-Oudshoorn and Stef van Buuren, 2009; Thom Volker, 2021
Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: John Wiley and Sons.
Reiter, J.P. (2003). Inference for Partially Synthetic, Public Use Microdata Sets. Survey Methodology, 29, 181-189.
pool
library(mice)

# missing data imputation with manual pooling
imp <- mice(nhanes, maxit = 2, m = 2, print = FALSE, seed = 18210)
fit <- with(data = imp, lm(bmi ~ age))
# manual pooling
summary(fit$analyses[[1]])
summary(fit$analyses[[2]])
pool.scalar(Q = c(-1.5457, -1.428), U = c(0.9723^2, 1.041^2), n = 25, k = 2)
# check: automatic pooling using broom
pool(fit)
# manual pooling for synthetic data created from complete data
imp <- mice(cars,
  maxit = 2, m = 2, print = FALSE, seed = 18210,
  where = matrix(TRUE, nrow(cars), ncol(cars))
)
fit <- with(data = imp, lm(speed ~ dist))
# manual pooling: extract Q and U
summary(fit$analyses[[1]])
summary(fit$analyses[[2]])
pool.scalar.syn(Q = c(0.12182, 0.13209), U = c(0.02121^2, 0.02516^2), n = 50, k = 2)
# check: automatic pooling using broom
pool.syn(fit)
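For comparison with the synthetic-data example, the following base-R sketch applies the combining rules of Reiter (2003) by hand. It assumes that the "reiter2003" rule uses the total variance `ubar + b / m` and the degrees of freedom given in that paper; the helper `pool_reiter_sketch` is hypothetical and is not the package implementation.

```r
# Sketch of the Reiter (2003) rules for partially synthetic data.
# `pool_reiter_sketch` is a hypothetical helper, not part of mice.
pool_reiter_sketch <- function(Q, U) {
  m <- length(Q)
  qbar <- mean(Q)                           # pooled estimate
  ubar <- mean(U)                           # within-synthesis variance
  b <- var(Q)                               # between-synthesis variance
  t <- ubar + b / m                         # total variance (no (1 + 1/m) factor)
  df <- (m - 1) * (1 + ubar / (b / m))^2    # reference t degrees of freedom
  list(m = m, qbar = qbar, ubar = ubar, b = b, t = t, df = df)
}

syn <- pool_reiter_sketch(
  Q = c(0.12182, 0.13209),
  U = c(0.02121^2, 0.02516^2)
)
str(syn)
```

The total variance here is smaller than under Rubin's (1987) rule for the same inputs, because the between-imputation term is `b / m` rather than `(1 + 1 / m) * b`.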