pool.table: Combines estimates from a tidy table

Description

Combines estimates from a tidy table

Usage

pool.table(
  w,
  type = c("all", "minimal", "tests"),
  conf.int = TRUE,
  conf.level = 0.95,
  exponentiate = FALSE,
  dfcom = Inf,
  custom.t = NULL,
  rule = c("rubin1987", "reiter2003"),
  ...
)

Arguments

w

A data.frame with parameter estimates in tidy format (see details).

type

A string, either "minimal", "tests" or "all". Use "minimal" to mimic the output of summary(pool(fit)). The default is "all".

conf.int

Logical indicating whether to include a confidence interval.

conf.level

Confidence level of the interval, used only if conf.int = TRUE. Number between 0 and 1.

exponentiate

Flag indicating whether to exponentiate the coefficient estimates and confidence intervals (typical for logistic regression).

dfcom

A positive number representing the residual degrees of freedom in the complete-data analysis. The dfcom argument is used for the Barnard-Rubin adjustment. In a linear regression, dfcom equals the number of independent observations minus the number of fitted parameters, but the expression becomes more complex for regularized, proportional hazards, or other semi-parametric techniques. Only used if w lacks a column named "df.residual".
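
To make the role of dfcom concrete, here is a minimal base R sketch of the Barnard-Rubin degrees-of-freedom adjustment as it is commonly written. The helper name barnard_rubin() and the input values are hypothetical, and the code is an illustration under those assumptions rather than the package's own implementation.

```r
# Sketch of the Barnard-Rubin small-sample adjustment (hypothetical helper).
# m: number of imputations, b: between-imputation variance,
# t: total variance, dfcom: complete-data residual degrees of freedom.
barnard_rubin <- function(m, b, t, dfcom = Inf) {
  lambda <- (1 + 1 / m) * b / t      # share of total variance due to missingness
  df_old <- (m - 1) / lambda^2       # classic large-sample degrees of freedom
  df_obs <- (dfcom + 1) / (dfcom + 3) * dfcom * (1 - lambda)
  # With dfcom = Inf there is nothing to adjust, so the classic value is kept
  ifelse(is.infinite(dfcom), df_old, df_old * df_obs / (df_old + df_obs))
}

df_large <- barnard_rubin(m = 5, b = 0.2, t = 1.0, dfcom = Inf)
df_small <- barnard_rubin(m = 5, b = 0.2, t = 1.0, dfcom = 21)
print(c(df_large = df_large, df_small = df_small))
```

With a finite dfcom the adjusted degrees of freedom can never exceed dfcom, which is why supplying a realistic value matters in small samples.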

custom.t

A custom character string to be parsed as a calculation rule for the total variance t. The custom rule can use the other calculated pooling statistics. The default t calculation has the form ".data$ubar + (1 + 1 / .data$m) * .data$b".
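
As a worked illustration of the default rule quoted above, the following base R sketch evaluates the rule string against a one-row table of pooling statistics. The numbers are invented, and evaluating the string with eval(parse()) is only one possible way to apply such a rule; it is not necessarily how the package processes custom.t.

```r
# Hypothetical pooling statistics for a single parameter
pooled <- data.frame(m = 5, ubar = 0.80, b = 0.20)

# The default rule, written exactly as a custom.t string would be
default_rule <- ".data$ubar + (1 + 1 / .data$m) * .data$b"

# Evaluate the string with `.data` bound to the table of statistics
t_default <- eval(parse(text = default_rule), list(.data = pooled))

# Same calculation written out directly: 0.80 + 1.2 * 0.20
t_direct <- pooled$ubar + (1 + 1 / pooled$m) * pooled$b

print(c(t_default = t_default, t_direct = t_direct))
```

A custom rule would follow the same pattern: a string that refers to the pooling statistics through .data$ and returns the total variance.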

rule

A string indicating the pooling rule. Currently supported are "rubin1987" (default, for analyses applied to multiply-imputed incomplete data) and "reiter2003" (for analyses applied to synthetic data created from complete data).

...

Arguments passed down

Details

The input data w is a data.frame with columns named:

term          a character or factor with the parameter names
estimate      a numeric vector with parameter estimates
std.error     a numeric vector with standard errors of estimate
df.residual   a numeric vector with the degrees of freedom

Columns 1-3 are obligatory. Column 4 is optional. Usually, all entries in column 4 are the same. The user can omit column 4 and specify the argument pool.table(..., dfcom = ...) instead. If both are given, then column df.residual takes precedence. If neither is specified, then mice tries to calculate the residual degrees of freedom. If that fails (e.g. because there is no information on sample size), mice sets dfcom = Inf. The value dfcom = Inf is acceptable for large samples (n > 1000) and relatively concise parametric models.
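
The input layout described above can be sketched by building a tidy table by hand. The estimates below are invented for illustration, and the fourth column is named df.residual, following the Arguments and Examples sections. The call to pool.table() is guarded so that the snippet also runs when mice is not installed.

```r
# Hypothetical estimates of one parameter from m = 3 imputed datasets
w <- data.frame(
  term        = rep("bmi", 3),
  estimate    = c(1.10, 1.30, 1.20),
  std.error   = c(0.40, 0.50, 0.45),
  df.residual = rep(21, 3)           # optional column; dfcom is the alternative
)

# The three obligatory columns must be present
required <- c("term", "estimate", "std.error")
stopifnot(all(required %in% names(w)))

# Pool only when the mice package is available
if (requireNamespace("mice", quietly = TRUE)) {
  print(mice::pool.table(w))
}
```

Each row is one estimate of one term from one imputed dataset, so a model with k terms fitted to m datasets gives a table with k * m rows.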

Value

pool.table() returns a data.frame with aggregated estimates, standard errors, confidence intervals and statistical tests.

The meaning of the columns is as follows:

term        Parameter name
m           Number of multiple imputations
estimate    Pooled complete data estimate
std.error   Standard error of estimate
statistic   t-statistic = estimate / std.error
df          Degrees of freedom for statistic
p.value     One-sided P-value under null hypothesis
conf.low    Lower bound of confidence interval (default 95%)
conf.high   Upper bound of confidence interval (default 95%)
riv         Relative increase in variance
fmi         Fraction of missing information
ubar        Within-imputation variance of estimate
b           Between-imputation variance of estimate
t           Total variance of estimate
dfcom       Residual degrees of freedom in complete data
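
Several of these columns are simple functions of the others. The base R sketch below uses invented values and the textbook formulas for multiple imputation to show how they relate. It is meant as an illustration of the relationships, not as a reproduction of the package's internals.

```r
# Hypothetical pooled quantities for one parameter
m          <- 5       # number of imputations
estimate   <- 1.20    # pooled estimate
ubar       <- 0.80    # within-imputation variance
b          <- 0.20    # between-imputation variance
df         <- 12      # degrees of freedom (assumed given here)
conf.level <- 0.95

t_var     <- ubar + (1 + 1 / m) * b       # total variance
std.error <- sqrt(t_var)                  # standard error of estimate
statistic <- estimate / std.error         # t-statistic

# Two-sided confidence interval based on the t distribution
crit      <- qt(1 - (1 - conf.level) / 2, df)
conf.low  <- estimate - crit * std.error
conf.high <- estimate + crit * std.error

riv <- (1 + 1 / m) * b / ubar             # relative increase in variance
fmi <- (riv + 2 / (df + 3)) / (1 + riv)   # fraction of missing information

print(round(c(std.error = std.error, statistic = statistic,
              conf.low = conf.low, conf.high = conf.high,
              riv = riv, fmi = fmi), 3))
```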

Examples

# conventional mice workflow
library(mice)
imp <- mice(nhanes2, m = 2, maxit = 2, seed = 1, print = FALSE)
fit <- with(imp, lm(chl ~ age + bmi + hyp))
pld1 <- pool(fit)
pld1$pooled

# using pool.table() on tidy table
tbl <- summary(fit)[, c("term", "estimate", "std.error", "df.residual")]
tbl
pld2 <- pool.table(tbl, type = "minimal")
pld2

identical(pld1$pooled, pld2)

# conventional workflow: all numerical output
all1 <- summary(pld1, type = "all", conf.int = TRUE)
all1

# pool.table workflow: all numerical output
all2 <- pool.table(tbl)
all2

identical(data.frame(all1), all2)

stefvanbuuren/mice documentation built on April 21, 2024, 7:37 a.m.