Downloading the Data

Here I use the example data in the book by Hox (2010, p. 17).

knitr::opts_chunk$set(echo = TRUE, message = FALSE)
comma <- function(x) format(x, digits = 2, big.mark = ",")
library(tidyverse)
library(haven)
library(lme4)
library(boot)
library(bootmlm)
popdata <- haven::read_dta(
  "https://stats.idre.ucla.edu/stat/stata/examples/mlm_ma_hox/popular.dta")

For the interest of time, I will select only 20 schools and a few students in each school:

set.seed(85957)
pop_sub <- popdata %>% filter(school %in% sample(unique(school), 20)) %>%
  group_by(school) %>% sample_frac(size = .25) %>% ungroup()

To illustrate doing bootstrapping to obtain standard error (SE) and confidence interval (CI) for the intraclass correlation (ICC) of the variable popular, we first fit an intercept only model:

m0 <- lmer(popular ~ (1 | school), data = pop_sub)
summary(m0)

The intraclass correlation is defined as $$\rho = \frac{\tau^2}{\tau^2 + \sigma^2} = \frac{1}{1 + \sigma^2 / \tau^2} = \frac{1}{1 + \theta^{-2}},$$ where $\theta = \tau / \sigma$ is the relative cholesky factor for the random intercept term used in lme4. Therefore, we can estimate the ICC as:

(icc0 <- 1 / (1 + getME(m0, "theta")^(-2)))

So the ICC is quite large for this data set. However, it is important to also quantify the uncertainty of a point estimate. We can get the asymptotic standard error (ASE) of the ICC using the delta method:

# Get the point estimate of theta:
th_est <- getME(m0, "theta")
# Get the variance of theta using the bootmlm::vcov_theta() function
th_var <- bootmlm::vcov_theta(m0)
# We can get the asymptotic variance of ICC using delta method with an 
# appropriate transformation. `x1` is the variable name needed for the first
# variable, which is theta in our case
icc_avar <- msm::deltamethod(g = ~ 1 / (1 + x1^(-2)), 
                            mean = th_est, cov = th_var, ses = FALSE)

So we can get an approximate symmetric CI with

icc0 + c(-2, 2) * sqrt(icc_avar)

Bootstrap-$t$ CI

icc <- function(x) {
  th_est <- x@theta
  est <- 1 / (1 + th_est^(-2))
  th_var <- vcov_theta(x)
  var <- msm::deltamethod(g = ~ 1 / (1 + x1^(-2)), 
                            mean = th_est, cov = th_var, ses = FALSE)
  c(est, var)
}
icc(m0)
boo <- bootstrap_mer(m0, icc, 999L, type = "case")
# Influence values for BCa
inf_val <- empinf_mer(m0, icc, index = 1)
boot::boot.ci(boo, L = inf_val)

On a Transformed Scale

Because ICC is bounded between $-1$ and 1, and commonly between 0 and 1, a potentially better approach to construct a confidence interval for ICC is to first transform it so that the sampling distribution is closer to normal. As stated in Ukoumunne, Davison, Gulliford, and Chinn (2003), one can use apply a transformation: $$h(t) = \log(\frac{\hat \rho}{1 - \hat \rho})$$

You can compare the distribution before and after transformation:

hist(boo$t[ , 1])
hist(-log(1 / boo$t[ , 1] - 1))

One can obtain the CI on the transformed scale, and then scale it back to the original ICC scale:

boot::boot.ci(boo, L = inf_val, h = qlogis, 
              hdot = function(x) 1 / (x - x^2), 
              hinv = plogis)

Note: Percentile and BCa intervals are invariant to 1-1 transformation.



marklhc/bootmlm documentation built on May 24, 2023, 9:59 a.m.