We use the dataset bfi
from the package psych
together with lavaan
to estimate some realistic factor loadings $\lambda$ and standard deviations
$\sigma$.
model <- c("y =~ A1 + A2 + A3 + A4 + A5") fit <- lavaan::cfa(model, data = psych::bfi) coefs <- lavaan::lavInspect(fit, what = "x") lambda <- abs(c(coefs$lambda * sqrt(as.numeric(coefs$psi)))) sigma <- sqrt(diag(lavaan::lavInspect(fit, what = "x")$theta))
We take the absolute value of the lambda
vector as the agreement data
contains reverse-coded items.
We compare five confidence intervals, all without transformations. The adf
interval is the asymptotic distribution-free interval, the ell
interval
is the interval based on elliptical distributions and a kurtosis correction,
the ell_par
is the elliptical interval assuming a parallel model. The same
comments hold for norm
(assuming normal data) and norm_par
(assuming
parallel normal data).
library("alphaci") library("future.apply") plan(multisession, workers = availableCores() - 2) set.seed(313) n_reps <- 10000 k <- 5 latent <- \(n) extraDistr::rlaplace(n) / sqrt(2) true <- alphaci:::alpha(sigma, lambda)
In this simulation we normal error terms and a Laplace-distributed latent
variable. This one has excess kurtosis $3$, which caries over in large part
to the data. k
is the number of questions ands n_reps
is the number of
simulations.
success <- \(ci) true <= ci[2] & true >= ci[1] len <- \(ci) ci[2] - ci[1] simulations <- \(n) { results <- future.apply::future_replicate(n_reps, { x <- alphaci:::simulate_congeneric(n, k, sigma, lambda, latent = latent) cis <- rbind(adf = alphaci(x, type = "adf"), adf_par = alphaci(x, type = "adf", parallel = TRUE), ell = alphaci(x, type = "elliptical"), ell_par = alphaci(x, type = "elliptical", parallel = TRUE), norm = alphaci(x, type = "normal"), norm_par = alphaci(x, type = "normal", parallel = TRUE) ) c(cov = apply(cis, 1, success), len = apply(cis, 1, len)) }, future.seed = TRUE) rowMeans(results) }
Let's check out the results when $n= 10$.
simulations(10) #> cov.adf cov.adf_par cov.ell cov.ell_par cov.norm cov.norm_par len.adf len.adf_par #> 0.8745000 0.7942000 0.9513000 0.9566000 0.9223000 0.9297000 0.7680810 55.3239940 #> len.ell len.ell_par len.norm len.norm_par #> 1.0238478 1.0499270 0.9113473 0.9342908
It appears that the kurtosis corrections work well, at least for small sample size. Let's see how they perform when $n$ increases.
nn <- c(5, 10, 20, 30, 40, 50, 100, 200, 500, 1000, 2000, 5000) results <- sapply(nn, simulations)
Plotting the coverages, we find, where 1
is asymptotically distribution-free, 2
is elliptical, 3
is paralell and elliptical, 4
is normal
and 5
is parallel and normal.
matplot(nn, t(results[1:5, ]), xlab = "n", ylab = "Coverage", type = "b", log = "x") abline(h = 0.95, lty = 2)
Hence the kurtosis correction intervals have better coverage than the adf
interval when $n\leq 50$ and outperforms the normal theory intervals for
all $n$. If this observation is general remains to be seen.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.