icc | R Documentation |
These functions compute intraclass correlation coefficients (ICCs), as well as corresponding large sample variances and covariances based on the unconditional variance components in multilevel designs with up to four hierarchical levels (L), implementing the formulas given in Donner & Koval (1980) and Hedges et al. (2012).
icc.2l.balanced() computes the ICC at L2 and its variance in a balanced two-level design (Hedges et al., 2012, Equation (1)).
icc.2l.unbalanced() computes the ICC at L2 and its variance in an unbalanced two-level design (Donner & Koval, 1980, Equation (3)).
icc.3l.balanced() computes the ICCs at L2 and L3, their variances, and their covariance in a balanced three-level design (Hedges et al., 2012, Equations (4) to (6)).
icc.3l.unbalanced() computes the ICCs at L2 and L3, their variances, and the covariance between the variance components at L2 and L3 in an unbalanced three-level design (Hedges et al., 2012, Equations (7) to (9)).
icc.4l.balanced() computes the ICCs at L2, L3, and L4, their variances, and their covariances in a balanced four-level design (Hedges et al., 2012, Equations (10) to (15)).
icc.2l.balanced(data, var_l1, var_l2, se_var_l2)
icc.2l.unbalanced(data, var_l1, var_l2, N, n_per_l2)
icc.3l.balanced(data, var_l1, var_l2, var_l3, se_var_l2, se_var_l3, j)
icc.3l.unbalanced(
  data,
  var_l1,
  var_l2,
  var_l3,
  se_var_l2,
  se_var_l3,
  n_per_l2,
  id_l3
)
icc.4l.balanced(
  data,
  var_l1,
  var_l2,
  var_l3,
  var_l4,
  se_var_l2,
  se_var_l3,
  se_var_l4,
  j,
  k
)
data |
A data frame. See details. |
var_l1 |
< |
var_l2 |
< |
se_var_l2 |
< |
N |
< |
n_per_l2 |
< |
var_l3 |
< |
se_var_l3 |
< |
j |
< |
id_l3 |
< |
var_l4 |
< |
se_var_l4 |
< |
k |
< |
Intraclass correlation coefficients (ICCs)
Outcomes of observations within the same cluster are usually not independent
but rather correlated. The ICC indicates the degree of within-cluster
similarity. It quantifies the proportion of total variance in an outcome
that can be attributed to differences between clusters. Formally, the ICC
is the ratio of the between-cluster variance component to the sum of the
within- and between-cluster variance components. Although it originated in
the two-level context (see Fisher, 1925), the concept of the ICC naturally
extends to more complex multilevel data structures.
For example, in a three-level design, where students at L1 are nested within
classrooms at L2, which are in turn nested within schools at L3, and the
outcome of interest is student achievement, the ICC at L2
(\sigma^2_{L2}/\sigma^2_{Total}) depicts between-classroom (within-school)
achievement differences and the ICC at L3 (\sigma^2_{L3}/\sigma^2_{Total})
depicts between-school achievement differences
(where \sigma^2_{Total}=\sigma^2_{L1}+\sigma^2_{L2}+\sigma^2_{L3}).
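The ratios in the three-level example can be checked by hand. The following base R sketch reproduces only the ICC definitions given here, not the variance formulas of Hedges et al. (2012). The helper name `icc_from_vc` and the variance component values are illustrative assumptions and are not part of the package.

```r
# Hypothetical helper (not a package function): ICCs at the upper levels
# computed from a vector of variance components ordered from L1 upwards.
icc_from_vc <- function(vc) {
  stopifnot(is.numeric(vc), length(vc) >= 2, all(vc >= 0), sum(vc) > 0)
  total <- sum(vc)          # sigma^2_Total
  icc <- vc[-1] / total     # drop L1; one ICC per higher level
  names(icc) <- paste0("icc_l", seq_along(vc)[-1])
  icc
}

# Made-up variance components: students (L1), classrooms (L2), schools (L3)
vc <- c(l1 = 0.70, l2 = 0.10, l3 = 0.20)
icc_from_vc(vc)
```

Because every ICC shares the same denominator, the ICCs at all higher levels and the L1 share of the total variance sum to one.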
Balanced and unbalanced designs
Following Hedges et al. (2012), the functions provided differentiate between balanced and unbalanced designs.
A balanced design assumes equal cluster sizes (i.e., all clusters have the same number of lower-level units). Here, the cluster size may be an average across clusters, e.g., in terms of the (harmonic) mean or median number of lower-level units per cluster.
An unbalanced design, in contrast, allows the clusters to vary in their size (i.e., clusters may contain a different number of lower-level units).
The distinction between balanced and unbalanced designs matters only for computing the (co)variances of the multilevel variance component structure. Thus, when only ICCs are to be returned (by not specifying the arguments on standard errors and sample or cluster sizes), the functions for balanced and unbalanced designs yield equivalent results.
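As a rough illustration of the averaging mentioned for balanced designs, the base R sketch below summarizes a set of varying cluster sizes by their arithmetic mean, median, and harmonic mean. The cluster sizes are made up, and the sketch does not show how the package itself uses such a summary.

```r
# Made-up L2 cluster sizes (number of L1 units per cluster)
n_per_cluster <- c(18, 22, 25, 30, 12, 27)

# Candidate "typical" cluster sizes when treating the design as balanced
n_mean     <- mean(n_per_cluster)
n_median   <- median(n_per_cluster)
n_harmonic <- length(n_per_cluster) / sum(1 / n_per_cluster)

round(c(mean = n_mean, median = n_median, harmonic = n_harmonic), 2)
```

The harmonic mean never exceeds the arithmetic mean, so it gives more weight to small clusters. Which summary is appropriate depends on the design at hand.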
Structure of the data frame supplied to data
If a grouped data frame is supplied to data, this same grouping structure
will be retained in the new data frame.
In unbalanced designs, the data frame supplied to data should have exactly
one row for each L2 cluster and should contain a variable, supplied to
n_per_l2, that gives the size of each L2 cluster (i.e., the number of L1
units in each L2 cluster). To generate a data frame of this structure, the
multides::cluster_size() function can be used while keeping only distinct
rows for the L2 cluster identifier.
Note that in unbalanced designs, the results are summarized across cluster
sizes whenever (co)variances are computed.
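The one-row-per-L2-cluster layout described for unbalanced designs can also be sketched in base R. The example below is only an illustration of the target structure: the data, the column names `id_class` and `score`, and the object names are made up, and it is not confirmed that the result matches the output of multides::cluster_size() column for column.

```r
# Made-up student-level data: one row per L1 unit with an L2 cluster identifier
students <- data.frame(
  id_class = c("a", "a", "a", "b", "b", "c", "c", "c", "c"),
  score    = c(510, 495, 530, 470, 488, 550, 542, 561, 537)
)

# Count L1 units per L2 cluster; the result has one row per cluster
sizes <- as.data.frame(table(id_class = students$id_class),
                       responseName = "n_per_l2",
                       stringsAsFactors = FALSE)
sizes
```

The cluster sizes in `sizes$n_per_l2` add up to the number of rows in the student-level data, which is a quick sanity check before passing such a column to n_per_l2.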
Further notes
The variance of the ICC in an unbalanced two-level design (as formulated in Donner & Koval, 1980, Equation (3)) does not depend on the standard error of the variance component at L2, but rather solely on the total sample size at L1 and the (varying) cluster sizes at L2.
Covariances between ICCs at different hierarchical levels are computed for balanced designs only. Note that the covariance computed for an unbalanced three-level design is not the covariance between the ICCs at L2 and L3, but rather the covariance between the variance components at L2 and L3.
If standard errors of variance components and/or sample or cluster sizes are not supplied, no (co)variances will be returned. In this case, a corresponding message is printed.
A data frame that contains the following estimates:
icc.2l.balanced() and icc.2l.unbalanced()
icc
The ICC.
var_icc
The variance of the ICC.
icc.3l.balanced()
icc_l2
The ICC at L2.
var_icc_l2
The variance of the ICC at L2.
icc_l3
The ICC at L3.
var_icc_l3
The variance of the ICC at L3.
cov_icc_l2_l3
The covariance of the ICCs at L2 and L3.
icc.3l.unbalanced()
icc_l2
The ICC at L2.
var_icc_l2
The variance of the ICC at L2.
icc_l3
The ICC at L3.
var_icc_l3
The variance of the ICC at L3.
cov_var_l2_l3
The covariance of the variance components at L2 and L3.
icc.4l.balanced()
icc_l2
The ICC at L2.
var_icc_l2
The variance of the ICC at L2.
icc_l3
The ICC at L3.
var_icc_l3
The variance of the ICC at L3.
icc_l4
The ICC at L4.
var_icc_l4
The variance of the ICC at L4.
cov_icc_l2_l3
The covariance of the ICCs at L2 and L3.
cov_icc_l2_l4
The covariance of the ICCs at L2 and L4.
cov_icc_l3_l4
The covariance of the ICCs at L3 and L4.
Donner, A., & Koval, J. J. (1980). The large sample variance of an intraclass correlation. Biometrika, 67(3), 719–722. https://doi.org/10.1093/biomet/67.3.719
Fisher, R. A. (1925). Statistical methods for research workers. Oliver & Boyd.
Hedges, L. V., Hedberg, E. C., & Kuyper, A. M. (2012). The variance of intraclass correlations in three- and four-level models. Educational and Psychological Measurement, 72(6), 893–909. https://doi.org/10.1177/0013164412445193
set.seed(42)
dat <- data.frame(v_l1 = runif(1, .5, 1),
                  v_l2 = runif(1, 0, .5),
                  se_v_l2 = runif(1, .01, .1))
# compute ICC with corresponding sampling variance
# in a balanced two-level design
icc.2l.balanced(data = dat,
                var_l1 = v_l1, var_l2 = v_l2, se_var_l2 = se_v_l2)