icc    R Documentation

Compute intraclass correlation coefficients with variances and covariances

Description

These functions compute intraclass correlation coefficients (ICCs) and their large-sample variances and covariances from the unconditional variance components in multilevel designs with up to four hierarchical levels (L). They implement the formulas given in Donner & Koval (1980) and Hedges et al. (2012).

  • icc.2l.balanced() computes the ICC at L2 and its variance in a balanced two-level design (Hedges et al., 2012, Equation (1)).

  • icc.2l.unbalanced() computes the ICC at L2 and its variance in an unbalanced two-level design (Donner & Koval, 1980, Equation (3)).

  • icc.3l.balanced() computes the ICC at L2 and L3, their variances, and covariance in a balanced three-level design (Hedges et al., 2012, Equations (4) to (6)).

  • icc.3l.unbalanced() computes the ICC at L2 and L3, their variances, and the covariance between the variance components at L2 and L3 in an unbalanced three-level design (Hedges et al., 2012, Equations (7) to (9)).

  • icc.4l.balanced() computes the ICC at L2, L3, and L4, their variances, and covariances in a balanced four-level design (Hedges et al., 2012, Equations (10) to (15)).

Usage

icc.2l.balanced(data, var_l1, var_l2, se_var_l2)

icc.2l.unbalanced(data, var_l1, var_l2, N, n_per_l2)

icc.3l.balanced(data, var_l1, var_l2, var_l3, se_var_l2, se_var_l3, j)

icc.3l.unbalanced(
  data,
  var_l1,
  var_l2,
  var_l3,
  se_var_l2,
  se_var_l3,
  n_per_l2,
  id_l3
)

icc.4l.balanced(
  data,
  var_l1,
  var_l2,
  var_l3,
  var_l4,
  se_var_l2,
  se_var_l3,
  se_var_l4,
  j,
  k
)

Arguments

data

A data frame. See Details.

var_l1

<data-masked> The column name of the variance component at L1.

var_l2

<data-masked> The column name of the variance component at L2.

se_var_l2

<data-masked> Optional. The column name of the standard error of the variance component at L2.

N

<data-masked> Optional. The column name of the total sample size at L1.

n_per_l2

<data-masked> Optional. The column name of the (varying) cluster sizes at L2 (i.e., the number of L1 units in each L2 cluster). Cluster sizes broken down for each cluster may be obtained via multides::cluster_size().

var_l3

<data-masked> The column name of the variance component at L3.

se_var_l3

<data-masked> Optional. The column name of the standard error of the variance component at L3.

j

<data-masked> Optional. The column name of the average cluster size at L3 (e.g., the (harmonic) mean or median number of L2 clusters per L3 cluster).

id_l3

<data-masked> Optional. The column name of the L3 cluster identifier.

var_l4

<data-masked> The column name of the variance component at L4.

se_var_l4

<data-masked> Optional. The column name of the standard error of the variance component at L4.

k

<data-masked> Optional. The column name of the average cluster size at L4 (e.g., the (harmonic) mean or median number of L3 clusters per L4 cluster).

Details

Intraclass correlation coefficients (ICCs)

Outcomes of observations within the same cluster are usually not independent but correlated. The ICC indicates the degree of within-cluster similarity: it quantifies the proportion of the total variance in an outcome that can be attributed to differences between clusters. Formally, the ICC is the ratio of the between-cluster variance component to the sum of the within- and between-cluster variance components. Although the concept originated in the two-level context (see Fisher, 1925), it extends naturally to more complex multilevel data structures. For example, consider a three-level design in which students at L1 are nested within classrooms at L2, which are in turn nested within schools at L3, and the outcome of interest is student achievement. The ICC at L2 (\sigma^2_{L2}/\sigma^2_{Total}) then reflects between-classroom (within-school) achievement differences, and the ICC at L3 (\sigma^2_{L3}/\sigma^2_{Total}) reflects between-school achievement differences, where \sigma^2_{Total}=\sigma^2_{L1}+\sigma^2_{L2}+\sigma^2_{L3}.
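As a small worked illustration of these ratios, the following base R sketch computes the ICCs at L2 and L3 from three variance components. The numeric values and object names are made up for illustration; they are not taken from the package or from any data set.

```r
# Hypothetical variance components for a three-level design
# (students at L1, classrooms at L2, schools at L3); illustrative values only.
var_l1 <- 0.70
var_l2 <- 0.10
var_l3 <- 0.20

# Total variance is the sum of the components at all levels
var_total <- var_l1 + var_l2 + var_l3

# ICC at each level: share of the total variance located at that level
icc_l2 <- var_l2 / var_total
icc_l3 <- var_l3 / var_total

round(c(icc_l2 = icc_l2, icc_l3 = icc_l3), 3)
```

With these assumed values, one tenth of the total variance lies between classrooms within schools and one fifth lies between schools.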

Balanced and unbalanced designs

Following Hedges et al. (2012), the functions provided differentiate between balanced and unbalanced designs.

  • A balanced design assumes equal cluster sizes (i.e., all clusters have the same number of lower-level units). Here, the cluster size may be an average across clusters, e.g., in terms of the (harmonic) mean or median number of lower-level units per cluster.

  • An unbalanced design, in contrast, allows the clusters to vary in their size (i.e., clusters may contain a different number of lower-level units).

The distinction between balanced and unbalanced designs matters only for computing the (co)variances of the multilevel variance component structure. Thus, when only ICCs are requested (by leaving the arguments for standard errors and sample or cluster sizes unspecified), the functions for balanced and unbalanced designs return equivalent results.
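For balanced designs, the text above mentions that an average cluster size (e.g., the (harmonic) mean or median) may be used. The following base R sketch shows one way such summaries might be computed from a vector of cluster sizes. The cluster sizes and object names are hypothetical, and which summary is most appropriate is not prescribed here.

```r
# Hypothetical numbers of lower-level units in six clusters (illustrative values only)
n_per_cluster <- c(18, 22, 25, 20, 30, 15)

# Three common summaries of cluster size
arithmetic_mean <- mean(n_per_cluster)
harmonic_mean   <- length(n_per_cluster) / sum(1 / n_per_cluster)
median_size     <- median(n_per_cluster)

c(mean = arithmetic_mean, harmonic = harmonic_mean, median = median_size)
```

When cluster sizes vary, the harmonic mean is smaller than the arithmetic mean, so the choice of summary can affect any quantity that depends on cluster size.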

Structure of the data frame supplied to data

If a grouped data frame is supplied to data, this same grouping structure will be retained in the new data frame.

In unbalanced designs, the data frame supplied to data should have exactly one row for each L2 cluster and should contain a variable, supplied to n_per_l2, that gives the size of each L2 cluster (i.e., the number of L1 units in each L2 cluster). To generate a data frame of this structure, the multides::cluster_size() function can be used while keeping only distinct rows for the L2 cluster identifier. Note that in unbalanced designs, the results are summarized across cluster sizes whenever (co)variances are computed.
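To make the required structure concrete, the following base R sketch builds a data frame with one row per L2 cluster and a column of cluster sizes from long-format data. It is only a stand-in: multides::cluster_size() is the route suggested above, but its arguments are not shown on this page, so it is not called here. The data, the identifiers, and the column names (student_id, class_id, n_per_l2) are hypothetical.

```r
# Hypothetical long-format data: one row per L1 unit (student),
# with an identifier for the L2 cluster (classroom) it belongs to.
students <- data.frame(
  student_id = 1:9,
  class_id   = c("a", "a", "a", "a", "b", "b", "c", "c", "c")
)

# Count L1 units per L2 cluster; the result has one row per cluster
sizes <- as.data.frame(table(class_id = students$class_id),
                       responseName = "n_per_l2",
                       stringsAsFactors = FALSE)
sizes
```

In an actual analysis, the variance component columns would be added to this data frame before it is passed to one of the functions for unbalanced designs.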

Further notes

  • The variance of the ICC in an unbalanced two-level design (as formulated in Donner & Koval, 1980, Equation (3)) does not depend on the standard error of the variance component at L2, but rather solely on the total sample size at L1 and the (varying) cluster sizes at L2.

  • Covariances between ICCs at different hierarchical levels are computed for balanced designs only. Note that the covariance computed for an unbalanced three-level design is not the covariance between the ICCs at L2 and L3, but rather the covariance between the variance components at L2 and L3.

  • If standard errors of variance components and/or sample or cluster sizes are not supplied, no (co)variances are returned. In this case, a corresponding message is printed.

Value

A data frame that contains the following estimates:

  • icc.2l.balanced() and icc.2l.unbalanced()

    • icc The ICC.

    • var_icc The variance of the ICC.

  • icc.3l.balanced()

    • icc_l2 The ICC at L2.

    • var_icc_l2 The variance of the ICC at L2.

    • icc_l3 The ICC at L3.

    • var_icc_l3 The variance of the ICC at L3.

    • cov_icc_l2_l3 The covariance of the ICCs at L2 and L3.

  • icc.3l.unbalanced()

    • icc_l2 The ICC at L2.

    • var_icc_l2 The variance of the ICC at L2.

    • icc_l3 The ICC at L3.

    • var_icc_l3 The variance of the ICC at L3.

    • cov_var_l2_l3 The covariance of the variance components at L2 and L3.

  • icc.4l.balanced()

    • icc_l2 The ICC at L2.

    • var_icc_l2 The variance of the ICC at L2.

    • icc_l3 The ICC at L3.

    • var_icc_l3 The variance of the ICC at L3.

    • icc_l4 The ICC at L4.

    • var_icc_l4 The variance of the ICC at L4.

    • cov_icc_l2_l3 The covariance of the ICCs at L2 and L3.

    • cov_icc_l2_l4 The covariance of the ICCs at L2 and L4.

    • cov_icc_l3_l4 The covariance of the ICCs at L3 and L4.

References

Donner, A., & Koval, J. J. (1980). The large sample variance of an intraclass correlation. Biometrika, 67(3), 719–722. https://doi.org/10.1093/biomet/67.3.719

Fisher, R. A. (1925). Statistical methods for research workers. Oliver & Boyd.

Hedges, L. V., Hedberg, E. C., & Kuyper, A. M. (2012). The variance of intraclass correlations in three- and four-level models. Educational and Psychological Measurement, 72(6), 893–909. https://doi.org/10.1177/0013164412445193

Examples

library(multides)

# simulate hypothetical variance components and a standard error
set.seed(42)
dat <- data.frame(v_l1 = runif(1, 0.5, 1),
                  v_l2 = runif(1, 0, 0.5),
                  se_v_l2 = runif(1, 0.01, 0.1))

# compute ICC with corresponding sampling variance
# in a balanced two-level design
icc.2l.balanced(data = dat,
                var_l1 = v_l1, var_l2 = v_l2, se_var_l2 = se_v_l2)
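A balanced three-level design might be handled analogously. The sketch below follows the argument names listed under Usage for icc.3l.balanced(); the input values and the column names (including n_l2_per_l3 for the average number of L2 clusters per L3 cluster) are hypothetical, and the call is run only if the multides package is installed.

```r
# Hypothetical inputs for a balanced three-level design (illustrative values only):
# variance components, their standard errors, and the average number of
# L2 clusters per L3 cluster.
dat3 <- data.frame(v_l1 = 0.70, v_l2 = 0.10, v_l3 = 0.20,
                   se_v_l2 = 0.02, se_v_l3 = 0.04,
                   n_l2_per_l3 = 3)

res3 <- NULL
# Run only if the multides package is installed
if (requireNamespace("multides", quietly = TRUE)) {
  res3 <- multides::icc.3l.balanced(data = dat3,
                                    var_l1 = v_l1, var_l2 = v_l2, var_l3 = v_l3,
                                    se_var_l2 = se_v_l2, se_var_l3 = se_v_l3,
                                    j = n_l2_per_l3)
  print(res3)
}
```

According to the Value section, the result should contain the ICCs at L2 and L3, their variances, and the covariance of the two ICCs.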


sophiestallasch/multides documentation built on Oct. 20, 2024, 5:14 a.m.