icc_rm_reml: Repeated-Measures Intraclass Correlation via REML

View source: R/icc.R

icc_rm_remlR Documentation

Repeated-Measures Intraclass Correlation via REML

Description

Computes pairwise repeated-measures intraclass correlation coefficients from long-format data using the same REML and Woodbury-identity backend used by the repeated-measures agreement models.

Usage

icc_rm_reml(
  data,
  response,
  subject,
  method = NULL,
  time = NULL,
  type = c("consistency", "agreement"),
  ci = FALSE,
  conf_level = 0.95,
  n_threads = getOption("matrixCorr.threads", 1L),
  ci_mode = c("auto", "raw", "logit"),
  verbose = FALSE,
  digits = 4,
  use_message = TRUE,
  interaction = FALSE,
  max_iter = 100,
  tol = 1e-06,
  Dmat = NULL,
  Dmat_type = c("time-avg", "typical-visit", "weighted-avg", "weighted-sq"),
  Dmat_weights = NULL,
  Dmat_rescale = TRUE,
  ar = c("none", "ar1"),
  ar_rho = NA_real_,
  slope = c("none", "subject", "method", "custom"),
  slope_var = NULL,
  slope_Z = NULL,
  drop_zero_cols = TRUE,
  vc_select = c("auto", "none"),
  vc_alpha = 0.05,
  vc_test_order = c("subj_time", "subj_method"),
  include_subj_method = NULL,
  include_subj_time = NULL,
  sb_zero_tol = 1e-10
)

## S3 method for class 'icc_rm_reml'
print(
  x,
  digits = 4,
  ci_digits = 4,
  n = NULL,
  topn = NULL,
  max_vars = NULL,
  width = NULL,
  show_ci = NULL,
  ...
)

## S3 method for class 'icc_rm_reml'
summary(
  object,
  digits = 4,
  ci_digits = 2,
  n = NULL,
  topn = NULL,
  max_vars = NULL,
  width = NULL,
  show_ci = NULL,
  ...
)

## S3 method for class 'summary.icc_rm_reml'
print(
  x,
  digits = NULL,
  n = NULL,
  topn = NULL,
  max_vars = NULL,
  width = NULL,
  show_ci = NULL,
  ...
)

Arguments

data

A data frame.

response

Character. Response variable name.

subject

Character. Subject ID variable name.

method

Character or NULL. Optional column name of method factor (added to fixed effects).

time

Character or NULL. Optional column name of time factor (added to fixed effects).

type

Character scalar; one of c("consistency","agreement"). For "consistency", systematic fixed method differences are not penalized through S_B. For "agreement", the fixed-effect dispersion term S_B is added to the denominator.

ci

Logical. If TRUE, return a CI container for the repeated ICC.

conf_level

Numeric in (0,1). Confidence level when ci = TRUE (default 0.95).

n_threads

Integer \geq 1. Number of OpenMP threads to use for computation. Defaults to getOption("matrixCorr.threads", 1L).

ci_mode

Character scalar; one of c("auto","raw","logit"). Controls how confidence intervals are computed when ci = TRUE. If "raw", a Wald CI is formed on the ICC scale and truncated to [0,1]. If "logit", a Wald CI is computed on the \mathrm{logit}(\mathrm{ICC}) scale and back-transformed to the original scale. If "auto" (default), the backend chooses per estimate between the raw-scale and logit-scale interval.

verbose

Logical. If TRUE, prints a structured summary of the fitted variance components and S_B for each fit. Default FALSE.

digits

Integer (\ge 0). Number of decimal places to use in the printed summary when verbose = TRUE. Default 4.

use_message

Logical. When verbose = TRUE, choose the printing mechanism, where TRUE uses message() (respects sink(), easily suppressible via suppressMessages()), whereas FALSE uses cat() to stdout. Default TRUE.

interaction

Logical. Include method:time interaction? (default FALSE).

max_iter

Integer. Maximum iterations for variance-component updates (default 100).

tol

Numeric. Convergence tolerance on parameter change (default 1e-6).

Dmat

Optional n_t \times n_t numeric matrix to weight/aggregate time-specific fixed biases in the S_B quadratic form. If supplied, it is used (after optional mass rescaling; see Dmat_rescale) whenever at least two present time levels exist; otherwise it is ignored. If Dmat is NULL, a canonical kernel D_m is constructed from Dmat_type and Dmat_weights (see below). Dmat should be symmetric positive semidefinite; small asymmetries are symmetrized internally.

Dmat_type

Character, one of c("time-avg","typical-visit", "weighted-avg","weighted-sq"). Only used when Dmat = NULL. It selects the aggregation target for time-specific fixed biases in S_B. Options are:

  • "time-avg": square of the time-averaged bias, D_m=(1/n_t)\,11^\top.

  • "typical-visit": average of squared per-time biases, D_m=I_{n_t}.

  • "weighted-avg": square of a weighted average, D_m=n_t\,w\,w^\top with \sum w=1.

  • "weighted-sq": weighted average of squared biases, D_m=n_t\,\mathrm{diag}(w) with \sum w=1.

Pick "time-avg" for ICC targeting the time-averaged measurement; pick "typical-visit" for ICC targeting a randomly sampled visit (typical occasion). Default "time-avg".

Dmat_weights

Optional numeric weights w used when Dmat_type %in% c("weighted-avg","weighted-sq"). Must be nonnegative and finite. If names(w) are provided, they should match the full time levels in data; they are aligned to the present time subset per fit. If unnamed, the length must equal the number of present time levels. In all cases w is internally normalized to sum to 1.

Dmat_rescale

Logical. When TRUE (default), the supplied/built D_m is rescaled to satisfy the simple mass rule 1^\top D_m 1 = n_t. This keeps the S_B denominator invariant and harmonizes with the time-averaging factors used for the variance terms.

ar

Character. Residual correlation structure: "none" (iid) or "ar1" for subject-level AR(1) correlation within contiguous time runs. Default c("none").

ar_rho

Numeric in (-0.999,\,0.999) or NA. If ar = "ar1" and ar_rho is finite, it is treated as fixed. If ar = "ar1" and ar_rho = NA, \rho is estimated by profiling a 1-D objective (REML when available; an approximation otherwise). Default NA_real_.

slope

Character. Optional extra random-effect design Z. With "subject" a single random slope is added (one column in Z); with "method" one column per method level is added; with "custom" you provide slope_Z directly. Default c("none","subject","method","custom").

slope_var

For slope %in% c("subject","method"), a character string giving the name of a column in data used as the slope regressor (e.g., centered time). It is looked up inside data; do not pass the vector itself. NAs are treated as zeros in Z.

slope_Z

For slope = "custom", a numeric matrix with n rows (same order as data) providing the full extra random-effect design Z. Each column of slope_Z has its own variance component \sigma_{Z,j}^2; columns are treated as uncorrelated (diagonal block in G). Ignored otherwise.

drop_zero_cols

Logical. When slope = "method", drop all-zero columns of Z after subsetting (useful in pairwise fits). Default TRUE.

vc_select

Character scalar; one of c("auto","none"). Controls how the subject by method \sigma^2_{A\times M} ("subj_method") and subject by time \sigma^2_{A\times T} ("subj_time") variance components are included. If "auto" (default), the function performs boundary-aware REML likelihood-ratio tests (LRTs; null on the boundary at zero with a half-\chi^2_1 reference) to decide whether to retain each component, in the order given by vc_test_order. If "none", no testing is done and inclusion is taken from include_subj_method/include_subj_time (or, if NULL, from the mere presence of the corresponding factor in the design). In pairwise fits, the decision is made independently for each method pair.

vc_alpha

Numeric scalar in (0,1); default 0.05. Per-component significance level for the boundary-aware REML LRTs used when vc_select = "auto". The tests are one-sided for variance components on the boundary and are not multiplicity-adjusted.

vc_test_order

Character vector (length 2) with a permutation of c("subj_time","subj_method"); default c("subj_time","subj_method"). Specifies the order in which the two variance components are tested when vc_select = "auto". The component tested first may be dropped before testing the second. If a factor is absent in the design (e.g., no time factor so "subj_time" is undefined), the corresponding test is skipped.

include_subj_method, include_subj_time

Logical scalars or NULL. When vc_select = "none", these control whether the \sigma^2_{A\times M} ("subj_method") and \sigma^2_{A\times T} ("subj_time") random effects are included (TRUE) or excluded (FALSE) in the model. If NULL (default), inclusion defaults to the presence of the corresponding factor in the data (i.e., at least two method/time levels). When vc_select = "auto", these arguments are ignored (automatic selection is used instead).

sb_zero_tol

Non-negative numeric scalar; default 1e-10. Numerical threshold for the fixed-effect dispersion term S_B. After computing \widehat{S_B} and its delta-method variance, if \widehat{S_B} \le sb_zero_tol or non-finite, the procedure treats S_B as fixed at zero in the delta step.

x

For print(), an object returned by icc_rm_reml() or summary.icc_rm_reml().

ci_digits

Integer; number of digits for confidence interval bounds in printed method summaries.

n

Optional row threshold for compact preview output.

topn

Optional number of leading/trailing rows to show when truncated.

max_vars

Optional maximum number of visible columns; NULL derives this from console width.

width

Optional display width; defaults to getOption("width").

show_ci

One of "yes" or "no".

...

Passed to the underlying display helpers.

object

For summary(), an object returned by icc_rm_reml().

Details

The repeated-measures model is fit separately for each method pair using the same REML and Woodbury-identity backend used by the repeated-measures concordance estimator.

Kernel D_m and the fixed-bias term S_B. Let d = L^\top \hat\beta stack the within-time, pairwise method differences, grouped by time. The symmetric positive semidefinite kernel D_m \succeq 0 selects which functional of the bias profile is targeted by S_B. Internally, the code rescales any supplied or constructed D_m to satisfy 1^\top D_m 1 = n_t for stability and comparability.

  • Dmat_type = "time-avg" targets the square of the time-averaged bias.

  • Dmat_type = "typical-visit" targets the average of squared per-time biases.

  • Dmat_type = "weighted-avg" targets the square of a weighted time average.

  • Dmat_type = "weighted-sq" targets the weighted average of squared per-time biases.

As in the repeated-measures concordance implementation, S_B is the fitted fixed-effect dispersion term induced by D_m. It enters the denominator only for type = "agreement".

Time-averaging and shrinkage factors. The fitted variance components reported in the summary are \sigma_S^2, \sigma_{S\times M}^2, \sigma_{S\times T}^2, and \sigma_E^2, stored respectively as sigma2_subject, sigma2_subject_method, sigma2_subject_time, and sigma2_error.

The repeated-measures ICC always uses \sigma_S^2 in the numerator only. The denominator uses the same time-averaging logic already used by the repeated-measures concordance backend, through two pair-specific factors \bar{\kappa}_g and \bar{\kappa}_e.

If time is absent, the implementation sets

\bar{\kappa}_g = 0, \qquad \bar{\kappa}_e = 1.

If time is present and the target is a single visit (Dmat_type %in% c("typical-visit", "weighted-sq")), the implementation leaves the time-varying terms unshrunk:

\bar{\kappa}_g = 1, \qquad \bar{\kappa}_e = 1.

If time is present and the target is time-averaged (Dmat_type %in% c("time-avg", "weighted-avg")), then for each observed subject-method unit with T distinct observed visits:

  • with equal weights,

    \kappa_g = \frac{1}{T}, \qquad \kappa_e = \frac{T + 2\sum_{h=1}^{T-1}(T-h)\rho^h}{T^2},

    with \kappa_e=\kappa_g when residuals are iid;

  • with normalized visit weights w_1,\ldots,w_T,

    \kappa_g = \sum_{t=1}^{T} w_t^2, \qquad \kappa_e = \sum_{t=1}^{T}\sum_{s=1}^{T} w_t w_s \rho^{|t-s|},

    with \kappa_e=\kappa_g when residuals are iid.

With unbalanced T, the implementation averages the per-unit \kappa values across the observations contributing to the pair and then clamps both \bar{\kappa}_g and \bar{\kappa}_e to [10^{-12},\,1] for numerical stability.

Repeated-measures ICC. Let \sigma_{S\times M,\mathrm{eff}}^2 denote the effective subject-by-method variance term, equal to \sigma_{S\times M}^2 when the subject-by-method random effect is included and 0 otherwise. Let \sigma_{S\times T,\mathrm{eff}}^2 denote the effective subject-by-time variance term, equal to \sigma_{S\times T}^2 when the subject-by-time random effect is included and 0 otherwise.

For type = "consistency", the reported ICC is

\mathrm{ICC}_{\mathrm{consistency}} \;=\; \frac{\sigma_S^2}{ \sigma_S^2 + \sigma_{S\times M,\mathrm{eff}}^2 + \bar{\kappa}_g\,\sigma_{S\times T,\mathrm{eff}}^2 + \bar{\kappa}_e\,\sigma_E^2}.

For type = "agreement", the denominator additionally includes S_B:

\mathrm{ICC}_{\mathrm{agreement}} \;=\; \frac{\sigma_S^2}{ \sigma_S^2 + \sigma_{S\times M,\mathrm{eff}}^2 + \bar{\kappa}_g\,\sigma_{S\times T,\mathrm{eff}}^2 + \bar{\kappa}_e\,\sigma_E^2 + S_B}.

This differs from repeated-measures concordance because ICC uses only \sigma_S^2 in the numerator, whereas the concordance numerator also includes the time-averaged subject-time term. Extra random-effect variances \{\sigma_{Z,j}^2\} from slope / slope_Z are estimated by the shared backend but are not included in the ICC denominator.

CIs / SEs (delta method for ICC). Confidence intervals are built from the same REML fit by a large-sample delta method. If ci_mode = "raw", a Wald interval is formed on the ICC scale,

\widehat{\mathrm{ICC}} \;\pm\; z_{1-\alpha/2}\, \widehat{\mathrm{se}}\{\widehat{\mathrm{ICC}}\},

and truncated to [0,1]. If ci_mode = "logit", the same Wald construction is applied on the logit scale and then back-transformed. If ci_mode = "auto", the backend selects between the raw-scale and logit-scale interval per estimate.

Choosing \rho for AR(1). When ar="ar1" and ar_rho = NA, \rho is estimated by profiling the REML log-likelihood at (\hat\beta,\hat G,\hat\sigma_E^2). With very few visits per subject, \rho can be weakly identified; consider sensitivity checks over a plausible range.

Value

A repeated-measures pairwise ICC object. Without confidence intervals the result is a symmetric matrix of class c("icc_rm_reml", "icc", "matrix"). With confidence intervals it is a list with est, lwr.ci, and upr.ci and class c("icc_rm_reml", "icc_ci", "icc"). Both carry the fitted variance-component matrices as attributes.

Notes on stability and performance

All per-subject solves are r\times r with r = 1 + n_m + n_t + q_Z, so cost scales with the number of subjects and the fixed-effects dimension rather than the total number of observations. Solvers use symmetric positive-definite paths with a small diagonal ridge and pseudo-inverse fallback, which helps for very small or unbalanced subsets and near-boundary estimates. For AR(1), observations are ordered by time within subject; NA time codes break the run, and gaps between factor levels are treated as regular steps.

Heteroscedastic slopes across Z columns are supported. Each Z column has its own variance component \sigma_{Z,j}^2, but cross-covariances among Z columns are set to zero.

Threading and BLAS guards

The C++ backend uses OpenMP loops while also forcing vendor BLAS libraries to run single-threaded so that overall CPU usage stays predictable. On OpenBLAS and Apple's Accelerate this is handled automatically. On Intel MKL builds the guard is disabled by default, but you can also opt out manually by setting MATRIXCORR_DISABLE_BLAS_GUARD=1 in the environment before loading the package.

References

Shrout PE, Fleiss JL (1979). Intraclass correlations: uses in assessing rater reliability. Psychological Bulletin, 86(2), 420-428.

McGraw KO, Wong SP (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1(1), 30-46.

See Also

icc(), ccc(), ccc_rm_reml(), ba(), ba_rm(), rmcorr()

Examples

set.seed(321)
n_id <- 20
n_time <- 3
id <- factor(rep(seq_len(n_id), each = 2 * n_time))
method <- factor(rep(rep(c("A", "B"), each = n_time), times = n_id))
time <- factor(rep(seq_len(n_time), times = 2 * n_id))

subj <- rnorm(n_id, sd = 1)[as.integer(id)]
subj_method <- rnorm(n_id * 2, sd = 0.25)
sm <- subj_method[(as.integer(id) - 1L) * 2L + as.integer(method)]
y <- subj + sm + 0.3 * (method == "B") + rnorm(length(id), sd = 0.35)

dat_rm <- data.frame(y = y, id = id, method = method, time = time)

fit_icc_rm <- icc_rm_reml(
  dat_rm,
  response = "y",
  subject = "id",
  method = "method",
  time = "time",
  type = "consistency"
)
print(fit_icc_rm)
summary(fit_icc_rm)


matrixCorr documentation built on April 18, 2026, 5:06 p.m.