icc_rm_reml: Repeated-Measures Intraclass Correlation via REML
In matrixCorr: Collection of Correlation and Association Estimators

icc_rm_reml

R Documentation

Repeated-Measures Intraclass Correlation via REML

Description

Computes pairwise repeated-measures intraclass correlation coefficients from long-format data using the same REML and Woodbury-identity backend used by the repeated-measures agreement models.

Usage

icc_rm_reml(
  data,
  response,
  subject,
  method = NULL,
  time = NULL,
  type = c("consistency", "agreement"),
  ci = FALSE,
  conf_level = 0.95,
  n_threads = getOption("matrixCorr.threads", 1L),
  ci_mode = c("auto", "raw", "logit"),
  verbose = FALSE,
  digits = 4,
  use_message = TRUE,
  interaction = FALSE,
  max_iter = 100,
  tol = 1e-06,
  Dmat = NULL,
  Dmat_type = c("time-avg", "typical-visit", "weighted-avg", "weighted-sq"),
  Dmat_weights = NULL,
  Dmat_rescale = TRUE,
  ar = c("none", "ar1"),
  ar_rho = NA_real_,
  slope = c("none", "subject", "method", "custom"),
  slope_var = NULL,
  slope_Z = NULL,
  drop_zero_cols = TRUE,
  vc_select = c("auto", "none"),
  vc_alpha = 0.05,
  vc_test_order = c("subj_time", "subj_method"),
  include_subj_method = NULL,
  include_subj_time = NULL,
  sb_zero_tol = 1e-10
)

## S3 method for class 'icc_rm_reml'
print(
  x,
  digits = 4,
  ci_digits = 4,
  n = NULL,
  topn = NULL,
  max_vars = NULL,
  width = NULL,
  show_ci = NULL,
  ...
)

## S3 method for class 'icc_rm_reml'
summary(
  object,
  digits = 4,
  ci_digits = 2,
  n = NULL,
  topn = NULL,
  max_vars = NULL,
  width = NULL,
  show_ci = NULL,
  ...
)

## S3 method for class 'summary.icc_rm_reml'
print(
  x,
  digits = NULL,
  n = NULL,
  topn = NULL,
  max_vars = NULL,
  width = NULL,
  show_ci = NULL,
  ...
)

Arguments

`data`	A data frame.
`response`	Character. Response variable name.
`subject`	Character. Subject ID variable name.
`method`	Character or `NULL`. Optional column name of method factor (added to fixed effects).
`time`	Character or `NULL`. Optional column name of time factor (added to fixed effects).
`type`	Character scalar; one of `c("consistency","agreement")`. For `"consistency"`, systematic fixed method differences are not penalized through `S_B`. For `"agreement"`, the fixed-effect dispersion term `S_B` is added to the denominator.
`ci`	Logical. If `TRUE`, return a CI container for the repeated ICC.
`conf_level`	Numeric in `(0,1)`. Confidence level when `ci = TRUE` (default `0.95`).
`n_threads`	Integer `\geq 1`. Number of OpenMP threads to use for computation. Defaults to `getOption("matrixCorr.threads", 1L)`.
`ci_mode`	Character scalar; one of `c("auto","raw","logit")`. Controls how confidence intervals are computed when `ci = TRUE`. If `"raw"`, a Wald CI is formed on the ICC scale and truncated to `[0,1]`. If `"logit"`, a Wald CI is computed on the `\mathrm{logit}(\mathrm{ICC})` scale and back-transformed to the original scale. If `"auto"` (default), the backend chooses per estimate between the raw-scale and logit-scale interval.
`verbose`	Logical. If `TRUE`, prints a structured summary of the fitted variance components and `S_B` for each fit. Default `FALSE`.
`digits`	Integer `(\ge 0)`. Number of decimal places to use in the printed summary when `verbose = TRUE`. Default `4`.
`use_message`	Logical. When `verbose = TRUE`, choose the printing mechanism, where `TRUE` uses `message()` (respects `sink()`, easily suppressible via `suppressMessages()`), whereas `FALSE` uses `cat()` to `stdout`. Default `TRUE`.
`interaction`	Logical. Include `method:time` interaction? (default `FALSE`).
`max_iter`	Integer. Maximum iterations for variance-component updates (default `100`).
`tol`	Numeric. Convergence tolerance on parameter change (default `1e-6`).
`Dmat`	Optional `n_t \times n_t` numeric matrix to weight/aggregate time-specific fixed biases in the `S_B` quadratic form. If supplied, it is used (after optional mass rescaling; see `Dmat_rescale`) whenever at least two present time levels exist; otherwise it is ignored. If `Dmat` is `NULL`, a canonical kernel `D_m` is constructed from `Dmat_type` and `Dmat_weights` (see below). `Dmat` should be symmetric positive semidefinite; small asymmetries are symmetrized internally.
`Dmat_type`	Character, one of `c("time-avg","typical-visit", "weighted-avg","weighted-sq")`. Only used when `Dmat = NULL`. It selects the aggregation target for time-specific fixed biases in `S_B`. Options are: `"time-avg"`: square of the time-averaged bias, `D_m=(1/n_t)\,11^\top`. `"typical-visit"`: average of squared per-time biases, `D_m=I_{n_t}`. `"weighted-avg"`: square of a weighted average, `D_m=n_t\,w\,w^\top` with `\sum w=1`. `"weighted-sq"`: weighted average of squared biases, `D_m=n_t\,\mathrm{diag}(w)` with `\sum w=1`. Pick `"time-avg"` for ICC targeting the time-averaged measurement; pick `"typical-visit"` for ICC targeting a randomly sampled visit (typical occasion). Default `"time-avg"`.
`Dmat_weights`	Optional numeric weights `w` used when `Dmat_type %in% c("weighted-avg","weighted-sq")`. Must be nonnegative and finite. If `names(w)` are provided, they should match the full time levels in `data`; they are aligned to the present time subset per fit. If unnamed, the length must equal the number of present time levels. In all cases `w` is internally normalized to sum to 1.
`Dmat_rescale`	Logical. When `TRUE` (default), the supplied/built `D_m` is rescaled to satisfy the simple mass rule `1^\top D_m 1 = n_t`. This keeps the `S_B` denominator invariant and harmonizes with the time-averaging factors used for the variance terms.
`ar`	Character. Residual correlation structure: `"none"` (iid) or `"ar1"` for subject-level AR(1) correlation within contiguous time runs. Default `c("none")`.
`ar_rho`	Numeric in `(-0.999,\,0.999)` or `NA`. If `ar = "ar1"` and `ar_rho` is finite, it is treated as fixed. If `ar = "ar1"` and `ar_rho = NA`, `\rho` is estimated by profiling a 1-D objective (REML when available; an approximation otherwise). Default `NA_real_`.
`slope`	Character. Optional extra random-effect design `Z`. With `"subject"` a single random slope is added (one column in `Z`); with `"method"` one column per method level is added; with `"custom"` you provide `slope_Z` directly. Default `c("none","subject","method","custom")`.
`slope_var`	For `slope %in% c("subject","method")`, a character string giving the name of a column in `data` used as the slope regressor (e.g., centered time). It is looked up inside `data`; do not pass the vector itself. NAs are treated as zeros in `Z`.
`slope_Z`	For `slope = "custom"`, a numeric matrix with `n` rows (same order as `data`) providing the full extra random-effect design `Z`. Each column of `slope_Z` has its own variance component `\sigma_{Z,j}^2`; columns are treated as uncorrelated (diagonal block in `G`). Ignored otherwise.
`drop_zero_cols`	Logical. When `slope = "method"`, drop all-zero columns of `Z` after subsetting (useful in pairwise fits). Default `TRUE`.
`vc_select`	Character scalar; one of `c("auto","none")`. Controls how the subject by method `\sigma^2_{A\times M}` ("subj_method") and subject by time `\sigma^2_{A\times T}` ("subj_time") variance components are included. If `"auto"` (default), the function performs boundary-aware REML likelihood-ratio tests (LRTs; null on the boundary at zero with a half-`\chi^2_1` reference) to decide whether to retain each component, in the order given by `vc_test_order`. If `"none"`, no testing is done and inclusion is taken from `include_subj_method`/`include_subj_time` (or, if `NULL`, from the mere presence of the corresponding factor in the design). In pairwise fits, the decision is made independently for each method pair.
`vc_alpha`	Numeric scalar in `(0,1)`; default `0.05`. Per-component significance level for the boundary-aware REML LRTs used when `vc_select = "auto"`. The tests are one-sided for variance components on the boundary and are not multiplicity-adjusted.
`vc_test_order`	Character vector (length 2) with a permutation of `c("subj_time","subj_method")`; default `c("subj_time","subj_method")`. Specifies the order in which the two variance components are tested when `vc_select = "auto"`. The component tested first may be dropped before testing the second. If a factor is absent in the design (e.g., no time factor so "subj_time" is undefined), the corresponding test is skipped.
`include_subj_method`, `include_subj_time`	Logical scalars or `NULL`. When `vc_select = "none"`, these control whether the `\sigma^2_{A\times M}` ("subj_method") and `\sigma^2_{A\times T}` ("subj_time") random effects are included (`TRUE`) or excluded (`FALSE`) in the model. If `NULL` (default), inclusion defaults to the presence of the corresponding factor in the data (i.e., at least two method/time levels). When `vc_select = "auto"`, these arguments are ignored (automatic selection is used instead).
`sb_zero_tol`	Non-negative numeric scalar; default `1e-10`. Numerical threshold for the fixed-effect dispersion term `S_B`. After computing `\widehat{S_B}` and its delta-method variance, if `\widehat{S_B} \le` `sb_zero_tol` or non-finite, the procedure treats `S_B` as fixed at zero in the delta step.
`x`	For `print()`, an object returned by `icc_rm_reml()` or `summary.icc_rm_reml()`.
`ci_digits`	Integer; number of digits for confidence interval bounds in printed method summaries.
`n`	Optional row threshold for compact preview output.
`topn`	Optional number of leading/trailing rows to show when truncated.
`max_vars`	Optional maximum number of visible columns; `NULL` derives this from console width.
`width`	Optional display width; defaults to `getOption("width")`.
`show_ci`	One of `"yes"` or `"no"`.
`...`	Passed to the underlying display helpers.
`object`	For `summary()`, an object returned by `icc_rm_reml()`.

Details

The repeated-measures model is fit separately for each method pair using the same REML and Woodbury-identity backend used by the repeated-measures concordance estimator.

Kernel D_m and the fixed-bias term S_B. Let d = L^\top \hat\beta stack the within-time, pairwise method differences, grouped by time. The symmetric positive semidefinite kernel D_m \succeq 0 selects which functional of the bias profile is targeted by S_B. Internally, the code rescales any supplied or constructed D_m to satisfy 1^\top D_m 1 = n_t for stability and comparability.

Dmat_type = "time-avg" targets the square of the time-averaged bias.
Dmat_type = "typical-visit" targets the average of squared per-time biases.
Dmat_type = "weighted-avg" targets the square of a weighted time average.
Dmat_type = "weighted-sq" targets the weighted average of squared per-time biases.

As in the repeated-measures concordance implementation, S_B is the fitted fixed-effect dispersion term induced by D_m. It enters the denominator only for type = "agreement".

Time-averaging and shrinkage factors. The fitted variance components reported in the summary are \sigma_S^2, \sigma_{S\times M}^2, \sigma_{S\times T}^2, and \sigma_E^2, stored respectively as sigma2_subject, sigma2_subject_method, sigma2_subject_time, and sigma2_error.

The repeated-measures ICC always uses \sigma_S^2 in the numerator only. The denominator uses the same time-averaging logic already used by the repeated-measures concordance backend, through two pair-specific factors \bar{\kappa}_g and \bar{\kappa}_e.

If time is absent, the implementation sets

\bar{\kappa}_g = 0, \qquad \bar{\kappa}_e = 1.

If time is present and the target is a single visit (Dmat_type %in% c("typical-visit", "weighted-sq")), the implementation leaves the time-varying terms unshrunk:

\bar{\kappa}_g = 1, \qquad \bar{\kappa}_e = 1.

If time is present and the target is time-averaged (Dmat_type %in% c("time-avg", "weighted-avg")), then for each observed subject-method unit with T distinct observed visits:

with equal weights,

\kappa_g = \frac{1}{T}, \qquad \kappa_e = \frac{T + 2\sum_{h=1}^{T-1}(T-h)\rho^h}{T^2},

with \kappa_e=\kappa_g when residuals are iid;
with normalized visit weights w_1,\ldots,w_T,

\kappa_g = \sum_{t=1}^{T} w_t^2, \qquad \kappa_e = \sum_{t=1}^{T}\sum_{s=1}^{T} w_t w_s \rho^{|t-s|},

with \kappa_e=\kappa_g when residuals are iid.

With unbalanced T, the implementation averages the per-unit \kappa values across the observations contributing to the pair and then clamps both \bar{\kappa}_g and \bar{\kappa}_e to [10^{-12},\,1] for numerical stability.

Repeated-measures ICC. Let \sigma_{S\times M,\mathrm{eff}}^2 denote the effective subject-by-method variance term, equal to \sigma_{S\times M}^2 when the subject-by-method random effect is included and 0 otherwise. Let \sigma_{S\times T,\mathrm{eff}}^2 denote the effective subject-by-time variance term, equal to \sigma_{S\times T}^2 when the subject-by-time random effect is included and 0 otherwise.

For type = "consistency", the reported ICC is

\mathrm{ICC}_{\mathrm{consistency}} \;=\; \frac{\sigma_S^2}{ \sigma_S^2 + \sigma_{S\times M,\mathrm{eff}}^2 + \bar{\kappa}_g\,\sigma_{S\times T,\mathrm{eff}}^2 + \bar{\kappa}_e\,\sigma_E^2}.

For type = "agreement", the denominator additionally includes S_B:

\mathrm{ICC}_{\mathrm{agreement}} \;=\; \frac{\sigma_S^2}{ \sigma_S^2 + \sigma_{S\times M,\mathrm{eff}}^2 + \bar{\kappa}_g\,\sigma_{S\times T,\mathrm{eff}}^2 + \bar{\kappa}_e\,\sigma_E^2 + S_B}.

This differs from repeated-measures concordance because ICC uses only \sigma_S^2 in the numerator, whereas the concordance numerator also includes the time-averaged subject-time term. Extra random-effect variances \{\sigma_{Z,j}^2\} from slope / slope_Z are estimated by the shared backend but are not included in the ICC denominator.

CIs / SEs (delta method for ICC). Confidence intervals are built from the same REML fit by a large-sample delta method. If ci_mode = "raw", a Wald interval is formed on the ICC scale,

\widehat{\mathrm{ICC}} \;\pm\; z_{1-\alpha/2}\, \widehat{\mathrm{se}}\{\widehat{\mathrm{ICC}}\},

and truncated to [0,1]. If ci_mode = "logit", the same Wald construction is applied on the logit scale and then back-transformed. If ci_mode = "auto", the backend selects between the raw-scale and logit-scale interval per estimate.

Choosing \rho for AR(1). When ar="ar1" and ar_rho = NA, \rho is estimated by profiling the REML log-likelihood at (\hat\beta,\hat G,\hat\sigma_E^2). With very few visits per subject, \rho can be weakly identified; consider sensitivity checks over a plausible range.

Value

A repeated-measures pairwise ICC object. Without confidence intervals the result is a symmetric matrix of class c("icc_rm_reml", "icc", "matrix"). With confidence intervals it is a list with est, lwr.ci, and upr.ci and class c("icc_rm_reml", "icc_ci", "icc"). Both carry the fitted variance-component matrices as attributes.

Notes on stability and performance

All per-subject solves are r\times r with r = 1 + n_m + n_t + q_Z, so cost scales with the number of subjects and the fixed-effects dimension rather than the total number of observations. Solvers use symmetric positive-definite paths with a small diagonal ridge and pseudo-inverse fallback, which helps for very small or unbalanced subsets and near-boundary estimates. For AR(1), observations are ordered by time within subject; NA time codes break the run, and gaps between factor levels are treated as regular steps.

Heteroscedastic slopes across Z columns are supported. Each Z column has its own variance component \sigma_{Z,j}^2, but cross-covariances among Z columns are set to zero.

Threading and BLAS guards

The C++ backend uses OpenMP loops while also forcing vendor BLAS libraries to run single-threaded so that overall CPU usage stays predictable. On OpenBLAS and Apple's Accelerate this is handled automatically. On Intel MKL builds the guard is disabled by default, but you can also opt out manually by setting MATRIXCORR_DISABLE_BLAS_GUARD=1 in the environment before loading the package.

References

Shrout PE, Fleiss JL (1979). Intraclass correlations: uses in assessing rater reliability. Psychological Bulletin, 86(2), 420-428.

McGraw KO, Wong SP (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1(1), 30-46.

Examples

set.seed(321)
n_id <- 20
n_time <- 3
id <- factor(rep(seq_len(n_id), each = 2 * n_time))
method <- factor(rep(rep(c("A", "B"), each = n_time), times = n_id))
time <- factor(rep(seq_len(n_time), times = 2 * n_id))

subj <- rnorm(n_id, sd = 1)[as.integer(id)]
subj_method <- rnorm(n_id * 2, sd = 0.25)
sm <- subj_method[(as.integer(id) - 1L) * 2L + as.integer(method)]
y <- subj + sm + 0.3 * (method == "B") + rnorm(length(id), sd = 0.35)

dat_rm <- data.frame(y = y, id = id, method = method, time = time)

fit_icc_rm <- icc_rm_reml(
  dat_rm,
  response = "y",
  subject = "id",
  method = "method",
  time = "time",
  type = "consistency"
)
print(fit_icc_rm)
summary(fit_icc_rm)

matrixCorr documentation built on April 18, 2026, 5:06 p.m.