ccc_rm_reml {matrixCorr}	R Documentation
Compute Lin's Concordance Correlation Coefficient (CCC) from a linear
mixed-effects model fitted by REML. The fixed-effects part can include
method and/or time (optionally their interaction), with a
subject-specific random intercept to capture between-subject variation.
Large n \times n inversions are avoided by solving small per-subject
systems.
Assumption: time levels are treated as regular, equally spaced
visits indexed by their order within subject. The AR(1) residual model is
in discrete time on the visit index (not calendar time). NA time codes
break the serial run. Gaps in the factor levels are ignored (adjacent
observed visits are treated as lag-1).
ccc_rm_reml(
data,
response,
subject,
method = NULL,
time = NULL,
ci = FALSE,
conf_level = 0.95,
n_threads = getOption("matrixCorr.threads", 1L),
ci_mode = c("auto", "raw", "logit"),
verbose = FALSE,
digits = 4,
use_message = TRUE,
interaction = FALSE,
max_iter = 100,
tol = 1e-06,
Dmat = NULL,
Dmat_type = c("time-avg", "typical-visit", "weighted-avg", "weighted-sq"),
Dmat_weights = NULL,
Dmat_rescale = TRUE,
ar = c("none", "ar1"),
ar_rho = NA_real_,
slope = c("none", "subject", "method", "custom"),
slope_var = NULL,
slope_Z = NULL,
drop_zero_cols = TRUE,
vc_select = c("auto", "none"),
vc_alpha = 0.05,
vc_test_order = c("subj_time", "subj_method"),
include_subj_method = NULL,
include_subj_time = NULL,
sb_zero_tol = 1e-10
)
data
A data frame.
response
Character. Response variable name.
subject
Character. Subject ID variable name.
method
Character or NULL. Name of the method variable; NULL fits a model without a method factor.
time
Character or NULL. Name of the time variable; NULL fits a model without a time factor.
ci
Logical. If TRUE, compute a delta-method standard error and confidence interval for the CCC (default FALSE).
conf_level
Numeric in (0, 1). Confidence level for the interval (default 0.95).
n_threads
Integer >= 1. Number of threads used by the C++ backend (default getOption("matrixCorr.threads", 1L)).
ci_mode
Character scalar; one of "auto", "raw", "logit". Scale on which the confidence interval is constructed.
verbose
Logical. If TRUE, print fitting progress and variance-component selection details.
digits
Integer >= 1. Number of digits used when printing verbose output (default 4).
use_message
Logical. When TRUE, verbose output is emitted via message().
interaction
Logical. Include the method:time interaction in the fixed effects (default FALSE).
max_iter
Integer. Maximum iterations for variance-component updates (default 100).
tol
Numeric. Convergence tolerance on parameter change (default 1e-06).
Dmat
Optional user-supplied time kernel D_m; overrides the kernel implied by Dmat_type.
Dmat_type
Character, one of "time-avg", "typical-visit", "weighted-avg", "weighted-sq". Pick the functional of the bias profile that S_B should target (see Details).
Dmat_weights
Optional numeric weights (one per time level) for the weighted kernels.
Dmat_rescale
Logical. When TRUE, rescale D_m so that 1' D_m 1 = n_t.
ar
Character. Residual correlation structure: "none" (iid) or "ar1".
ar_rho
Numeric in (-1, 1). AR(1) correlation; if NA (default), rho is estimated by profiling the REML log-likelihood.
slope
Character. Optional extra random-effect design: "none", "subject", "method", or "custom".
slope_var
For slope = "subject" or slope = "method": character name of the slope regressor column.
slope_Z
For slope = "custom": numeric matrix supplying the extra design Z in full.
drop_zero_cols
Logical. When TRUE, drop all-zero Z columns after subsetting.
vc_select
Character scalar; one of "auto" (data-driven selection of the optional variance components) or "none".
vc_alpha
Numeric scalar in (0, 1). Significance level used by the variance-component selection tests (default 0.05).
vc_test_order
Character vector (length 2) with a permutation of c("subj_time", "subj_method"), giving the order in which the optional components are tested.
include_subj_method, include_subj_time
Logical scalars or NULL. Force inclusion (TRUE) or exclusion (FALSE) of the subject-by-method / subject-by-time components; NULL defers to vc_select.
sb_zero_tol
Non-negative numeric scalar; default 1e-10. Threshold below which S_B is treated as zero.
For measurement y_{ij} on subject i under fixed
levels (method, time), we fit
y = X\beta + Zu + \varepsilon,\qquad
u \sim N(0,\,G),\ \varepsilon \sim N(0,\,R).
Notation: m subjects, n=\sum_i n_i total rows;
nm method levels; nt time levels; q_Z extra
random-slope columns (if any); r=1+nm+nt (or 1+nm+nt+q_Z with slopes).
Here Z is the subject-structured random-effects design and G is
block-diagonal at the subject level with the following per-subject
parameterisation. Specifically,
one random intercept with variance \sigma_A^2;
optionally, method deviations (one column per method level)
with a common variance \sigma_{A\times M}^2 and zero
covariances across levels (i.e., multiple of an identity);
optionally, time deviations (one column per time level)
with a common variance \sigma_{A\times T}^2 and zero
covariances across levels;
optionally, an extra random effect aligned with Z
(random slope), where each column has its own variance
\sigma_{Z,j}^2 and columns are uncorrelated.
The fixed-effects design is ~ 1 + method + time and, if
interaction=TRUE, + method:time.
Residual correlation R (regular, equally spaced time).
Write R_i=\sigma_E^2\,C_i(\rho). With ar="none", C_i=I.
With ar="ar1", within-subject residuals follow a discrete AR(1)
process along the visit index after sorting by increasing time level. Ties
retain input order, and any NA time code breaks the series so each
contiguous block of non-NA times forms a run. The correlation
between adjacent observed visits in a run is \rho; we do not use
calendar-time gaps. Internally we work with the precision of the AR(1)
correlation: for a run of length L\ge 2, the tridiagonal inverse has
(C^{-1})_{11}=(C^{-1})_{LL}=\frac{1}{1-\rho^2},\quad
(C^{-1})_{tt}=\frac{1+\rho^2}{1-\rho^2}\ (2\le t\le L-1),\quad
(C^{-1})_{t,t+1}=(C^{-1})_{t+1,t}=\frac{-\rho}{1-\rho^2}.
The working inverse is \,R_i^{-1}=\sigma_E^{-2}\,C_i(\rho)^{-1}.
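As a quick numerical check of the tridiagonal inverse above, the closed-form entries can be compared against solve() applied to the AR(1) correlation matrix. This is a standalone sketch in base R, not package code:

```r
## Closed-form AR(1) precision entries vs. a direct inverse (illustrative)
rho <- 0.6; L <- 5
C <- rho^abs(outer(seq_len(L), seq_len(L), "-"))  # AR(1) correlation, C_{st} = rho^{|s-t|}
Cinv <- solve(C)
f <- 1 / (1 - rho^2)
all.equal(Cinv[1, 1], f)                # corner entry: 1/(1 - rho^2)
all.equal(Cinv[2, 2], (1 + rho^2) * f)  # interior entry: (1 + rho^2)/(1 - rho^2)
all.equal(Cinv[1, 2], -rho * f)         # off-diagonal: -rho/(1 - rho^2)
```

All three comparisons return TRUE up to numerical tolerance.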
Per-subject Woodbury system. For subject i with n_i
rows, define the per-subject random-effects design U_i (columns:
intercept, method indicators, time indicators; dimension
\,r=1+nm+nt\,). The core never forms
V_i = R_i + U_i G U_i^\top explicitly. Instead,
M_i \;=\; G^{-1} \;+\; U_i^\top R_i^{-1} U_i,
and accumulates GLS blocks via rank-r corrections using
\,V_i^{-1} = R_i^{-1}-R_i^{-1}U_i M_i^{-1}U_i^\top R_i^{-1}\,:
X^\top V^{-1} X \;=\; \sum_i \Big[
X_i^\top R_i^{-1}X_i \;-\; (X_i^\top R_i^{-1}U_i)\,
M_i^{-1}\,(U_i^\top R_i^{-1}X_i) \Big],
X^\top V^{-1} y \;=\; \sum_i \Big[
X_i^\top R_i^{-1}y_i \;-\; (X_i^\top R_i^{-1}U_i)\,M_i^{-1}\,
(U_i^\top R_i^{-1}y_i) \Big].
Because G^{-1} is diagonal with positive entries, each M_i is
symmetric positive definite; solves/inversions use symmetric-PD routines with
a small diagonal ridge and a pseudo-inverse if needed.
Random-slope Z.
Besides U_i, the function can include an extra design Z_i.
slope="subject": Z has one column (the regressor in
slope_var); Z_{i} is the subject-i block, with its own
variance \sigma_{Z,1}^2.
slope="method": Z has one column per method level;
row t uses the slope regressor if its method equals level \ell,
otherwise 0; all-zero columns can be dropped via
drop_zero_cols=TRUE after subsetting. Each column has its own
variance \sigma_{Z,\ell}^2.
slope="custom": Z is provided fully via slope_Z.
Each column is an independent random effect with its own variance
\sigma_{Z,j}^2; cross-covariances among columns are set to 0.
Computations simply augment \tilde U_i=[U_i\ Z_i] and the corresponding
inverse-variance block. The EM updates then include, for each column j=1,\ldots,q_Z,
\sigma_{Z,j}^{2\,(new)} \;=\; \frac{1}{m}
\sum_{i=1}^m \Big( b_{i,\text{extra},j}^2 +
(M_i^{-1})_{\text{extra},jj} \Big)
\quad (\text{if } q_Z>0).
Interpretation: the \sigma_{Z,j}^2 represent additional within-subject
variability explained by the slope regressor(s) in column j and are not part of the CCC
denominator (agreement across methods/time).
EM-style variance-component updates. With current \hat\beta,
form residuals r_i = y_i - X_i\hat\beta. The BLUPs and conditional
covariances are
b_i \;=\; M_i^{-1}\,(U_i^\top R_i^{-1} r_i), \qquad
\mathrm{Var}(b_i\mid y) \;=\; M_i^{-1}.
Let e_i=r_i-U_i b_i. Expected squares then yield closed-form updates:
\sigma_A^{2\,(new)} \;=\; \frac{1}{m}\sum_i \Big( b_{i,0}^2 +
(M_i^{-1})_{00} \Big),
\sigma_{A\times M}^{2\,(new)} \;=\; \frac{1}{m\,nm}
\sum_i \sum_{\ell=1}^{nm}\!\Big( b_{i,\ell}^2 +
(M_i^{-1})_{\ell\ell} \Big)
\quad (\text{if } nm>0),
\sigma_{A\times T}^{2\,(new)} \;=\; \frac{1}{m\,nt}
\sum_i \sum_{t=1}^{nt}
\Big( b_{i,t}^2 + (M_i^{-1})_{tt} \Big)
\quad (\text{if } nt>0),
\sigma_E^{2\,(new)} \;=\; \frac{1}{n} \sum_i
\Big( e_i^\top C_i(\rho)^{-1} e_i \;+\;
\mathrm{tr}\!\big(M_i^{-1}U_i^\top C_i(\rho)^{-1} U_i\big) \Big),
together with the per-column update for \sigma_{Z,j}^2 given above.
Iterate until the \ell_1 change across components is <tol
or max_iter is reached.
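The update cycle can be illustrated for the simplest case, a single random intercept with iid residuals and known \beta = 0, where M_i = 1/\sigma_A^2 + n_i/\sigma_E^2 is scalar. This is a minimal sketch of the same EM logic, not the package internals:

```r
## One-component EM sketch: random intercept only, iid residuals, beta = 0
set.seed(42)
m <- 200; n_i <- 5
u  <- rnorm(m, 0, sqrt(0.6))                              # true sigma_A^2 = 0.6
y  <- rep(u, each = n_i) + rnorm(m * n_i, 0, sqrt(0.4))   # true sigma_E^2 = 0.4
id <- rep(seq_len(m), each = n_i)
sigA2 <- 1; sigE2 <- 1                                    # crude starting values
for (it in 1:50) {
  Minv <- 1 / (1 / sigA2 + n_i / sigE2)                   # scalar M_i^{-1}
  b    <- Minv * tapply(y, id, sum) / sigE2               # BLUP b_i
  e2   <- tapply((y - b[id])^2, id, sum)                  # sum of squared e_i
  sigA2 <- mean(b^2 + Minv)                               # intercept update
  sigE2 <- sum(e2 + Minv * n_i) / (m * n_i)               # residual update
}
c(sigA2 = sigA2, sigE2 = sigE2)   # estimates near the simulated values
```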
Fixed-effect dispersion S_B: choosing the time-kernel D_m.
Let d = L^\top \hat\beta stack the within-time, pairwise method differences,
grouped by time as d=(d_1^\top,\ldots,d_{n_t}^\top)^\top with
d_t \in \mathbb{R}^{P} and P = n_m(n_m-1). The symmetric
positive semidefinite kernel D_m \succeq 0 selects which functional of the
bias profile t \mapsto d_t is targeted by S_B. Internally, the code
rescales any supplied/built D_m to satisfy 1^\top D_m 1 = n_t for
stability and comparability.
Dmat_type = "time-avg" (square of the time-averaged bias).
Let
w \;=\; \frac{1}{n_t}\,\mathbf{1}_{n_t}, \qquad
D_m \;\propto\; I_P \otimes (w\,w^\top),
so that
d^\top D_m d \;\propto\; \sum_{p=1}^{P}
\left( \frac{1}{n_t}\sum_{t=1}^{n_t} d_{t,p} \right)^{\!2}.
Methods have equal \textit{time-averaged}
means within subject, i.e. \sum_{t=1}^{n_t} d_{t,p}/n_t = 0 for all
p. Appropriate when decisions depend on an average over time and
opposite-signed biases are allowed to cancel.
Dmat_type = "typical-visit" (average of squared per-time biases).
With equal visit probability, take
D_m \;\propto\; I_P \otimes
\mathrm{diag}\!\Big(\tfrac{1}{n_t},\ldots,\tfrac{1}{n_t}\Big),
yielding
d^\top D_m d \;\propto\; \frac{1}{n_t}
\sum_{t=1}^{n_t}\sum_{p=1}^{P} d_{t,p}^{\,2}.
Methods agree on a \textit{typical}
occasion drawn uniformly from the visit set. Use when each visit matters
on its own; alternating signs d_{t,p} do not cancel.
Dmat_type = "weighted-avg" (square of a weighted time average).
For user weights a=(a_1,\ldots,a_{n_t})^\top with a_t \ge 0, set
w \;=\; \frac{a}{\sum_{t=1}^{n_t} a_t}, \qquad
D_m \;\propto\; I_P \otimes (w\,w^\top),
so that
d^\top D_m d \;\propto\; \sum_{p=1}^{P}
\left( \sum_{t=1}^{n_t} w_t\, d_{t,p} \right)^{\!2}.
Methods have equal \textit{weighted
time-averaged} means, i.e. \sum_{t=1}^{n_t} w_t\, d_{t,p} = 0 for all
p. Use when some visits (e.g., baseline/harvest) are a priori more
influential; opposite-signed biases may cancel according to w.
Dmat_type = "weighted-sq" (weighted average of squared per-time biases).
With the same weights w, take
D_m \;\propto\; I_P \otimes \mathrm{diag}(w_1,\ldots,w_{n_t}),
giving
d^\top D_m d \;\propto\; \sum_{t=1}^{n_t} w_t
\sum_{p=1}^{P} d_{t,p}^{\,2}.
Methods agree at visits sampled with
probabilities \{w_t\}, counting each visit's discrepancy on its own.
Use when per-visit agreement is required but some visits should be
emphasised more than others.
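The four kernels are easy to build explicitly for one method pair. The sketch below (plain R; the weights a are hypothetical) shows the rescaling 1' D_m 1 = n_t and how an alternating bias cancels only under the averaging kernels:

```r
## The four time kernels for nt = 4 visits, each rescaled so sum(Dm) == nt
nt <- 4
a  <- c(2, 1, 1, 2)                      # hypothetical visit weights
w_eq <- rep(1 / nt, nt); w_a <- a / sum(a)
rescale <- function(D) D * nt / sum(D)
Dm <- list(
  "time-avg"      = rescale(tcrossprod(w_eq)),  # (mean over time)^2
  "typical-visit" = rescale(diag(w_eq)),        # mean of squared per-time biases
  "weighted-avg"  = rescale(tcrossprod(w_a)),   # (weighted mean)^2
  "weighted-sq"   = rescale(diag(w_a))          # weighted mean of squares
)
sapply(Dm, sum)                          # all equal nt = 4
## A constant bias d_t = 0.2 gives the same quadratic form under all four
## kernels; an alternating bias cancels only under the averaging kernels:
d_alt <- c(0.2, -0.2, 0.2, -0.2)
sapply(Dm, function(D) drop(t(d_alt) %*% D %*% d_alt))
# time-avg and weighted-avg: 0; typical-visit and weighted-sq: 0.16
```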
Time-averaging for CCC (regular visits).
The reported CCC targets agreement of the time-averaged measurements
per method within subject by default (Dmat_type="time-avg"). Averaging over T
non-NA visits shrinks time-varying components by
\kappa_T^{(g)} \;=\; 1/T, \qquad
\kappa_T^{(e)} \;=\; \{T + 2\sum_{k=1}^{T-1}(T-k)\rho^k\}/T^2,
with \kappa_T^{(e)}=1/T when residuals are i.i.d. With unbalanced T, the
implementation averages the per-(subject,method) \kappa values across the
pairs contributing to CCC and then clamps \kappa_T^{(e)} to
[10^{-12},\,1] for numerical stability. Choosing
Dmat_type="typical-visit" makes S_B match the interpretation of a
randomly sampled occasion instead.
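The \kappa_T^{(e)} expression is simply the variance shrinkage of a T-visit average under AR(1) correlation, i.e. (1' C 1)/T^2 for C_{st} = \rho^{|s-t|}, which can be confirmed directly (standalone sketch):

```r
## kappa_T^{(e)} equals the shrinkage factor of a T-visit mean under AR(1)
kappaE <- function(Tn, rho) {
  k <- seq_len(max(Tn - 1, 0))
  (Tn + 2 * sum((Tn - k) * rho^k)) / Tn^2
}
Tn <- 6; rho <- 0.4
C <- rho^abs(outer(seq_len(Tn), seq_len(Tn), "-"))
all.equal(kappaE(Tn, rho), sum(C) / Tn^2)   # TRUE: matches (1' C 1)/T^2
kappaE(Tn, 0) == 1 / Tn                     # iid case collapses to 1/T
```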
Concordance correlation coefficient. The CCC used is
\mathrm{CCC} \;=\;
\frac{\sigma_A^2 + \kappa_T^{(g)}\,\sigma_{A\times T}^2}
{\sigma_A^2 + \sigma_{A\times M}^2 +
\kappa_T^{(g)}\,\sigma_{A\times T}^2 + S_B +
\kappa_T^{(e)}\,\sigma_E^2}.
Special cases: with no method factor, S_B=\sigma_{A\times M}^2=0; with
a single time level, \sigma_{A\times T}^2=0 and both \kappa-factors equal 1
(no shrinkage). When \rho=0, \kappa_T^{(e)}=\kappa_T^{(g)}=1/T. The extra
random-effect variances \{\sigma_{Z,j}^2\} (if used) are not included.
CIs / SEs (delta method for CCC). Let
\theta \;=\; \big(\sigma_A^2,\ \sigma_{A\times M}^2,\
\sigma_{A\times T}^2,\ \sigma_E^2,\ S_B\big)^\top,
and write \mathrm{CCC}(\theta)=N/D with
N=\sigma_A^2+\kappa_T^{(g)}\sigma_{A\times T}^2 and
D=\sigma_A^2+\sigma_{A\times M}^2+\kappa_T^{(g)}\sigma_{A\times T}^2+S_B+\kappa_T^{(e)}\sigma_E^2.
The gradient components are
\frac{\partial\,\mathrm{CCC}}{\partial \sigma_A^2}
\;=\; \frac{\sigma_{A\times M}^2 + S_B + \kappa_T^{(e)}\sigma_E^2}{D^2},
\frac{\partial\,\mathrm{CCC}}{\partial \sigma_{A\times M}^2}
\;=\; -\,\frac{N}{D^2}, \qquad
\frac{\partial\,\mathrm{CCC}}{\partial \sigma_{A\times T}^2}
\;=\; \frac{\kappa_T^{(g)}\big(\sigma_{A\times M}^2 + S_B +
\kappa_T^{(e)}\sigma_E^2\big)}{D^2},
\frac{\partial\,\mathrm{CCC}}{\partial \sigma_E^2}
\;=\; -\,\frac{\kappa_T^{(e)}\,N}{D^2}, \qquad
\frac{\partial\,\mathrm{CCC}}{\partial S_B}
\;=\; -\,\frac{N}{D^2}.
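These gradient expressions can be checked against central finite differences; the sketch below uses arbitrary \theta and \kappa values:

```r
## Delta-method gradient of CCC(theta) vs. central finite differences
## theta = (sigma_A^2, sigma_AxM^2, sigma_AxT^2, sigma_E^2, S_B)
ccc <- function(th, kg, ke) {
  N <- th[1] + kg * th[3]
  D <- th[1] + th[2] + kg * th[3] + th[5] + ke * th[4]
  N / D
}
grad <- function(th, kg, ke) {
  N <- th[1] + kg * th[3]
  D <- th[1] + th[2] + kg * th[3] + th[5] + ke * th[4]
  c((th[2] + th[5] + ke * th[4]) / D^2,        # d/d sigma_A^2
    -N / D^2,                                  # d/d sigma_AxM^2
    kg * (th[2] + th[5] + ke * th[4]) / D^2,   # d/d sigma_AxT^2
    -ke * N / D^2,                             # d/d sigma_E^2
    -N / D^2)                                  # d/d S_B
}
th <- c(0.8, 0.3, 0.4, 0.5, 0.04); kg <- 0.25; ke <- 0.4
num <- sapply(seq_along(th), function(j) {
  h <- 1e-6; e <- replace(numeric(5), j, h)
  (ccc(th + e, kg, ke) - ccc(th - e, kg, ke)) / (2 * h)
})
max(abs(num - grad(th, kg, ke)))   # near zero
```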
Estimating \mathrm{Var}(\hat\theta).
The EM updates write each variance component as an average of per-subject
quantities. For subject i,
t_{A,i} \;=\; b_{i,0}^2 + (M_i^{-1})_{00},\qquad
t_{M,i} \;=\; \frac{1}{nm}\sum_{\ell=1}^{nm}
\Big(b_{i,\ell}^2 + (M_i^{-1})_{\ell\ell}\Big),
t_{T,i} \;=\; \frac{1}{nt}\sum_{j=1}^{nt}
\Big(b_{i,j}^2 + (M_i^{-1})_{jj}\Big),\qquad
s_i \;=\; \frac{e_i^\top C_i(\rho)^{-1} e_i +
\mathrm{tr}\!\big(M_i^{-1}U_i^\top C_i(\rho)^{-1} U_i\big)}{n_i},
where b_i = M_i^{-1}(U_i^\top R_i^{-1} r_i) and
M_i = G^{-1} + U_i^\top R_i^{-1} U_i.
With m subjects, form the empirical covariance of the stacked
subject vectors and scale by m to approximate the covariance of the
means:
\widehat{\mathrm{Cov}}\!\left(
\begin{bmatrix} t_{A,\cdot} \\ t_{M,\cdot} \\ t_{T,\cdot} \end{bmatrix}
\right)
\;\approx\; \frac{1}{m}\,
\mathrm{Cov}_i\!\left(
\begin{bmatrix} t_{A,i} \\ t_{M,i} \\ t_{T,i} \end{bmatrix}\right).
(Drop rows/columns as needed when nm==0 or nt==0.)
The residual variance estimator is a weighted mean
\hat\sigma_E^2=\sum_i w_i s_i with w_i=n_i/n. Its variance is
approximated by the variance of a weighted mean of independent terms,
\widehat{\mathrm{Var}}(\hat\sigma_E^2)
\;\approx\; \Big(\sum_i w_i^2\Big)\,\widehat{\mathrm{Var}}(s_i),
where \widehat{\mathrm{Var}}(s_i) is the sample variance across
subjects. The method-dispersion term uses the quadratic-form delta already
computed for S_B:
\widehat{\mathrm{Var}}(S_B)
\;=\; \frac{2\,\mathrm{tr}\!\big((A_{\!fix}\,\mathrm{Var}(\hat\beta))^2\big)
\;+\; 4\,\hat\beta^\top A_{\!fix}\,\mathrm{Var}(\hat\beta)\,
A_{\!fix}\,\hat\beta}
{\big[nm\,(nm-1)\,\max(nt,1)\big]^2},
with A_{\!fix}=L\,D_m\,L^\top.
Putting it together. Assemble
\widehat{\mathrm{Var}}(\hat\theta) by combining the
(\sigma_A^2,\sigma_{A\times M}^2,\sigma_{A\times T}^2) covariance
block from the subject-level empirical covariance, add the
\widehat{\mathrm{Var}}(\hat\sigma_E^2) and
\widehat{\mathrm{Var}}(S_B) terms on the diagonal,
and ignore cross-covariances across these blocks (a standard large-sample
simplification). Then
\widehat{\mathrm{se}}\{\widehat{\mathrm{CCC}}\}
\;=\; \sqrt{\,\nabla \mathrm{CCC}(\hat\theta)^\top\,
\widehat{\mathrm{Var}}(\hat\theta)\,
\nabla \mathrm{CCC}(\hat\theta)\,}.
A two-sided (1-\alpha) normal CI is
\widehat{\mathrm{CCC}} \;\pm\; z_{1-\alpha/2}\,
\widehat{\mathrm{se}}\{\widehat{\mathrm{CCC}}\},
truncated to [0,1] in the output for convenience. When S_B is
truncated at 0 or samples are very small/imbalanced, the normal CI can be
mildly anti-conservative near the boundary; a logit transform for CCC or a
subject-level (cluster) bootstrap can be used for sensitivity analysis.
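For the logit sensitivity analysis mentioned above, a generic sketch is as follows (the point estimate and standard error are hypothetical, and this is the textbook back-transform, not necessarily the exact ci_mode="logit" implementation):

```r
## Raw vs. logit-scale normal CIs from a CCC point estimate and SE
ccc_hat <- 0.85; se <- 0.06; z <- qnorm(0.975)
raw <- pmin(pmax(ccc_hat + c(-1, 1) * z * se, 0), 1)  # truncated to [0, 1]
lg    <- qlogis(ccc_hat)                              # logit transform
se_lg <- se / (ccc_hat * (1 - ccc_hat))               # delta method on logit scale
logit_ci <- plogis(lg + c(-1, 1) * z * se_lg)         # respects (0, 1) bounds
rbind(raw = raw, logit = logit_ci)
```

The logit interval is asymmetric around the estimate and cannot cross 0 or 1, which is why it is a useful check near the boundary.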
Choosing \rho for AR(1).
When ar="ar1" and ar_rho = NA, \rho is estimated by
profiling the REML log-likelihood at (\hat\beta,\hat G,\hat\sigma_E^2).
With very few visits per subject, \rho can be weakly identified; consider
sensitivity checks over a plausible range.
All per-subject solves are \,r\times r with r=1+nm+nt+q_Z, so cost
scales with the number of subjects and the fixed-effects dimension rather
than the total number of observations. Solvers use symmetric-PD paths with
a small diagonal ridge and pseudo-inverse,
which helps for very small/unbalanced subsets and near-boundary estimates.
For AR(1), observations are ordered by time within subject; NA time codes
break the run, and gaps between factor levels are treated as regular steps
(elapsed time is not used).
Heteroscedastic slopes across Z columns are supported.
Each Z column has its own variance component \sigma_{Z,j}^2, but
cross-covariances among Z columns are set to zero (diagonal block). Column
rescaling changes the implied prior on b_{i,\text{extra}} but does not
introduce correlations.
The C++ backend uses OpenMP loops while forcing vendor BLAS libraries to
run single-threaded so that overall CPU usage stays predictable. On OpenBLAS
and Apple's Accelerate this is handled automatically. On Intel MKL builds the
guard is disabled by default, and you can opt out manually by setting
MATRIXCORR_DISABLE_BLAS_GUARD=1 in the environment before loading
matrixCorr.
Internally, the call is routed to ccc_lmm_reml_pairwise(), which fits one
repeated-measures mixed model per pair of methods. Each model includes:
- subject random intercepts (always);
- optional subject-by-method (\sigma_{A\times M}^2) and
  subject-by-time (\sigma_{A\times T}^2) variance components;
- optional random slopes specified via slope/slope_var/slope_Z;
- residual structure ar = "none" (iid) or ar = "ar1".
The D-matrix options (Dmat_type, Dmat, Dmat_weights) control how time
averaging operates when translating variance components into CCC summaries.
Author(s): Thiago de Paula Oliveira
Lin L (1989). A concordance correlation coefficient to evaluate reproducibility. Biometrics, 45: 255-268.
Lin L (2000). A note on the concordance correlation coefficient. Biometrics, 56: 324-325.
Carrasco, J. L. et al. (2013). Estimation of the concordance correlation coefficient for repeated measures using SAS and R. Computer Methods and Programs in Biomedicine, 109(3), 293-304. doi:10.1016/j.cmpb.2012.09.002
King et al. (2007). A class of repeated measures concordance correlation coefficients. Journal of Biopharmaceutical Statistics, 17(4). doi:10.1080/10543400701329455
See also: build_L_Dm_Z_cpp for constructing L/D_m/Z; ccc_rm_ustat
for a U-statistic alternative; and cccrm for a reference approach via
nlme.
# ====================================================================
# 1) Subject x METHOD and subject x TIME variances present
#    y_{i,m,t} = mu + b_m + u_i + w_{i,m} + g_{i,t} + e_{i,m,t}
#    with u_i ~ N(0, s_A^2), w_{i,m} ~ N(0, s_{AxM}^2), g_{i,t} ~ N(0, s_{AxT}^2)
# ====================================================================
set.seed(102)
n_subj <- 60
n_time <- 8
id <- factor(rep(seq_len(n_subj), each = 2 * n_time))
time <- factor(rep(rep(seq_len(n_time), times = 2), times = n_subj))
method <- factor(rep(rep(c("A","B"), each = n_time), times = n_subj))
sigA <- 0.6 # subject
sigAM <- 0.3 # subject x method
sigAT <- 0.5 # subject x time
sigE <- 0.4 # residual
# Expected estimate S_B = 0.2^2 = 0.04
biasB <- 0.2 # fixed method bias
# random effects
u_i <- rnorm(n_subj, 0, sqrt(sigA))
u <- u_i[as.integer(id)]
sm <- interaction(id, method, drop = TRUE)
w_im_lv <- rnorm(nlevels(sm), 0, sqrt(sigAM))
w_im <- w_im_lv[as.integer(sm)]
st <- interaction(id, time, drop = TRUE)
g_it_lv <- rnorm(nlevels(st), 0, sqrt(sigAT))
g_it <- g_it_lv[as.integer(st)]
# residuals & response
e <- rnorm(length(id), 0, sqrt(sigE))
y <- (method == "B") * biasB + u + w_im + g_it + e
dat_both <- data.frame(y, id, method, time)
# Both sigma2_subject_method and sigma2_subject_time are identifiable here
fit_both <- ccc_rm_reml(dat_both, "y", "id", method = "method", time = "time",
vc_select = "auto", verbose = TRUE)
summary(fit_both)
plot(fit_both)
# Interactive viewing (requires shiny)
if (interactive() && requireNamespace("shiny", quietly = TRUE)) {
view_corr_shiny(fit_both)
}
# ====================================================================
# 2) Subject x TIME variance present (sag > 0) with two methods
# y_{i,m,t} = mu + b_m + u_i + g_{i,t} + e_{i,m,t}
# where g_{i,t} ~ N(0, s_{AxT}^2) shared across methods at time t
# ====================================================================
set.seed(202)
n_subj <- 60; n_time <- 14
id <- factor(rep(seq_len(n_subj), each = 2 * n_time))
method <- factor(rep(rep(c("A","B"), each = n_time), times = n_subj))
time <- factor(rep(rep(seq_len(n_time), times = 2), times = n_subj))
sigA <- 0.7
sigAT <- 0.5
sigE <- 0.5
biasB <- 0.25
u <- rnorm(n_subj, 0, sqrt(sigA))[as.integer(id)]
gIT <- rnorm(n_subj * n_time, 0, sqrt(sigAT))
g <- gIT[ (as.integer(id) - 1L) * n_time + as.integer(time) ]
y <- (method == "B") * biasB + u + g + rnorm(length(id), 0, sqrt(sigE))
dat_sag <- data.frame(y, id, method, time)
# sigma_AT should be retained; sigma_AM may be dropped (since w_{i,m}=0)
fit_sag <- ccc_rm_reml(dat_sag, "y", "id", method = "method", time = "time",
vc_select = "auto", verbose = TRUE)
summary(fit_sag)
plot(fit_sag)
# ====================================================================
# 3) BOTH components present: sab > 0 and sag > 0 (2 methods x T times)
# y_{i,m,t} = mu + b_m + u_i + w_{i,m} + g_{i,t} + e_{i,m,t}
# ====================================================================
set.seed(303)
n_subj <- 60; n_time <- 4
id <- factor(rep(seq_len(n_subj), each = 2 * n_time))
method <- factor(rep(rep(c("A","B"), each = n_time), times = n_subj))
time <- factor(rep(rep(seq_len(n_time), times = 2), times = n_subj))
sigA <- 0.8
sigAM <- 0.3
sigAT <- 0.4
sigE <- 0.5
biasB <- 0.2
u <- rnorm(n_subj, 0, sqrt(sigA))[as.integer(id)]
# (subject, method) random deviations: repeat per (i,m) across its times
wIM <- rnorm(n_subj * 2, 0, sqrt(sigAM))
w <- wIM[ (as.integer(id) - 1L) * 2 + as.integer(method) ]
# (subject, time) random deviations: shared across methods at time t
gIT <- rnorm(n_subj * n_time, 0, sqrt(sigAT))
g <- gIT[ (as.integer(id) - 1L) * n_time + as.integer(time) ]
y <- (method == "B") * biasB + u + w + g + rnorm(length(id), 0, sqrt(sigE))
dat_both <- data.frame(y, id, method, time)
fit_both <- ccc_rm_reml(dat_both, "y", "id", method = "method", time = "time",
vc_select = "auto", verbose = TRUE, ci = TRUE)
summary(fit_both)
plot(fit_both)
# If you want to force-include both VCs (skip testing):
fit_both_forced <-
ccc_rm_reml(dat_both, "y", "id", method = "method", time = "time",
vc_select = "none", include_subj_method = TRUE,
include_subj_time = TRUE, verbose = TRUE)
summary(fit_both_forced)
plot(fit_both_forced)
# ====================================================================
# 4) D_m choices: time-averaged (default) vs typical visit
# ====================================================================
# Time-average
ccc_rm_reml(dat_both, "y", "id", method = "method", time = "time",
vc_select = "none", include_subj_method = TRUE,
include_subj_time = TRUE, Dmat_type = "time-avg")
# Typical visit
ccc_rm_reml(dat_both, "y", "id", method = "method", time = "time",
vc_select = "none", include_subj_method = TRUE,
include_subj_time = TRUE, Dmat_type = "typical-visit")
# ====================================================================
# 5) AR(1) residual correlation with fixed rho (larger example)
# ====================================================================
set.seed(10)
n_subj <- 40
n_time <- 10
methods <- c("A", "B", "C", "D")
nm <- length(methods)
id <- factor(rep(seq_len(n_subj), each = n_time * nm))
method <- factor(rep(rep(methods, each = n_time), times = n_subj),
levels = methods)
time <- factor(rep(rep(seq_len(n_time), times = nm), times = n_subj))
beta0 <- 0
beta_t <- 0.2
bias_met <- c(A = 0.00, B = 0.30, C = -0.15, D = 0.05)
sigA <- 1.0
rho_true <- 0.6
sigE <- 0.7
t_num <- as.integer(time)
t_c <- t_num - mean(seq_len(n_time))
mu <- beta0 + beta_t * t_c + bias_met[as.character(method)]
u_subj <- rnorm(n_subj, 0, sqrt(sigA))
u <- u_subj[as.integer(id)]
e <- numeric(length(id))
for (s in seq_len(n_subj)) {
for (m in methods) {
idx <- which(id == levels(id)[s] & method == m)
# sd is the innovation SD; the marginal SD is sigE / sqrt(1 - rho_true^2)
e[idx] <- stats::arima.sim(list(ar = rho_true), n = n_time, sd = sigE)
}
}
y <- mu + u + e
dat_ar4 <- data.frame(y = y, id = id, method = method, time = time)
ccc_rm_reml(dat_ar4,
response = "y", subject = "id", method = "method", time = "time",
ar = "ar1", ar_rho = 0.6, verbose = TRUE)
# ====================================================================
# 6) Random slope variants (subject, method, custom Z)
# ====================================================================
## By SUBJECT
set.seed(2)
n_subj <- 60; n_time <- 4
id <- factor(rep(seq_len(n_subj), each = 2 * n_time))
tim <- factor(rep(rep(seq_len(n_time), times = 2), times = n_subj))
method <- factor(rep(rep(c("A","B"), each = n_time), times = n_subj))
subj <- as.integer(id)
slope_i <- rnorm(n_subj, 0, 0.15)
slope_vec <- slope_i[subj]
base <- rnorm(n_subj, 0, 1.0)[subj]
tnum <- as.integer(tim)
y <- base + 0.3*(method=="B") + slope_vec*(tnum - mean(seq_len(n_time))) +
rnorm(length(id), 0, 0.5)
dat_s <- data.frame(y, id, method, time = tim)
dat_s$t_num <- as.integer(dat_s$time)
dat_s$t_c <- ave(dat_s$t_num, dat_s$id, FUN = function(v) v - mean(v))
ccc_rm_reml(dat_s, "y", "id", method = "method", time = "time",
slope = "subject", slope_var = "t_c", verbose = TRUE)
## By METHOD
set.seed(3)
n_subj <- 60; n_time <- 4
id <- factor(rep(seq_len(n_subj), each = 2 * n_time))
tim <- factor(rep(rep(seq_len(n_time), times = 2), times = n_subj))
method <- factor(rep(rep(c("A","B"), each = n_time), times = n_subj))
slope_m <- ifelse(method=="B", 0.25, 0.00)
base <- rnorm(n_subj, 0, 1.0)[as.integer(id)]
tnum <- as.integer(tim)
y <- base + 0.3*(method=="B") + slope_m*(tnum - mean(seq_len(n_time))) +
rnorm(length(id), 0, 0.5)
dat_m <- data.frame(y, id, method, time = tim)
dat_m$t_num <- as.integer(dat_m$time)
dat_m$t_c <- ave(dat_m$t_num, dat_m$id, FUN = function(v) v - mean(v))
ccc_rm_reml(dat_m, "y", "id", method = "method", time = "time",
slope = "method", slope_var = "t_c", verbose = TRUE)
## SUBJECT + METHOD random slopes (custom Z)
set.seed(4)
n_subj <- 50; n_time <- 4
id <- factor(rep(seq_len(n_subj), each = 2 * n_time))
tim <- factor(rep(rep(seq_len(n_time), times = 2), times = n_subj))
method <- factor(rep(rep(c("A","B"), each = n_time), times = n_subj))
subj <- as.integer(id)
slope_subj <- rnorm(n_subj, 0, 0.12)[subj]
slope_B <- ifelse(method=="B", 0.18, 0.00)
tnum <- as.integer(tim)
base <- rnorm(n_subj, 0, 1.0)[subj]
y <- base + 0.3*(method=="B") +
(slope_subj + slope_B) * (tnum - mean(seq_len(n_time))) +
rnorm(length(id), 0, 0.5)
dat_bothRS <- data.frame(y, id, method, time = tim)
dat_bothRS$t_num <- as.integer(dat_bothRS$time)
dat_bothRS$t_c <- ave(dat_bothRS$t_num, dat_bothRS$id, FUN = function(v) v - mean(v))
MM <- model.matrix(~ 0 + method, data = dat_bothRS)
Z_custom <- cbind(
subj_slope = dat_bothRS$t_c,
MM * dat_bothRS$t_c
)
ccc_rm_reml(dat_bothRS, "y", "id", method = "method", time = "time",
slope = "custom", slope_Z = Z_custom, verbose = TRUE)