interrater_agreement_table: Build an inter-rater agreement report

View source: R/api-tables.R

interrater_agreement_table {mfrmr}	R Documentation

Build an inter-rater agreement report

Description

Build an inter-rater agreement report

Usage

interrater_agreement_table(
  fit,
  diagnostics = NULL,
  rater_facet = NULL,
  context_facets = NULL,
  exact_warn = 0.5,
  corr_warn = 0.3,
  include_precision = TRUE,
  top_n = NULL
)

Arguments

fit

Output from fit_mfrm().

diagnostics

Optional output from diagnose_mfrm().

rater_facet

Name of the rater facet. If NULL, inferred from facet names.

context_facets

Optional context facets used to match observations for agreement. If NULL, all remaining facets (including Person) are used.

exact_warn

Warning threshold for exact agreement; pairs with Exact below this value are flagged.

corr_warn

Warning threshold for pairwise correlation; pairs with Corr below this value are flagged.

include_precision

If TRUE, append rater severity spread indices from the facet precision summary when available.

top_n

Optional maximum number of pair rows to keep.

Details

This helper computes pairwise rater agreement on matched contexts and returns both a pair-level table and a one-row summary. The output is package-native and does not require knowledge of legacy report numbering.
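The pair-level statistics can be illustrated with a small base-R sketch. This is not the package's implementation, only a hand-computed example of the quantities reported for one rater pair: two raters scored on the same matched contexts, compared score by score.

```r
# Illustrative sketch (not the package's internal code): pair-level
# agreement for two raters scored on six matched contexts.
r1 <- c(3, 4, 2, 5, 3, 4)  # Rater1 scores
r2 <- c(3, 3, 2, 4, 3, 5)  # Rater2 scores on the same contexts

exact     <- mean(r1 == r2)           # proportion of identical scores
adjacent  <- mean(abs(r1 - r2) <= 1)  # within one category
mean_diff <- mean(r1 - r2)            # signed gap (Rater1 - Rater2)
mad       <- mean(abs(r1 - r2))       # mean absolute score difference
corr      <- cor(r1, r2)              # Pearson correlation of paired scores

c(Exact = exact, Adjacent = adjacent, MeanDiff = mean_diff, MAD = mad)
```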

Value

A named list with:

  • summary: one-row inter-rater summary

  • pairs: pair-level agreement table

  • settings: applied options and thresholds

Interpreting output

  • summary: overall agreement level, number/share of flagged pairs.

  • pairs: pairwise exact agreement, correlation, and direction/size gaps.

  • settings: applied facet matching and warning thresholds.

Pairs flagged by both low exact agreement and low correlation generally deserve the highest calibration priority.
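Assuming `ir` is the list returned by interrater_agreement_table() (as in the Examples below), flagged pairs can be isolated and sorted so the lowest exact agreement comes first:

```r
# Isolate flagged pairs and order by agreement, then correlation,
# so the highest-priority pairs appear at the top.
flagged <- subset(ir$pairs, Flag)
flagged[order(flagged$Exact, flagged$Corr),
        c("Rater1", "Rater2", "N", "Exact", "Corr")]
```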

Typical workflow

  1. Run with explicit rater_facet (and context_facets if needed).

  2. Review summary(ir) and top flagged rows in ir$pairs.

  3. Visualize with plot_interrater_agreement().
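Step 1 with fully explicit arguments might look as follows, assuming a `fit` object from fit_mfrm() as in the Examples below. The thresholds shown are the documented defaults; the interpretation of top_n as keeping the most relevant pair rows is an assumption.

```r
# Match observations within Person x Criterion cells explicitly and
# keep at most 10 pair rows; exact_warn/corr_warn are the defaults.
ir <- interrater_agreement_table(
  fit,
  rater_facet    = "Rater",
  context_facets = c("Person", "Criterion"),
  exact_warn     = 0.5,
  corr_warn      = 0.3,
  top_n          = 10
)
```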

Output columns

The pairs data.frame contains:

Rater1, Rater2

Rater pair identifiers.

N

Number of matched-context observations for this pair.

Exact

Proportion of exact score agreements.

ExpectedExact

Expected exact agreement under chance.

Adjacent

Proportion of adjacent (±1 category) agreements.

MeanDiff

Signed mean score difference (Rater1 - Rater2).

MAD

Mean absolute score difference.

Corr

Pearson correlation between paired scores.

Flag

Logical; TRUE when Exact < exact_warn or Corr < corr_warn.

OpportunityCount, ExactCount, ExpectedExactCount, AdjacentCount

Raw counts behind the agreement proportions.
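ExpectedExact is the exact agreement expected under chance. One common way to compute such a quantity (an assumption about the formula used here; it is the Cohen's-kappa-style expected agreement) is the sum over categories of the product of the two raters' marginal category proportions:

```r
# Sketch of chance-expected exact agreement from marginal category
# proportions; the package's exact formula may differ.
r1 <- c(3, 4, 2, 5, 3, 4)
r2 <- c(3, 3, 2, 4, 3, 5)
cats <- sort(unique(c(r1, r2)))
p1 <- sapply(cats, function(k) mean(r1 == k))  # Rater1 marginals
p2 <- sapply(cats, function(k) mean(r2 == k))  # Rater2 marginals
expected_exact <- sum(p1 * p2)
expected_exact
```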

The summary data.frame contains:

RaterFacet

Name of the rater facet analyzed.

TotalPairs

Number of rater pairs evaluated.

ExactAgreement

Mean exact agreement across all pairs.

AgreementMinusExpected

Observed exact agreement minus expected exact agreement.

MeanCorr

Mean pairwise correlation.

FlaggedPairs, FlaggedShare

Count and proportion of flagged pairs.

RaterSeparation, RaterReliability

Severity-spread indices for the rater facet, reported separately from agreement.

See Also

diagnose_mfrm(), facets_chisq_table(), plot_interrater_agreement(), mfrmr_visual_diagnostics

Examples

toy <- load_mfrmr_data("example_core")
fit <- fit_mfrm(toy, "Person", c("Rater", "Criterion"), "Score", method = "JML", maxit = 25)
ir <- interrater_agreement_table(fit, rater_facet = "Rater")
summary(ir)
p_ir <- plot(ir, draw = FALSE)
class(p_ir)

mfrmr documentation built on March 31, 2026, 1:06 a.m.