estimate_bias: Estimate legacy-compatible bias/interaction terms iteratively

View source: R/api-tables.R

estimate_bias    R Documentation

Estimate legacy-compatible bias/interaction terms iteratively

Description

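Estimate bias (interaction) terms for a combination of two or more facets from the residuals of a fitted additive MFRM, using iterative recalibration in a legacy-compatible style.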

Usage

estimate_bias(
  fit,
  diagnostics,
  facet_a = NULL,
  facet_b = NULL,
  interaction_facets = NULL,
  max_abs = 10,
  omit_extreme = TRUE,
  max_iter = 4,
  tol = 0.001
)

Arguments

fit

Output from fit_mfrm().

diagnostics

Output from diagnose_mfrm().

facet_a

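Name of the first facet in a two-way interaction. Ignored when interaction_facets is supplied.

facet_b

Name of the second facet in a two-way interaction. Ignored when interaction_facets is supplied.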

interaction_facets

Character vector of two or more facets to model as one interaction effect. When supplied, this takes precedence over facet_a/facet_b.

max_abs

Upper bound on the absolute bias size (logit scale).

omit_extreme

Omit extreme-only elements.

max_iter

Maximum number of recalibration iterations.

tol

Convergence tolerance: iteration stops once the maximum absolute change in bias estimates falls below this value.

Details

Bias (interaction) in MFRM refers to a systematic departure from the additive model: a specific rater-criterion (or higher-order) combination produces scores that are consistently higher or lower than predicted by the main effects alone. For example, Rater A might be unexpectedly harsh on Criterion 2 despite being lenient overall.

Mathematically, the bias term b_{jc} for rater j on criterion c modifies the linear predictor:

\eta_{njc} = \theta_n - \delta_j - \beta_c - b_{jc}

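Here \theta_n is the measure for person n, \delta_j the severity of rater j, and \beta_c the difficulty of criterion c. The function estimates b_{jc} from the residuals of the fitted (additive) model using iterative recalibration in a legacy-compatible style (Myford & Wolfe, 2003, 2004):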

b_{jc} = \frac{\sum_n (X_{njc} - E_{njc})} {\sum_n \mathrm{Var}_{njc}}

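where X_{njc} is the observed score, E_{njc} the model-expected score, and \mathrm{Var}_{njc} the model variance of that score. Each iteration updates the expected scores using the current bias estimates, then recomputes the bias. Convergence is reached when the maximum absolute change in bias estimates falls below tol.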
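As a rough illustration of the residual-to-variance ratio in the formula above, the base-R sketch below computes it once per rater-by-criterion cell for simulated dichotomous scores. Everything in it is an assumption made for illustration: the data, the measures theta, delta, and beta, the column names, and the dichotomous Rasch form. It shows only a single step from the additive fit and is not the package's internal code.

```r
# Hypothetical toy data: dichotomous scores with additive measures taken as known.
set.seed(1)
n_person <- 200
dat <- expand.grid(person = seq_len(n_person), rater = 1:3, criterion = 1:2)
theta <- rnorm(n_person)        # person measures (assumed)
delta <- c(-0.5, 0, 0.5)        # rater severities (assumed)
beta  <- c(-0.3, 0.3)           # criterion difficulties (assumed)

# Additive linear predictor and model-implied moments for each observation.
eta <- theta[dat$person] - delta[dat$rater] - beta[dat$criterion]
dat$expected <- plogis(eta)                         # E_njc
dat$variance <- dat$expected * (1 - dat$expected)   # Var_njc

# Simulate scores; rater 1 on criterion 2 scores higher than the additive model predicts.
shift <- ifelse(dat$rater == 1 & dat$criterion == 2, 1, 0)
dat$score <- rbinom(nrow(dat), size = 1, prob = plogis(eta + shift))   # X_njc
dat$resid <- dat$score - dat$expected

# Sum residuals and variances within each cell, then take the ratio.
cell <- aggregate(cbind(resid, variance) ~ rater + criterion, data = dat, FUN = sum)
cell$bias_step <- cell$resid / cell$variance
cell
```

In the full procedure described here, the expected scores would then be recomputed with the current bias estimates and the step repeated until the largest change falls below tol.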

  • For two-way mode, use facet_a and facet_b (or interaction_facets with length 2).

  • For higher-order mode, provide interaction_facets with length >= 3.

Value

An object of class mfrm_bias with:

  • table: interaction rows with effect size, SE, screening t/p metadata, reporting-use flags, and fit columns

  • summary: compact summary statistics

  • chi_sq: fixed-effect chi-square style screening summary

  • facet_a, facet_b: first two analyzed facet names (legacy compatibility)

  • interaction_facets, interaction_order, interaction_mode: full interaction metadata

  • iteration: iteration history/metadata

What this screening means

estimate_bias() summarizes interaction departures from the additive MFRM. It is best read as a targeted screening tool for potentially noteworthy cells or facet combinations that may merit substantive review.

What this screening does not justify

  • t and Prob. are screening metrics, not formal inferential quantities.

  • A flagged interaction cell is not, by itself, proof of rater bias or construct-irrelevant variance.

  • Non-flagged cells should not be over-read as evidence that interaction effects are absent.

Interpreting output

Use summary for global magnitude, then inspect table for cell-level interaction effects.

Prioritize rows with:

  • larger |Bias Size| (effect on logit scale; > 0.5 logits is typically noteworthy, > 1.0 is large)

  • larger |t| among the screening metrics (|t| >= 2 suggests a screen-positive interaction cell)

  • smaller Prob. among the screening metrics

A positive Obs-Exp Average means the cell produced higher scores than the additive model predicts (unexpected leniency); negative means unexpected harshness.
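The screening rules above can be applied to the table in a few lines of base R. The sketch below uses a small invented data frame as a stand-in for bias$table; the column names follow this page, but the values, the Rater and Criterion labels, and the helper columns magnitude, screen_positive, and direction are hypothetical.

```r
# Hypothetical stand-in for bias$table. Values are invented; do not read the
# package's sign conventions off this mock.
tab <- data.frame(
  Rater = c("R1", "R1", "R2", "R2"),
  Criterion = c("C1", "C2", "C1", "C2"),
  `Bias Size` = c(0.12, -0.85, 0.40, 1.10),
  t = c(0.6, -2.9, 1.4, 3.2),
  `Prob.` = c(0.55, 0.004, 0.16, 0.001),
  `Obs-Exp Average` = c(0.03, -0.21, 0.09, 0.27),
  check.names = FALSE
)

# Magnitude bands: > 0.5 logits noteworthy, > 1.0 logits large.
tab$magnitude <- cut(
  abs(tab[["Bias Size"]]),
  breaks = c(0, 0.5, 1, Inf),
  labels = c("small", "noteworthy", "large"),
  include.lowest = TRUE
)

# Screen-positive cells: |t| >= 2.
tab$screen_positive <- abs(tab[["t"]]) >= 2

# Direction on the raw-score metric, read from the sign of Obs-Exp Average.
tab$direction <- ifelse(tab[["Obs-Exp Average"]] > 0,
                        "unexpected leniency", "unexpected harshness")

# Flagged rows, largest |t| first.
flagged <- tab[tab$screen_positive, ]
flagged <- flagged[order(-abs(flagged[["t"]])), ]
flagged
```

The flags are a prioritization aid for review, in line with the screening interpretation given on this page.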

iteration helps verify whether iterative recalibration stabilized. If the maximum change on the final iteration is still above tol, consider increasing max_iter.
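The stabilization check itself is a simple comparison. The internal layout of the iteration element is not described on this page, so the sketch below assumes a hypothetical numeric vector max_change holding the maximum absolute change in bias estimates at each iteration.

```r
# Hypothetical per-iteration maximum absolute changes in bias estimates.
max_change <- c(0.210, 0.034, 0.006, 0.002)
tol <- 0.001

# Recalibration has stabilized if the final change is below tol.
final_change <- max_change[length(max_change)]
converged <- final_change < tol

if (!converged) {
  message("Final change ", final_change, " is not below tol = ", tol,
          "; consider increasing max_iter.")
}
```

With these invented values the final change is still above tol, which is the situation in which a larger max_iter is worth trying.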

Typical workflow

  1. Fit the model with fit_mfrm() and diagnose it with diagnose_mfrm().

  2. Run estimate_bias(...) for target interaction facets.

  3. Review summary(bias) and bias$table.

  4. Visualize/report via plot_bias_interaction() and build_fixed_reports().

Interpreting key output columns

In bias$table, the most-used columns are:

  • Bias Size: estimated interaction effect b_{jc} (logit scale)

  • t and Prob.: screening metrics, not formal inferential quantities

  • Obs-Exp Average: direction and practical size of observed-vs-expected gap on the raw-score metric

The chi_sq element provides a fixed-effect heterogeneity screen across all interaction cells.
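One common form of a fixed-effect "all effects equal zero" screen sums the squared t values over cells and refers the total to a chi-square distribution with one degree of freedom per cell. Whether chi_sq is computed exactly this way is an assumption; the sketch below, with invented t values, is meant only to convey the idea.

```r
# Hypothetical screening t values, one per interaction cell.
t_vals <- c(0.6, -2.9, 1.4, 3.2)

# Assumed form of the screen: sum of squared t values, df = number of cells.
chi_sq_stat <- sum(t_vals^2)
df <- length(t_vals)
p_value <- pchisq(chi_sq_stat, df = df, lower.tail = FALSE)

c(chi_sq = chi_sq_stat, df = df, p = p_value)
```

Like t and Prob., a result of this kind is best treated as a screening summary, not as a formal test.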

Recommended next step

Use plot_bias_interaction() to inspect the flagged cells visually, then integrate the result with differential facet functioning (DFF) analysis, linking, or substantive scoring review before making formal claims about fairness or invariance.

See Also

build_fixed_reports(), build_apa_outputs()

Examples

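# Load the example data and fit the additive model.
toy <- load_mfrmr_data("example_bias")
fit <- fit_mfrm(toy, "Person", c("Rater", "Criterion"), "Score", method = "JML", maxit = 25)
# Diagnostics are a required input to estimate_bias().
diag <- diagnose_mfrm(fit, residual_pca = "none")
# Estimate Rater x Criterion bias terms with at most two iterations.
bias <- estimate_bias(fit, diag, facet_a = "Rater", facet_b = "Criterion", max_iter = 2)
summary(bias)
# Build the interaction plot object without drawing it.
p_bias <- plot_bias_interaction(bias, draw = FALSE)
class(p_bias)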

mfrmr documentation built on March 31, 2026, 1:06 a.m.