estimate_bias: Estimate bias and interaction screening terms
In mfrmr: Estimation and Diagnostics for Many-Facet Measurement Models

estimate_bias

R Documentation

Estimate bias and interaction screening terms

Description

Estimate bias and interaction screening terms

Usage

estimate_bias(
  fit,
  diagnostics,
  facet_a = NULL,
  facet_b = NULL,
  interaction_facets = NULL,
  max_abs = 10,
  omit_extreme = TRUE,
  max_iter = 4,
  tol = 0.001
)

Arguments

`fit`	Output from `fit_mfrm()`.
`diagnostics`	Output from `diagnose_mfrm()`.
`facet_a`	First facet name. Provide together with `facet_b` for the classic pairwise 2-way interaction. Ignored when `interaction_facets` is supplied.
`facet_b`	Second facet name. See `facet_a`.
`interaction_facets`	Character vector of two or more facets to model as one interaction effect. When supplied, this takes precedence over `facet_a`/`facet_b`. Use this form (rather than `facet_a`/`facet_b`) whenever you want 3+ way interactions, since `facet_a/facet_b` is restricted to the pairwise case.
`max_abs`	Bound for absolute bias size.
`omit_extreme`	Omit extreme-only elements.
`max_iter`	Iteration cap.
`tol`	Convergence tolerance.

Details

Bias (interaction) in MFRM refers to a systematic departure from the additive model: a specific rater-criterion (or higher-order) combination produces scores that are consistently higher or lower than predicted by the main effects alone. For example, Rater A might be unexpectedly harsh on Criterion 2 despite being lenient overall.

Mathematically, the bias term b_{jc} for rater j on criterion c modifies the linear predictor:

\eta_{njc} = \theta_n - \delta_j - \beta_c - b_{jc}

For RSM / PCM, the function estimates b_{jc} from the residuals of the fitted additive model using an iterative recalibration screen aligned with the many-facet bias literature (Myford & Wolfe, 2003, 2004):

b_{jc} = \frac{\sum_n (X_{njc} - E_{njc})} {\sum_n \mathrm{Var}_{njc}}

Each iteration updates expected scores using the current bias estimates, then re-computes the bias. Convergence is reached when the maximum absolute change in bias estimates falls below tol. For bounded GPCM, the same additive-bias idea is evaluated with the slope-aware GPCM kernel and conditional profile-likelihood follow-up columns; those quantities remain screening evidence because theta, facet, step, and slope estimates are held fixed.

For two-way mode, use facet_a and facet_b (or interaction_facets with length 2).
For higher-order mode, provide interaction_facets with length >= 3.

Value

An object of class mfrm_bias with:

table: interaction rows with effect size, SE, screening t/p metadata, reporting-use flags, fit columns, and bounded-GPCM profile-likelihood columns when available
summary: compact summary statistics
chi_sq: fixed-effect chi-square style screening summary
facet_a, facet_b: first two analyzed facet names (legacy compatibility)
interaction_facets, interaction_order, interaction_mode: full interaction metadata
iteration: iteration history/metadata
orientation_review: facet-orientation sign-consistency review table
mixed_sign: logical flag indicating whether bias-size signs flip across facets in a way that complicates direction interpretation
direction_note: one-line interpretive note describing the dominant bias direction (empty when not applicable)
recommended_action: one-line recommended-action label routing the user to the appropriate follow-up helper
inference_tier: summary label indicating that the bias rows are intended for screening and follow-up review in this release
optimization_failures: per-cell record of any inner-loop optimizer failures encountered while estimating the bias parameters; empty when every cell converged cleanly

What this screening means

estimate_bias() summarizes interaction departures from the additive MFRM. It is best read as a targeted screening tool for potentially noteworthy cells or facet combinations that may merit substantive review.

What this screening does not justify

t and Prob. are screening metrics, not formal inferential quantities.
A flagged interaction cell is not, by itself, proof of rater bias or construct-irrelevant variance.
Non-flagged cells should not be over-read as evidence that interaction effects are absent.

Interpreting output

Use summary for global magnitude, then inspect table for cell-level interaction effects.

Prioritize rows with:

larger ⁠|Bias Size|⁠ (effect on logit scale; > 0.5 logits is typically noteworthy, > 1.0 is large)
larger ⁠|t|⁠ among the screening metrics (|t| \ge 2 suggests a screen-positive interaction cell)
smaller Prob. among the screening metrics

A positive ⁠Obs-Exp Average⁠ means the cell produced higher scores than the additive model predicts (unexpected leniency); negative means unexpected harshness.

iteration helps verify whether iterative recalibration stabilized. If the maximum change on the final iteration is still above tol, consider increasing max_iter.

Typical workflow

Fit and diagnose model.
Run estimate_bias(...) for target interaction facets.
Review summary(bias) and bias$table.
Visualize/report via plot_bias_interaction() and build_fixed_reports().

Interpreting key output columns

In bias$table, the most-used columns are:

⁠Bias Size⁠: estimated interaction effect b_{jc} (logit scale)
t and Prob.: screening metrics, not formal inferential quantities
⁠Obs-Exp Average⁠: direction and practical size of observed-vs-expected gap on the raw-score metric
for bounded GPCM, ⁠LR ChiSq⁠, ⁠LR Prob.⁠, and ⁠Profile CI Lower⁠ / ⁠Profile CI Upper⁠: conditional profile-likelihood checks for a single additive bias shift, holding the fitted person, facet, step, and slope estimates fixed

The chi_sq element provides a fixed-effect heterogeneity screen across all interaction cells.

Recommended next step

Use plot_bias_interaction() to inspect the flagged cells visually, then integrate the result with DFF, linking, or substantive scoring review before making formal claims about fairness or invariance.

References

Linacre, J. M. (1989). Many-Facet Rasch Measurement. MESA Press. (FACETS Table 13 corresponds to the bias / interaction estimation that this helper implements.)
Eckes, T. (2005). Examining rater effects in TestDaF writing and speaking performance assessments: A many-facet Rasch analysis. Language Assessment Quarterly, 2(3), 197-221.
Eckes, T. (2015). Introduction to many-facet Rasch measurement: Analyzing and evaluating rater-mediated assessments (2nd ed.). Peter Lang.
Myford, C. M., & Wolfe, E. W. (2003). Detecting and measuring rater effects using many-facet Rasch measurement: Part I. Journal of Applied Measurement, 4(4), 386-422.
Myford, C. M., & Wolfe, E. W. (2004). Detecting and measuring rater effects using many-facet Rasch measurement: Part II. Journal of Applied Measurement, 5(2), 189-227.

Examples

toy <- load_mfrmr_data("example_bias")
fit <- fit_mfrm(toy, "Person", c("Rater", "Criterion"), "Score", method = "JML", maxit = 30)
diag <- diagnose_mfrm(fit, residual_pca = "none")
bias <- estimate_bias(fit, diag, facet_a = "Rater", facet_b = "Criterion", max_iter = 2)
s_bias <- summary(bias)
s_bias$overview
# Look for: `MaxAbsBias` < ~0.5 logits and `Significant = 0` mean
#   no cell exceeded the screen. The `BonferroniSignificant` /
#   `HolmSignificant` columns count cells that survive multiple-
#   testing correction; both being 0 is a stronger "no bias"
#   signal than the raw screen-positive count alone.
s_bias$top_rows
# Look for: rows with `|t|` > 2 and |Bias Size| > 0.5 logits warrant
#   review (large effect AND statistically reliable). Rows with only
#   one of those triggered are usually small-cell artefacts.
p_bias <- plot_bias_interaction(bias, draw = FALSE)
p_bias$data$plot

mfrmr documentation built on June 13, 2026, 1:07 a.m.

mfrmr index

Package overview README.md GPCM scope and current limitations mfrmr Linking and DFF mfrmr Reporting and APA mfrmr Visual Diagnostics mfrmr Workflow Migrating from Facets to mfrmr MML estimation and marginal-fit diagnostics

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

mfrmr
Estimation and Diagnostics for Many-Facet Measurement Models

estimate_bias: Estimate bias and interaction screening terms
In mfrmr: Estimation and Diagnostics for Many-Facet Measurement Models

Estimate bias and interaction screening terms

Description

Usage

Arguments

Details

Value

What this screening means

What this screening does not justify

Interpreting output

Typical workflow

Interpreting key output columns

Recommended next step

References

See Also

Examples

Related to estimate_bias in mfrmr...

R Package Documentation

Browse R Packages

We want your feedback!

mfrmr Estimation and Diagnostics for Many-Facet Measurement Models

estimate_bias: Estimate bias and interaction screening terms In mfrmr: Estimation and Diagnostics for Many-Facet Measurement Models

Estimate bias and interaction screening terms

Description

Usage

Arguments

Details

Value

What this screening means

What this screening does not justify

Interpreting output

Typical workflow

Interpreting key output columns

Recommended next step

References

See Also

Examples

Related to estimate_bias in mfrmr...

R Package Documentation

Browse R Packages

We want your feedback!

mfrmr
Estimation and Diagnostics for Many-Facet Measurement Models

estimate_bias: Estimate bias and interaction screening terms
In mfrmr: Estimation and Diagnostics for Many-Facet Measurement Models