View source: R/api-estimation.R
| describe_mfrm_data | R Documentation |
Summarize MFRM input data (TAM-style descriptive snapshot)
describe_mfrm_data(
data,
person,
facets,
score,
weight = NULL,
rating_min = NULL,
rating_max = NULL,
keep_original = FALSE,
missing_codes = NULL,
include_person_facet = FALSE,
include_agreement = TRUE,
rater_facet = NULL,
context_facets = NULL,
agreement_top_n = NULL
)
data |
A data.frame in long format (one row per rating event). |
person |
Column name for person IDs. |
facets |
Character vector of facet column names. |
score |
Column name for observed score. |
weight |
Optional weight/frequency column name. |
rating_min |
Optional minimum category value. Supply with
|
rating_max |
Optional maximum category value. Supply with
|
keep_original |
Keep original category values. Use this with
|
missing_codes |
Optional. |
include_person_facet |
If |
include_agreement |
If |
rater_facet |
Optional rater facet name used for agreement summaries.
If |
context_facets |
Optional facets used to define matched contexts for
agreement. If |
agreement_top_n |
Optional maximum number of agreement pair rows. |
This function provides a compact descriptive bundle similar to the
pre-fit summaries commonly checked in TAM workflows:
sample size, score distribution, per-facet coverage, and linkage counts.
psych::describe() is used for numeric descriptives of score and weight.
Key data-quality checks to perform before fitting:
Sparse categories: any score category with fewer than 10 weighted observations may produce unstable threshold estimates (Linacre, 2002). Consider collapsing adjacent categories.
Unlinked elements: if a facet level has zero overlap with one or
more levels of another facet, the design is disconnected and
parameters cannot be placed on a common scale. Check
linkage_summary for low connectivity.
Extreme scores: persons or facet levels with all-minimum or all-maximum scores yield infinite logit estimates under JML; they are handled via Bayesian shrinkage under MML.
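The sparse-category and extreme-score checks above can be approximated by hand in base R. This is a minimal sketch on made-up data, not the package's internal logic; the `ratings` data frame, its column names, and the cutoff of 10 applied here are illustrative assumptions.

```r
# Hypothetical long-format ratings; column names are illustrative only.
ratings <- data.frame(
  Person    = rep(c("P1", "P2", "P3", "P4"), each = 4),
  Rater     = rep(c("R1", "R2"), times = 8),
  Criterion = rep(rep(c("C1", "C2"), each = 2), times = 4),
  Score     = c(0, 0, 0, 0,  1, 2, 1, 3,  2, 2, 3, 1,  3, 3, 3, 3),
  Weight    = 1
)

# Sparse categories: weighted count per score category, flagged below 10.
weighted_n <- tapply(ratings$Weight, ratings$Score, sum)
sparse_categories <- names(weighted_n)[weighted_n < 10]

# Extreme scores: persons rated at the minimum (or the maximum) on every row.
score_range <- range(ratings$Score)
person_min <- tapply(ratings$Score, ratings$Person, min)
person_max <- tapply(ratings$Score, ratings$Person, max)
extreme_persons <- names(person_min)[
  person_max == score_range[1] | person_min == score_range[2]
]
```

In this toy data every category falls below the cutoff, and persons P1 and P4 are extreme. The same pattern applies to facet levels by swapping `Person` for a facet column.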
A list of class mfrm_data_description with:
overview: one-row run-level summary
missing_by_column: missing counts in selected input columns
missing_rate_summary: per-column missingness rate summary
(one row per input column, with raw and proportion-of-N columns)
score_descriptives: output from psych::describe() for score
weight_descriptives: output from psych::describe() for weight
score_distribution: weighted and raw score frequencies over the prepared
score support. Unused boundary categories are retained when the rating
range was supplied explicitly; unused intermediate categories require
keep_original = TRUE.
facet_level_summary: per-level usage and score summaries
facet_crosstabs: pairwise observation-count crosstabs between
non-person facets (named list keyed "facetA__facetB"); used by
summary(ds)$design_links to flag sparse / disconnected
facet-pair coverage
linkage_summary: person-facet connectivity diagnostics
agreement: observed-score inter-rater agreement bundle
row_retention: row counts before and after preparation filters
preparation_notes: structured notes for row drops, ID trimming, and
design conditions detected during preparation
score_support: minimal prepared score-support metadata used by
summary(ds)$caveats
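To make the facet_crosstabs element concrete, the sketch below builds pairwise observation-count tables in base R and names them with the "facetA__facetB" key pattern described above. It only illustrates the idea: the data, the `Task` facet, and the empty-cell count are assumptions, and the package may compute its tables differently.

```r
# Hypothetical design: rater R3 never scores criterion C2.
ratings <- data.frame(
  Rater     = c("R1", "R1", "R2", "R2", "R3", "R3"),
  Criterion = c("C1", "C2", "C1", "C2", "C1", "C1"),
  Task      = c("T1", "T2", "T1", "T2", "T1", "T2")
)
facets <- c("Rater", "Criterion", "Task")

# One observation-count table per unordered pair of non-person facets.
pairs <- combn(facets, 2, simplify = FALSE)
crosstabs <- lapply(pairs, function(p) table(ratings[[p[1]]], ratings[[p[2]]]))
names(crosstabs) <- vapply(pairs, paste, character(1), collapse = "__")

# Empty cells are level combinations that were never observed.
empty_cells <- vapply(crosstabs, function(tab) sum(tab == 0), integer(1))
```

An empty cell signals sparse coverage for that facet pair. It does not by itself prove that the design is disconnected, so it is best read alongside linkage_summary.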
Recommended order for reviewing the returned components:
overview: confirms sample size, facet count, and category span.
The MinWeightedN column shows the smallest weighted observation
count across all facet levels; values below 30 may lead to
unstable parameter estimates.
missing_by_column: identifies immediate data-quality risks.
Any non-zero count warrants investigation before fitting.
score_distribution: checks sparse/unused score categories.
Balanced usage across categories is ideal; heavily skewed
distributions may compress the measurement range.
facet_level_summary and linkage_summary: checks per-level
support and person-facet connectivity. Low linkage ratios
indicate sparse or disconnected design blocks.
agreement: optional observed inter-rater consistency summary
(exact agreement, correlation, mean differences per rater pair).
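As a rough illustration of what an observed agreement summary measures, the following base-R sketch computes exact agreement and mean score difference for each rater pair over matched person-by-criterion contexts. The data and all object names are hypothetical, and this is not claimed to reproduce the agreement element returned by describe_mfrm_data().

```r
# Hypothetical ratings: every person-by-criterion context is scored by all raters.
ratings <- data.frame(
  Person    = rep(c("P1", "P2", "P3"), each = 6),
  Criterion = rep(rep(c("C1", "C2"), each = 3), times = 3),
  Rater     = rep(c("R1", "R2", "R3"), times = 6),
  Score     = c(2, 2, 3,  1, 1, 1,  3, 2, 3,  0, 1, 1,  2, 2, 2,  3, 3, 1)
)

# Reshape to one row per matched context and one column per rater.
context <- interaction(ratings$Person, ratings$Criterion, drop = TRUE)
wide <- tapply(ratings$Score, list(context, ratings$Rater), mean)

# Exact agreement and mean difference for every rater pair.
rater_pairs <- combn(colnames(wide), 2, simplify = FALSE)
agreement <- do.call(rbind, lapply(rater_pairs, function(p) {
  a <- wide[, p[1]]
  b <- wide[, p[2]]
  ok <- !is.na(a) & !is.na(b)  # keep contexts scored by both raters
  data.frame(
    rater_a   = p[1],
    rater_b   = p[2],
    n_matched = sum(ok),
    exact     = mean(a[ok] == b[ok]),
    mean_diff = mean(a[ok] - b[ok])
  )
}))
```

Each row of `agreement` describes one rater pair. Contexts scored by only one of the two raters are dropped before the comparison, which mirrors the idea of matched contexts.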
1. Run describe_mfrm_data() on long-format input.
2. Review summary(ds) and plot(ds, ...).
3. Resolve missingness/sparsity issues before fit_mfrm().
See also: fit_mfrm(), review_mfrm_anchors()
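# Load the bundled example data set
toy <- load_mfrmr_data("example_core")
# Describe the long-format ratings before fitting
ds <- describe_mfrm_data(
data = toy,
person = "Person",
facets = c("Rater", "Criterion"),
score = "Score"
)
# Summarize the description and inspect the run-level overview
s_ds <- summary(ds)
s_ds$overview
# Build the plot object without drawing it
p_ds <- plot(ds, draw = FALSE)
p_ds$data$plot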