abbreviate_scale: Abbreviate a scale

Description

This function abbreviates a scale by iteratively adding the items that are most useful, or removing the items that are least useful, for predicting a criterion variable. Items can be selected with a max-loading, backwards, forwards, or genetic-algorithm method.
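The forwards idea can be sketched in base R. This is an illustration only, not the package's implementation: the simulated data, the helper names `cronbach_alpha`, `score_set`, and `forward_select`, and the use of Cronbach's alpha as the reliability measure are all assumptions made for this example. The objective is the average of reliability and correlation with the criterion, in the spirit of `selection_method = "rc"`.

```r
# Illustrative greedy forward selection; NOT the kirkegaard implementation.
set.seed(1)
n = 500   # respondents
k = 12    # items in the full scale

# Simulate binary 2PL-style responses in base R
theta = rnorm(n)                 # latent trait
a = runif(k, 0.5, 2)             # discriminations
d = rnorm(k, sd = 0.5)           # intercepts
p = plogis(outer(theta, a) + matrix(d, n, k, byrow = TRUE))
items = matrix(rbinom(n * k, 1, p), n, k,
               dimnames = list(NULL, paste0("item_", seq_len(k))))
criterion = theta + rnorm(n)     # hypothetical criterion variable

# Cronbach's alpha; undefined for fewer than two items
cronbach_alpha = function(x) {
  m = ncol(x)
  if (m < 2) return(NA_real_)
  (m / (m - 1)) * (1 - sum(apply(x, 2, var)) / var(rowSums(x)))
}

# "rc"-style objective: mean of reliability and criterion correlation
# (falls back to the correlation alone while only one item is selected)
score_set = function(cols) {
  x = items[, cols, drop = FALSE]
  mean(c(cronbach_alpha(x), cor(rowSums(x), criterion)), na.rm = TRUE)
}

# At each step, add the remaining item that maximizes the objective
forward_select = function(item_target) {
  selected = character(0)
  remaining = colnames(items)
  while (length(selected) < item_target) {
    gains = vapply(remaining, function(j) score_set(c(selected, j)), numeric(1))
    best = names(which.max(gains))
    selected = c(selected, best)
    remaining = setdiff(remaining, best)
  }
  selected
}

short = forward_select(5)
print(short)
```

A backwards variant would start from all items and repeatedly drop the item whose removal leaves the objective highest.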

Usage

abbreviate_scale(
  items,
  criterion_vars = NULL,
  item_target,
  method = "forwards",
  selection_method = "rc",
  mirt_args = NULL,
  save_fits = T,
  seed = 1,
  max_generations = 100,
  population_size = 100,
  mutation_rate = 0.1,
  selection_ratio = 0.2,
  stop_search_after_generations = 10,
  include_parents = T,
  difficulty_balance_groups = NULL,
  residualize_loadings = F,
  reliability_at = "total",
  IRT = T
)

Arguments

`items`	A data frame or matrix of items.
`criterion_vars`	A data frame of criterion variables. Default is NULL.
`item_target`	The number of items to retain.
`method`	The method to use for item selection. Options are "backwards", "forwards", "max_loading", or "genetic". Default is "forwards".
`selection_method`	The criterion used to evaluate item sets. Options are "rc" (average of correlation with criterion variable(s) and reliability), "r" (reliability), or "c" (correlation with criterion variable(s)). Default is "rc".
`mirt_args`	A list of arguments to pass to the mirt function. Default is NULL.
`save_fits`	Whether to save the fits and scores for each item set. This can be useful for further analysis, but it uses more memory. Default is TRUE.
`seed`	A seed to use for reproducibility. Default is 1.
`max_generations`	The maximum number of generations to use for the genetic algorithm. Default is 100.
`population_size`	The size of the population to use for the genetic algorithm. Default is 100.
`mutation_rate`	The mutation rate to use for the genetic algorithm. Default is 0.1.
`selection_ratio`	The ratio of the population to select for the next generation. Default is 0.2.
`stop_search_after_generations`	The number of generations to wait for no improvement before stopping the search. Default is 10.
`include_parents`	Whether to include the parents in the next generation. Default is TRUE.
`difficulty_balance_groups`	The number of groups to balance difficulty across. Default is NULL.
`residualize_loadings`	Whether to residualize loadings based on difficulty for selection purposes. Default is FALSE.
`reliability_at`	The reliability to use for selection. Default is "total".
`IRT`	Whether to use IRT or classical test theory. Default is TRUE. If FALSE, reliability is calculated as Cronbach's alpha, loadings are calculated as biserial correlations, and the score is calculated as the number of correct items. This is much faster to run but less accurate.
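To show how the genetic-algorithm arguments above relate to one another, here is a generic genetic-algorithm loop for choosing an item subset, written in base R. It is a sketch under stated assumptions, not the package's code: the `fitness` function is a stand-in built from made-up per-item values, and the `crossover` and `mutate` operators are hypothetical choices. Only the argument names and their defaults are taken from the Usage section.

```r
# Generic GA sketch for subset selection; NOT the kirkegaard implementation.
set.seed(1)
n_items = 20
item_target = 10

# Argument names and defaults as listed in Usage
population_size = 100
mutation_rate = 0.1
selection_ratio = 0.2
max_generations = 100
stop_search_after_generations = 10
include_parents = TRUE

# Stand-in fitness: sum of hypothetical per-item usefulness values
usefulness = runif(n_items)
fitness = function(set) sum(usefulness[set])

# Hypothetical operators: a child samples its items from both parents,
# and a mutation swaps one selected item for an unselected one
crossover = function(p1, p2) {
  pool = union(p1, p2)
  sort(pool[sample.int(length(pool), item_target)])
}
mutate = function(set) {
  if (runif(1) < mutation_rate) {
    unused = setdiff(seq_len(n_items), set)
    set[sample.int(length(set), 1)] = unused[sample.int(length(unused), 1)]
  }
  sort(set)
}

# Each individual is a vector of item indices
population = replicate(population_size, sort(sample.int(n_items, item_target)),
                       simplify = FALSE)
best_fit = -Inf
stale = 0

for (gen in seq_len(max_generations)) {
  fits = vapply(population, fitness, numeric(1))
  ord = order(fits, decreasing = TRUE)

  # Track the best set; stop after too many generations without improvement
  if (fits[ord[1]] > best_fit) {
    best_fit = fits[ord[1]]
    best_set = population[[ord[1]]]
    stale = 0
  } else {
    stale = stale + 1
    if (stale >= stop_search_after_generations) break
  }

  # Keep the top selection_ratio share as parents, then breed children
  n_parents = max(2, round(selection_ratio * population_size))
  parents = population[ord[seq_len(n_parents)]]
  n_children = population_size - if (include_parents) n_parents else 0
  children = replicate(n_children, {
    pair = sample.int(n_parents, 2)
    mutate(crossover(parents[[pair[1]]], parents[[pair[2]]]))
  }, simplify = FALSE)
  population = c(if (include_parents) parents, children)
}

print(best_set)
```

In this sketch, a larger `selection_ratio` keeps more diversity among parents but slows convergence, and `include_parents = TRUE` guarantees that the best set found so far is never lost between generations.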

Value

A list of results. To plot them, pass the list to GG_scale_abbreviation().

Examples

library(mirt)
library(kirkegaard)

# simulate 2PL data: 20 items, 1,000 respondents
set.seed(1)
dat = mirt::simdata(N = 1e3, itemtype = "2PL", a = runif(20, 0.5, 2), d = rnorm(20, sd = 0.5))

# fit a unidimensional model to the full scale (not used by the calls below)
fit = mirt::mirt(dat, 1)

# abbreviate the scale to 10 items with the max-loading method
short_scale = abbreviate_scale(as.data.frame(dat), method = "max_loading", item_target = 10)

# plot the results
GG_scale_abbreviation(short_scale)

# the same abbreviation using CTT statistics instead of IRT
short_scale = abbreviate_scale(as.data.frame(dat), method = "max_loading", item_target = 10, IRT = FALSE)

Deleetdk/kirkegaard documentation built on June 8, 2025, 4:09 a.m.