fscores    R Documentation

Compute factor score estimates (a.k.a. ability estimates, latent trait estimates, etc.)

Description

Computes MAP, EAP, ML (Embretson & Reise, 2000), EAP for sum-scores (Thissen et al., 1995), or WLE (Warm, 1989) factor scores with a multivariate normal prior distribution using equally spaced quadrature. EAP scores are generally not recommended for models with more than three factors because the integration grid becomes very large, which slows estimation and reduces precision when quadpts is too low. For higher-dimensional models, MAP scores should be used instead of EAP scores. Multiple imputation variants are possible for each estimator if a parameter information matrix was computed; these are useful when the sample size or the number of items is small. If the model contained latent regression predictors, this information will be used in computing MAP and EAP estimates (for these models, full.scores=TRUE will always be used). Finally, plausible value imputation is also available, and it likewise accounts for latent regression predictor effects.
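To make the quadrature idea concrete, the following base-R sketch computes an EAP score, its posterior standard deviation, and a grid approximation to the MAP for a single response pattern. It is only an illustration of the general technique, not the code used inside fscores: the 2PL item parameters a and d and the response vector resp are hypothetical, and the modal estimators in this function rely on iterative optimization rather than a grid search.

```r
# Hypothetical 2PL slopes (a) and intercepts (d); not taken from a fitted model
a <- c(1.2, 0.8, 1.5)
d <- c(0.0, -0.5, 0.7)
resp <- c(1, 0, 1)                       # one observed response pattern

theta <- seq(-6, 6, length.out = 121)    # equally spaced nodes
prior <- dnorm(theta)                    # standard normal prior weights

# probability of a correct response at each node (rows) for each item (columns)
P <- plogis(outer(theta, a) + matrix(d, length(theta), length(a), byrow = TRUE))

# likelihood of the observed pattern at each node
like <- apply(P, 1, function(p) prod(p^resp * (1 - p)^(1 - resp)))

post <- like * prior / sum(like * prior)     # normalized posterior over the nodes
eap  <- sum(theta * post)                    # posterior mean (EAP)
se   <- sqrt(sum((theta - eap)^2 * post))    # posterior standard deviation
map  <- theta[which.max(post)]               # grid approximation to the posterior mode

c(EAP = eap, SE = se, MAP_grid = map)
```

With more factors the same sums run over a grid whose size grows as a power of the number of nodes per dimension, which is why the description steers higher-dimensional models toward MAP.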

Usage

fscores(
  object,
  method = "EAP",
  full.scores = TRUE,
  rotate = "oblimin",
  Target = NULL,
  response.pattern = NULL,
  append_response.pattern = FALSE,
  na.rm = FALSE,
  plausible.draws = 0,
  plausible.type = "normal",
  quadpts = NULL,
  item_weights = rep(1, extract.mirt(object, "nitems")),
  returnER = FALSE,
  T_as_X = FALSE,
  return.acov = FALSE,
  mean = NULL,
  cov = NULL,
  covdata = NULL,
  verbose = TRUE,
  full.scores.SE = FALSE,
  theta_lim = c(-6, 6),
  MI = 0,
  use_dentype_estimate = FALSE,
  QMC = FALSE,
  custom_den = NULL,
  custom_theta = NULL,
  min_expected = 1,
  max_theta = 20,
  start = NULL,
  ...
)

Arguments

object

a computed model object of class SingleGroupClass, MultipleGroupClass, or DiscreteClass

method

type of factor score estimation method. Can be:

  • "EAP" for the expected a-posteriori (default). For models fit using mdirt this will return the posterior classification probabilities

  • "MAP" for the maximum a-posteriori (i.e., Bayes modal)

  • "ML" for maximum likelihood

  • "WLE" for weighted likelihood estimation

  • "EAPsum" for the expected a-posteriori for each sum score

  • "plausible" for a single plausible value imputation for each case. This is equivalent to setting plausible.draws = 1

  • "classify" for the posterior classification probabilities (only applicable when the input model was of class MixtureClass)

full.scores

if FALSE then a summary table with factor scores for each unique pattern is displayed as a formatted matrix object. Otherwise, a matrix of factor scores for each response pattern in the data is returned (default)

rotate

prior rotation to be used when estimating the factor scores. See summary-method for details. If the object is not an exploratory model then this argument is ignored

Target

target rotation; see summary-method for details

response.pattern

an optional argument used to calculate the factor scores and standard errors for a given response vector or matrix/data.frame

append_response.pattern

logical; should the inputs from response.pattern also be appended to the factor score output?

na.rm

logical; remove rows with any missing values? This is generally not required due to the nature of computing factor scores; however, for the "EAPsum" method this may be necessary to ensure that the sum-scores correspond to the same composite score

plausible.draws

number of plausible values to draw for future researchers to perform secondary analyses of the latent trait scores. Typically used in conjunction with latent regression predictors (see mirt for details), but can also be generated when no predictor variables were modelled. If plausible.draws is greater than 0 a list of plausible values will be returned
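A secondary analysis is typically run once per plausible-value draw and the results are then pooled. The sketch below pools hypothetical estimates with Rubin's rules in base R; the numbers in est and se are invented for illustration, and the interface of the averageMI function listed under See Also is not shown or assumed here.

```r
# Hypothetical results from m = 5 secondary analyses, one per plausible-value draw
est <- c(0.52, 0.48, 0.55, 0.50, 0.47)   # point estimates
se  <- c(0.10, 0.11, 0.10, 0.12, 0.10)   # their standard errors

m <- length(est)
pooled  <- mean(est)                      # pooled point estimate
within  <- mean(se^2)                     # average within-draw variance
between <- var(est)                       # between-draw variance
total   <- within + (1 + 1/m) * between   # Rubin's total variance
pooled_se <- sqrt(total)

c(estimate = pooled, SE = pooled_se)
```

The pooled standard error is never smaller than the average within-draw standard error, since the between-draw term adds the uncertainty due to the latent scores being imputed.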

plausible.type

type of plausible values to obtain. Can be either 'normal' (default) to use a normal approximation based on the ACOV matrix, or 'MH' to obtain Metropolis-Hastings samples from the posterior (silently passes object to mirt, therefore arguments like technical can be supplied to increase the number of burn-in draws and discarded samples)

quadpts

number of quadrature nodes to use per dimension. If not specified, a suitable value will be chosen that decreases as the number of dimensions increases (and therefore estimates such as EAP will be less accurate). This is determined from the switch statement quadpts <- switch(as.character(nfact), '1'=121, '2'=61, '3'=31, '4'=19, '5'=11, '6'=7, 5)
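The effect of this default can be tabulated directly. The sketch below wraps the quoted switch statement in a hypothetical helper, default_quadpts (not a mirt function), and computes the total number of nodes under the assumption that a full rectangular grid of quadpts^nfact points is evaluated.

```r
# Hypothetical helper wrapping the switch statement quoted in the quadpts entry
default_quadpts <- function(nfact) {
  switch(as.character(nfact),
         '1' = 121, '2' = 61, '3' = 31, '4' = 19, '5' = 11, '6' = 7, 5)
}

nfact <- 1:7
per_dim <- sapply(nfact, default_quadpts)   # nodes per dimension
grid_size <- per_dim^nfact                  # total nodes in a full rectangular grid

data.frame(nfact, per_dim, grid_size)
```

Even though the number of nodes per dimension shrinks quickly, the full grid still reaches tens of thousands of points by three dimensions, while the coarse per-dimension spacing costs accuracy.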

item_weights

a user-defined weight vector used in the likelihood expressions to add more/less weight for a given observed response. Default is a vector of 1's, indicating that all the items receive the same weight

returnER

logical; return empirical reliability (also known as marginal reliability) estimates as numeric values?

T_as_X

logical; should the observed variance be equal to var(X) = var(T) + E(E^2) or var(X) = var(T) when computing empirical reliability estimates? Default (FALSE) uses the former
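The two definitions of observed variance lead to two different reliability ratios. The sketch below works through both with hypothetical score estimates and standard errors (theta_hat and se are invented, not fscores output). The first ratio follows directly from var(X) = var(T) + E(E^2); the second is an assumed reading of var(X) = var(T), in which the error variance is subtracted from the variance of the estimates, and it may not match the package's internal arithmetic exactly.

```r
# Hypothetical score estimates and standard errors (not output from fscores)
theta_hat <- c(-1.2, -0.4, 0.1, 0.6, 1.3)
se        <- c(0.45, 0.40, 0.38, 0.41, 0.47)

var_T <- var(theta_hat)   # variance of the score estimates
err   <- mean(se^2)       # E(E^2): mean squared standard error

# var(X) = var(T) + E(E^2): reliability = var(T) / var(X)
rxx_default <- var_T / (var_T + err)

# var(X) = var(T): reliability = (var(X) - E(E^2)) / var(X)  (assumed form)
rxx_T_as_X <- (var_T - err) / var_T

c(default = rxx_default, T_as_X = rxx_T_as_X)
```

Under these assumptions the second ratio is always the smaller of the two, because (var_T - err) * (var_T + err) is less than var_T^2 whenever err is positive.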

return.acov

logical; return a list containing covariance matrices instead of factor scores? impute = TRUE not supported with this option

mean

a vector for custom latent variable means. If NULL, the default for 'group' values from the computed mirt object will be used

cov

a custom latent variable covariance matrix. If NULL, the default for 'group' values from the computed mirt object will be used

covdata

when a latent regression model has been fitted, and the response.pattern input is used to score individuals, then this argument is used to include the latent regression covariate terms for each row vector supplied to response.pattern

verbose

logical; print verbose output messages?

full.scores.SE

logical; when full.scores == TRUE, also return the standard errors associated with each respondent? Default is FALSE

theta_lim

lower and upper range to evaluate latent trait integral for each dimension. If omitted, a range will be generated automatically based on the number of dimensions

MI

a number indicating how many multiple imputation draws to perform. Default is 0, indicating that no MI draws will be performed

use_dentype_estimate

logical; if the density of the latent trait was estimated in the model (e.g., via Davidian curves or empirical histograms), should this information be used to compute the latent trait estimates? Only applicable for EAP-based estimates (EAP, EAPsum, and plausible)

QMC

logical; use quasi-Monte Carlo integration? If quadpts is omitted the default number of nodes is 5000

custom_den

a function used to define the integration density (if required). The NULL default assumes that the multivariate normal distribution with the 'GroupPars' hyper-parameters is used. At a minimum, it must be of the form:

function(Theta, ...)

where Theta is a matrix of latent trait values (will be a grid of values if method == 'EAPsum' or method == 'EAP', otherwise Theta will have only 1 row). Additional arguments may be included and are caught through the fscores(...) input. The function must return a numeric vector of density weights (one for each row in Theta)
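As a check on the required shape of the return value, the sketch below defines a hypothetical density, den_indep_normal (independent standard normals across dimensions), and confirms that it yields one weight per row for both a grid and a single-row Theta. It runs in base R without a fitted model; passing such a function to fscores is left to the examples further down.

```r
# Hypothetical custom density: product of independent standard normals,
# written so that it works for any number of columns in Theta
den_indep_normal <- function(Theta, ...) {
  Theta <- as.matrix(Theta)
  exp(-0.5 * rowSums(Theta^2)) / (2 * pi)^(ncol(Theta) / 2)
}

# a small two-dimensional grid, as would be used by EAP-type methods
grid <- as.matrix(expand.grid(seq(-6, 6, length.out = 5),
                              seq(-6, 6, length.out = 5)))
w_grid <- den_indep_normal(grid)

# a single row, as would be used by modal methods such as MAP
w_one <- den_indep_normal(matrix(c(0.3, -0.2), nrow = 1))

length(w_grid) == nrow(grid)   # one weight per row of Theta
length(w_one) == 1
```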

custom_theta

a matrix of custom integration nodes to use instead of the default, where each column corresponds to the respective dimension in the model

min_expected

when computing goodness of fit tests when method = 'EAPsum', this value is used to collapse across the conditioned total scores until the expected values are greater than this value. Note that this only affects the goodness of fit tests and not the returned EAP for sum scores table

max_theta

the maximum (and, with the sign reversed, minimum) value that any given factor score estimate can take when using a modal estimator method (e.g., MAP, WLE, ML)

start

a matrix of starting values to use for iterative estimation methods. Default will start at a vector of 0's for each response pattern, or will start at the EAP estimates (unidimensional models only). Must be in the form that matches full.scores = FALSE (mostly used in the mirtCAT package)

...

additional arguments to be passed to nlm

Details

The function will return either a table with the computed scores and standard errors, the original data matrix with scores appended to the rightmost column, or the scores only. By default the latent means and covariances are determined from the estimated object, though these can be overwritten. Iterative estimation methods can be run in parallel to decrease estimation times if a mirtCluster object is available.

If the input object is a discrete latent class object estimated from mdirt then the returned results will be with respect to the posterior classification for each individual. The method inputs for 'DiscreteClass' objects may only be 'EAP', for posterior classification of each response pattern, or 'EAPsum' for posterior classification based on the raw sum-score. For more information on these algorithms refer to the mirtCAT package and the associated JSS paper (Chalmers, 2016).

Author(s)

Phil Chalmers rphilip.chalmers@gmail.com

References

Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06

Chalmers, R. P. (2016). Generating Adaptive and Non-Adaptive Test Interfaces for Multidimensional Item Response Theory Applications. Journal of Statistical Software, 71(5), 1-39. doi:10.18637/jss.v071.i05

Embretson, S. E. & Reise, S. P. (2000). Item Response Theory for Psychologists. Erlbaum.

Thissen, D., Pommerich, M., Billeaud, K., & Williams, V. S. L. (1995). Item Response Theory for Scores on Tests Including Polytomous Items with Ordered Responses. Applied Psychological Measurement, 19, 39-49.

Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427-450.

See Also

averageMI

Examples


mod <- mirt(Science)
tabscores <- fscores(mod, full.scores = FALSE)
head(tabscores)

# convert scores into expected total score information with 95% CIs
E.total <- expected.test(mod, Theta=tabscores[,'F1'])
E.total_2.5 <- expected.test(mod, Theta=tabscores[,'F1'] +
                                       tabscores[,'SE_F1'] * qnorm(.05/2))
E.total_97.5 <- expected.test(mod, Theta=tabscores[,'F1'] +
                                       tabscores[,'SE_F1'] * qnorm(1-.05/2))

data.frame(Total_score=rowSums(tabscores[,1:4]),
           E.total, E.total_2.5, E.total_97.5) |> head()

## Not run: 
fullscores <- fscores(mod)
fullscores_with_SE <- fscores(mod, full.scores.SE=TRUE)
head(fullscores)
head(fullscores_with_SE)

# convert scores into expected total score information with 95% CIs
E.total <- expected.test(mod, Theta=fullscores[,'F1'])
E.total_2.5 <- expected.test(mod, Theta=fullscores_with_SE[,'F1'] +
                                 fullscores_with_SE[,'SE_F1'] * qnorm(.05/2))
E.total_97.5 <- expected.test(mod, Theta=fullscores_with_SE[,'F1'] +
                               fullscores_with_SE[,'SE_F1'] * qnorm(1-.05/2))

data.frame(Total_score=rowSums(Science),
           E.total, E.total_2.5, E.total_97.5) |> head()

# change method argument to use MAP estimates
fullscores <- fscores(mod, method='MAP')
head(fullscores)

# calculate MAP for a given response vector
fscores(mod, method='MAP', response.pattern = c(1,2,3,4))
# or matrix
fscores(mod, method='MAP', response.pattern = rbind(c(1,2,3,4), c(2,2,1,3)))

# return only the scores and their SEs
fscores(mod, method='MAP', response.pattern = c(1,2,3,4))

# use custom latent variable properties (diffuse prior for MAP is very close to ML)
fscores(mod, method='MAP', cov = matrix(1000), full.scores = FALSE)
fscores(mod, method='ML', full.scores = FALSE)

# EAPsum table of values based on total scores
(fs <- fscores(mod, method = 'EAPsum', full.scores = FALSE))

# convert expected counts back into marginal probability distribution
within(fs,
   `P(y)` <- expected / sum(observed))

# list of error VCOV matrices for EAPsum (works for other estimators as well)
acovs <- fscores(mod, method = 'EAPsum', full.scores = FALSE, return.acov = TRUE)
acovs

# WLE estimation, run in parallel using available cores
if(interactive()) mirtCluster()
head(fscores(mod, method='WLE', full.scores = FALSE))

# multiple imputation using 30 draws for EAP scores. Requires information matrix
mod <- mirt(Science, 1, SE=TRUE)
fs <- fscores(mod, MI = 30)
head(fs)

# plausible values for future work
pv <- fscores(mod, plausible.draws = 5)
lapply(pv, function(x) c(mean=mean(x), var=var(x), min=min(x), max=max(x)))

## define a custom_den function (*must* return a numeric vector).
#  EAP with a uniform prior between -3 and 3
fun <- function(Theta, ...) as.numeric(dunif(Theta, min = -3, max = 3))
head(fscores(mod, custom_den = fun))

# compare EAP estimators with same modified prior
fun <- function(Theta, ...) as.numeric(dnorm(Theta, mean=.5))
head(fscores(mod, custom_den = fun))
head(fscores(mod, method = 'EAP', mean=.5))

# custom MAP prior: standard normal truncated between -2 and 5
library(msm)
# need the :: scope for parallel to see the function (not required if no mirtCluster() defined)
fun <- function(Theta, ...) msm::dtnorm(Theta, mean = 0, sd = 1, lower = -2, upper = 5)
head(fscores(mod, custom_den = fun, method = 'MAP', full.scores = FALSE))


####################
# scoring via response.pattern input (with latent regression structure)
# simulate data
set.seed(1234)
N <- 1000

# covariates
X1 <- rnorm(N); X2 <- rnorm(N)
covdata <- data.frame(X1, X2)
Theta <- matrix(0.5 * X1 + -1 * X2 + rnorm(N, sd = 0.5))

# items and response data
a <- matrix(1, 20); d <- matrix(rnorm(20))
dat <- simdata(a, d, 1000, itemtype = '2PL', Theta=Theta)

# conditional model using X1 and X2 as predictors of Theta
mod <- mirt(dat, 1, 'Rasch', covdata=covdata, formula = ~ X1 + X2)
coef(mod, simplify=TRUE)

# all EAP estimates that include latent regression information
fs <- fscores(mod, full.scores.SE=TRUE)
head(fs)

# score only two response patterns
rp <- dat[1:2, ]
cd <- covdata[1:2, ]

fscores(mod, response.pattern=rp, covdata=cd)
fscores(mod, response.pattern=rp[2,], covdata=cd[2,]) # just one pattern


## End(Not run)

mirt documentation built on Sept. 11, 2024, 7:14 p.m.