TSI: True score imputation

TSI R Documentation

True score imputation

Description

Conduct true score imputation on variables with psychometric error, optionally in concert with multiple imputation for missing data. This function calls the mice function from the package of the same name, using the custom imputation function mice.impute.truescore to impute mismeasured variables. Direct calls to mice can get complicated (see the documentation of mice.impute.truescore for examples), so TSI was created as a convenience function that generates those calls more easily.

Usage

TSI(
  data,
  os_names,
  score_types,
  se_names = NULL,
  metrics = NULL,
  mean = NULL,
  var_ts = NULL,
  reliability = NULL,
  separated = rep(T, length(os_names)),
  ts_names = paste0("true_", os_names),
  mice_args
)

Arguments

data

Data frame on which to conduct imputation. By default, numeric columns with missing values are imputed with the pmm method from mice, columns named in os_names are imputed using true score imputation, and non-numeric columns are ignored.

os_names

Character vector of names of variables in data on which to use true score imputation.

score_types

Character vector specifying psychometric model(s) used for true score imputation. Currently available options are 'CTT' for classical test theory, 'EAP' for item response theory with expected a posteriori scoring, and 'ML' for item response theory with maximum likelihood scoring (not recommended). The selected model should match how scores were generated, which requires some understanding of the scoring process; for instance, HealthMeasures instruments, which include PROMIS, NIH Toolbox, and NeuroQOL measures, use EAP scoring to generate T scores and therefore score_types='EAP' would be appropriate when using these T scores.

se_names

Required for score_types='EAP' or score_types='ML'. Character vector of names of variables in data set containing standard errors for score variables specified by os_names. One variable must be specified for each variable in os_names. Not required for score_types='CTT'.

metrics

Character vector of metrics of true scores for imputation. Available values are 'z' for z scores (mean 0, variance 1), 'T' for T scores (mean 50, variance 100), and 'standard' for standard (IQ metric) scores (mean 100, variance 225). Either metrics or both mean and var_ts below must be specified for each variable, with each element referring to the corresponding variable named in os_names.
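The correspondence between metric names and moments described above can be written out as a small lookup. This is only a sketch in base R: metric_table and metric_to_moments are hypothetical names that are not part of the TSI package, and whether TSI resolves metrics this way internally is an assumption.

```r
# Means and variances implied by each metric name, as listed in the
# description of the metrics argument.
metric_table <- data.frame(
  metric = c("z", "T", "standard"),
  mean   = c(0, 50, 100),
  var_ts = c(1, 100, 225),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of TSI): look up the mean and variance for
# a vector of metric names, one per variable in os_names.
metric_to_moments <- function(metrics) {
  idx <- match(metrics, metric_table$metric)
  if (anyNA(idx)) {
    stop("unknown metric: ", paste(metrics[is.na(idx)], collapse = ", "))
  }
  metric_table[idx, c("mean", "var_ts")]
}

# Two T-score variables: both rows have mean 50 and variance 100.
moments <- metric_to_moments(c("T", "T"))
moments
```

Under this reading, passing metrics='T' should describe the same true score distribution as passing mean=50 and var_ts=100.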

mean

Numeric vector of means of true scores for imputation. Must be specified if metrics is not specified.

var_ts

Numeric vector of variances of true scores for imputation. Must be specified if metrics is not specified.

reliability

Required for score_types='CTT'. Numeric vector of reliability estimates, one for each observed score variable named in os_names, in the same order.
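As a reminder of what the reliability value represents under classical test theory, the base R simulation below generates true scores with variance 1 and adds independent error so that the ratio of true score variance to observed score variance is 0.6. All variable names here are illustrative, and the simulation does not call TSI; it only sketches the standard CTT definition.

```r
set.seed(1)
n <- 1e5
reliability <- 0.6
var_ts <- 1  # variance of the true scores

# Under CTT, observed = true + error, and
# reliability = var(true) / var(observed).
# Solving for the error variance gives var_ts * (1 - reliability) / reliability.
var_error <- var_ts * (1 - reliability) / reliability

true_score <- rnorm(n, mean = 0, sd = sqrt(var_ts))
observed   <- true_score + rnorm(n, mean = 0, sd = sqrt(var_error))

# Empirical reliability recovered from the simulated scores.
rel_hat <- var(true_score) / var(observed)
rel_hat

# Observed scores are more variable than true scores.
c(sd_true = sd(true_score), sd_observed = sd(observed))
```

This is why observed scores with reliability below 1 have a larger standard deviation than the true scores they measure.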

separated

Logical vector applying to variables imputed with score_types='EAP' or score_types='ML'. With separated=FALSE, true score imputation uses an average standard error, which runs faster but does not account for differential measurement error across respondents. With separated=TRUE (the default), a separate standard error is used for each value of each observed score, which runs slower but accounts for differential measurement error.
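The difference between the two settings can be illustrated with made-up standard errors. The sketch below assumes that "average" means the arithmetic mean of the standard errors; how TSI actually pools them is not stated here, so treat that as an assumption. The values and variable names are hypothetical.

```r
# Hypothetical standard errors for five respondents on one EAP-scored variable.
se <- c(2.0, 2.5, 3.0, 3.5, 4.0)

# Assumption: the pooled value is the arithmetic mean; TSI may pool differently.
se_pooled <- rep(mean(se), length(se))

# Error variance implied for each respondent under the two approaches.
comparison <- data.frame(
  se_separate  = se,
  se_pooled    = se_pooled,
  var_separate = se^2,
  var_pooled   = se_pooled^2
)
comparison
```

With a pooled value, respondents measured precisely are treated as noisier than they are, and respondents measured imprecisely are treated as more precise than they are.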

ts_names

Optional character vector of names for the true score variables to be created. Each element of ts_names names the variable that TSI creates from the observed scores in the corresponding element of os_names. By default, the prefix true_ is prepended to each element of os_names.

mice_args

Named list of additional arguments passed to the mice function.
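Because mice_args is documented as a named list, it is worth noting how list() and c() differ in base R. The sketch below uses only base R; the argument names m, maxit, and printFlag are taken from the examples on this page.

```r
# A named list keeps each argument's own type.
args_list <- list(m = 5, maxit = 5, printFlag = FALSE)
str(args_list)

# c() builds an atomic vector, so the logical FALSE is coerced to numeric 0.
args_vec <- c(m = 5, maxit = 5, printFlag = FALSE)
str(args_vec)

is.list(args_list)               # TRUE
is.list(args_vec)                # FALSE
is.logical(args_list$printFlag)  # TRUE
```

Building the arguments with list() avoids this coercion and matches the documented type.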

Examples

library(mice)  # provides pool() and complete() used below

##############
# CTT SCORES #
mice.data=TSI(data_ctt,
              os_names='w',
              score_types='CTT',
              reliability=0.6,
              mean=0,
              var_ts=1,
              mice_args=list(m=5,printFlag=F))
mice.data

# Analyze with imputed true scores
pool(with(mice.data,lm(true_w~y)))

# Compare standard deviations of observed and imputed true scores,
# averaged across the imputed data sets
completed=complete(mice.data,'all')
sds=sapply(completed,function(d)apply(d,2,sd))
apply(sds,1,mean)

##############
# EAP SCORES #
set.seed(0)
mice.data=TSI(data_eap,
              os_names=c('Fx','Fy'),
              se_names=c('SE.Fx','SE.Fy'),
              metrics='T',
              score_types='EAP',
              separated=T,
              ts_names=c('Tx','Ty'),
              mice_args=list(m=5,maxit=5,printFlag=F))
mice.data

# Multiple regression with imputed true scores
pool(with(mice.data,lm(Ty~Tx+m)))

mmansolf/TSI documentation built on Aug. 29, 2023, 2:38 a.m.