mice.impute.truescore: Custom mice imputation function for true score imputation

Description

This custom imputation function is used with the mice package: setting method='truescore' for a variable causes mice to call mice.impute.truescore and impute that variable by true score imputation. Although it can be called directly, the function is not meant to be run on its own; see the documentation for other mice imputation functions, e.g., mice.impute.pmm, for details on this usage. Example usage through the mice package is provided in Examples below.

Usage

mice.impute.truescore(y, ry, x, wy = NULL, calibration = NULL, ...)

Arguments

y

Vector to be imputed

ry

Logical vector of length length(y) indicating the subset y[ry] of elements in y to which the imputation model is fitted. In general, ry distinguishes the observed (TRUE) from the missing (FALSE) values in y.

x

Numeric design matrix with length(y) rows containing the predictors for y. The matrix x must not contain missing values.

wy

Logical vector of length length(y). A TRUE value indicates locations in y for which imputations are created.

calibration

A list of calibration information used for true score imputation. See below for details.

...

Other named arguments.

Value

Vector with imputed data, same type as y, and of length sum(wy)
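
To make the argument and return-value contract concrete, here is a minimal sketch of a custom imputation function with the same core signature. The function name mice.impute.meandraw and its normal-draw logic are hypothetical and chosen only for illustration; this is not how mice.impute.truescore computes its imputations.

```r
# Hypothetical toy imputation function following the mice.impute.* calling
# convention. It draws values from a normal distribution fitted to y[ry].
mice.impute.meandraw <- function(y, ry, x, wy = NULL, ...) {
  # Assumption: when wy is not supplied, impute wherever y is not observed
  if (is.null(wy)) wy <- !ry
  obs <- y[ry]
  # One draw per TRUE element of wy, so the result has length sum(wy)
  rnorm(sum(wy), mean = mean(obs), sd = sd(obs))
}

set.seed(1)
y  <- c(2.1, NA, 3.4, NA, 2.9)
ry <- !is.na(y)
x  <- matrix(1, nrow = length(y), ncol = 1)  # placeholder design matrix
imp <- mice.impute.meandraw(y, ry, x)
length(imp) == sum(!ry)
```

The toy function ignores x entirely; a real imputation method would use the predictors in x to model y.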

Passing Calibration Information to mice

The calibration parameter is passed to the mice function through its blots argument. For each imputed true score, provide the calibration information as a named list. The following elements are required, in any order:

os_name

Name of the variable in the data set containing the observed scores used for true score imputation

score_type

Type of score provided. Current options are 'CTT', corresponding to the classical test theory model of reliability; 'EAP', corresponding to expected a posteriori scoring in item response theory; and 'ML', corresponding to maximum likelihood scoring in item response theory. Each score_type requires specific other elements to be provided in calibration data; see below for these conditional elements.

mean

The mean of the score metric from calibration. For example, T scores are calibrated to a mean of 50, so if T scores are used, mean should be set to 50.

var_ts

The variance of the score metric from calibration. For example, T scores are calibrated to a standard deviation of 10, so if T scores are used, var_ts should be set to 100.

In addition, each score_type requires specific other elements to be provided in calibration data:

se_name

Required if score_type == 'EAP' or score_type == 'ML'. Name of the variable in the data set containing the standard error estimates of the observed scores provided in os_name.

reliability

Required if score_type == 'CTT'. Reliability estimate denoting the ratio of true score to observed score variance, as estimated from calibration.
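
The required and conditional elements above can be checked before calling mice. The following is a minimal sketch: check_calibration is a hypothetical helper, not part of TSI or mice, and it uses the element names exactly as listed in this section.

```r
# Hypothetical helper (not part of TSI or mice): verifies that a calibration
# list carries the elements described above for its score_type.
check_calibration <- function(calibration) {
  score_type <- calibration$score_type
  if (is.null(score_type) || !score_type %in% c("CTT", "EAP", "ML")) {
    stop("score_type must be 'CTT', 'EAP', or 'ML'")
  }
  required <- c("os_name", "score_type", "mean", "var_ts")
  # CTT needs a reliability estimate; EAP and ML need a standard-error column
  conditional <- if (score_type == "CTT") "reliability" else "se_name"
  missing_elements <- setdiff(c(required, conditional), names(calibration))
  if (length(missing_elements) > 0) {
    stop("calibration is missing: ", paste(missing_elements, collapse = ", "))
  }
  invisible(TRUE)
}

# EAP scores on the T-score metric: mean 50, SD 10, so var_ts = 10^2 = 100
cal_eap <- list(os_name = "Fx", se_name = "SE.Fx", score_type = "EAP",
                mean = 50, var_ts = 100)
check_calibration(cal_eap)

# Wrapped for the blots argument: one entry per imputed true-score block
blots <- list(Tx = list(calibration = cal_eap))
```

Running the check up front turns a missing element into an immediate, readable error rather than a failure partway through the imputation iterations.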

Specifying the Predictor Matrix

Based on unpublished simulation results, the best-performing specification of the predictor matrix in mice appears to be one in which true scores are predicted from all observed variables, but imputed true scores are not used to predict other missing data. This is the default behavior of the TSI function, and unless further research indicates otherwise, we recommend the same specification when using this function with mice directly.
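
As a sketch of this recommendation, the predictor matrix can be built by name rather than typed out cell by cell. The variable names below are taken from the EAP example in this documentation; the construction itself is one possible approach, not a function provided by TSI or mice.

```r
# Variable names from the EAP example in this documentation
observed    <- c("Fx", "Fy", "SE.Fx", "SE.Fy", "m")
true_scores <- c("Tx", "Ty")
vars <- c(observed, true_scores)

# Rows are imputation targets, columns are predictors (the mice convention)
pred <- matrix(0, nrow = length(vars), ncol = length(vars),
               dimnames = list(vars, vars))
pred[, observed] <- 1  # observed variables predict every target
diag(pred) <- 0        # no variable predicts itself
# Columns Tx and Ty stay 0: imputed true scores predict nothing else
pred
```

The true-score rows contain a 1 for every observed variable, while the true-score columns are all 0, which is the pattern recommended above.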

Examples

library(mice)
library(TSI)  # the package this function belongs to

##############
# CTT SCORES #
# add an empty (NA) variable to the data set to store the imputed true scores
data_ctt_2=data_ctt
data_ctt_2$TRUE_w=NA

#true score imputation
set.seed(0)
mice.data=mice(data_ctt_2,m=5,
  blocks=list('TRUE_w'),
  method='truescore',
  calibration=list(os_name='w',
                   score_type='CTT',
                   reliability=0.6,
                   mean_ts=0,
                   var_ts=1),
  predictorMatrix=matrix(c(1,1,0),1,3,byrow=TRUE),
  printFlag=FALSE,
  remove.constant=FALSE)
mice.data

#analyze with imputed true scores
pool(with(mice.data,lm(TRUE_w~y)))

#compare standard deviations of observed and imputed true scores, averaged across the imputed data sets
mice.data=complete(mice.data,'all')
sds=sapply(mice.data,function(d)apply(d,2,sd))
apply(sds,1,mean)

##############
# EAP SCORES #
#add empty (NA) variables to the data set to store the imputed true scores
data_eap_2=data_eap
data_eap_2$Tx=NA
data_eap_2$Ty=NA

#true score imputation
set.seed(0)
mice.data=mice(data_eap_2,m=5,maxit=5,
  method=c('pmm','pmm','pmm','pmm','pmm',
           'truescore','truescore'),
  blocks=list(Fx="Fx",Fy="Fy",SE.Fx="SE.Fx",SE.Fy="SE.Fy",m="m",
              Tx='Tx',Ty='Ty'),
  blots=list(Tx=list(calibration=list(os_name='Fx',
                                      se_name='SE.Fx',
                                      score_type='EAP',
                                      mean=50,
                                      var_ts=100,
                                      separated=TRUE)),
             Ty=list(calibration=list(os_name='Fy',
                                      se_name="SE.Fy",
                                      score_type='EAP',
                                      mean=50,
                                      var_ts=100,
                                      separated=TRUE))),
  predictorMatrix=matrix(c(0,1,1,1,1,0,0,
                           1,0,1,1,1,0,0,
                           1,1,0,1,1,0,0,
                           1,1,1,0,1,0,0,
                           1,1,1,1,0,0,0,
                           1,1,1,1,1,0,0,
                           1,1,1,1,1,0,0),7,7,byrow=TRUE),
  printFlag=FALSE,
  remove.constant=FALSE)
mice.data

#multiple regression with imputed true scores
pool(with(mice.data,lm(Ty~Tx+m)))

mmansolf/TSI documentation built on Aug. 29, 2023, 2:38 a.m.