# mice.impute.plausible.values: Plausible Value Imputation using Classical Test Theory and...


### Description

This imputation function performs unidimensional plausible value imputation if (subject-wise) measurement errors or the reliability of the scale is known (Mislevy, 1991; see also Asparouhov & Muthen, 2010; Blackwell, Honaker & King, 2011, 2016a, 2016b). The function also allows the input of an individual likelihood obtained by fitting an item response model.
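To make the notion of an individual likelihood concrete, the following base R sketch builds a persons-by-grid likelihood matrix for a toy Rasch model and draws one plausible value per person from the resulting discrete posterior. It only illustrates the kind of object that `like` and `theta` describe. The item difficulties, the toy responses, and the standard normal prior are assumptions made for this example, and the code is not the package's internal implementation.

```r
set.seed(42)
n_persons <- 5
b <- c(-1, 0, 1)                           # hypothetical item difficulties
theta_grid <- seq(-4, 4, length.out = 41)  # grid of the latent variable

# toy dichotomous responses (persons x items)
resp <- matrix(stats::rbinom(n_persons * length(b), 1, 0.5), nrow = n_persons)

# Rasch success probabilities: rows = grid points, columns = items
p_grid <- stats::plogis(outer(theta_grid, b, "-"))

# individual likelihood: rows = persons, columns = grid points
like <- t(apply(resp, 1, function(r) {
  apply(p_grid, 1, function(p) prod(p^r * (1 - p)^(1 - r)))
}))

# combine with an assumed N(0, 1) prior and normalise each row
prior <- stats::dnorm(theta_grid, mean = 0, sd = 1)
post <- sweep(like, 2, prior, "*")
post <- post / rowSums(post)

# draw one plausible value per person from the discrete posterior
pv <- apply(post, 1, function(w) sample(theta_grid, size = 1, prob = w))
```

In practice the likelihood would come from a fitted item response model rather than from hand-written code, as in Example 2 of this page.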

### Usage

    mice.impute.plausible.values(y, ry, x, type, alpha = NULL, alpha.se = 0,
        scale.values = NULL, sig.e.miss = 1e+06, like = NULL, theta = NULL,
        normal.approx = NULL, pviter = 15, imputationWeights = rep(1, length(y)),
        plausible.value.print = TRUE, pls.facs = NULL, interactions = NULL,
        quadratics = NULL, extract_data = TRUE, ...)

### Arguments

| Argument | Description |
|---|---|
| y | Incomplete data vector of length n |
| ry | Vector of missing data pattern (FALSE = missing, TRUE = observed) |
| x | Matrix ($n \times p$) of complete covariates |
| type | Type of predictor variables. type=3 refers to items belonging to a scale to be imputed. A cluster (grouping) variable is defined by type=-2. If the cluster means of some predictors should also be included as predictors, specify type=2 for them (see Imputation Model 3 of Example 1). |
| alpha | A known reliability estimate. An optional standard error of the estimate can be provided in alpha.se. |
| alpha.se | Optional numeric value of the standard error of the alpha reliability estimate, used if a new reliability should be sampled in every iteration |
| scale.values | A list consisting of scale values and their corresponding standard errors (see Example 1) |
| sig.e.miss | A standard error of measurement for cases with missing values on a scale |
| like | Individual likelihood evaluated at theta |
| theta | Grid of the unidimensional latent variable |
| normal.approx | Logical indicating whether the individual posterior should be approximated by a normal distribution |
| pviter | Number of iterations to be run in each imputation before the plausible values are drawn |
| imputationWeights | Optional vector of sample weights |
| plausible.value.print | An optional logical indicating whether some information about the plausible value imputation should be printed to the console |
| pls.facs | Number of PLS factors if PLS dimension reduction is used |
| interactions | Vector of variable names used for creating interactions |
| quadratics | Vector of variable names used for creating quadratic terms |
| extract_data | Logical indicating whether input data should be extracted from the parent environment within the mice::mice routine |
| ... | Further objects to be passed |
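As a rough illustration of how a known reliability relates to the standard errors supplied in `scale.values`, the sketch below uses the classical test theory relation SE = SD * sqrt(1 - alpha) on toy data. The simulated items, the reliability of 0.8, and the name `scale2` are assumptions chosen for this example only.

```r
set.seed(123)
# hypothetical five-item scale with some missing responses
items <- matrix(stats::rnorm(200 * 5), ncol = 5)
items[sample(length(items), 50)] <- NA

score <- rowMeans(items)   # NA whenever any item of a person is missing
alpha_known <- 0.8         # assumed reliability of the scale score

# classical test theory: error variance = observed variance * (1 - reliability)
sem <- sqrt(stats::var(score, na.rm = TRUE) * (1 - alpha_known))

# list structure: one entry per scale with means "M" and standard errors "SE"
scale.values <- list(scale2 = list(M = score, SE = rep(sem, length(score))))
```

A constant standard error is used here for simplicity; the `SE` vector has one entry per case, so subject-wise values can be supplied instead.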

### Details

A linear model is assumed for drawing plausible values of a variable $Y$ contaminated by measurement error. Assuming $Y = \theta + e$ and a linear regression model for $\theta$

$$\theta = \mathbf{X} \beta + \varepsilon$$

(plausible value) imputations are drawn from the posterior distribution $P( \theta \mid Y , \mathbf{X} )$. See Mislevy (1991) for details.
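For intuition, the posterior has a closed form when both error terms are normal and all parameters are fixed. With measurement error standard deviation `sig_e` and residual standard deviation `sig_theta`, the posterior of θ given Y and X is normal with precision equal to the sum of the two precisions and a mean that is the precision-weighted average of the regression prediction and the observed score. The sketch below applies this conjugate update to simulated data. It treats β and both standard deviations as known, which is a simplifying assumption; it is not the function's own algorithm, and all variable names are illustrative.

```r
set.seed(1)
n <- 500
x <- stats::rnorm(n)

# assumed (known) parameters of the latent regression and the error model
beta <- 0.6        # regression coefficient of theta on x
sig_theta <- 0.8   # residual SD of the latent regression
sig_e <- 0.5       # SD of the measurement error

theta <- beta * x + stats::rnorm(n, sd = sig_theta)   # true scores
y <- theta + stats::rnorm(n, sd = sig_e)              # error-prone scores

# normal-normal conjugate update for each person
prior_mean <- beta * x
post_var <- 1 / (1 / sig_theta^2 + 1 / sig_e^2)
post_mean <- post_var * (prior_mean / sig_theta^2 + y / sig_e^2)

# one set of plausible values drawn from the posterior
pv <- stats::rnorm(n, mean = post_mean, sd = sqrt(post_var))
```

A larger `sig_e` pulls the posterior mean toward the regression prediction, which matches the role of a very large `sig.e.miss` for cases without an observed scale score.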

### Value

A vector of length nrow(x) containing imputed plausible values.

### Note

Plausible value imputation is also known as multiple overimputation (Blackwell, Honaker & King, 2016a, 2016b), which is implemented in the Amelia package; see Amelia::moPrep and Amelia::amelia.

### Author(s)

Alexander Robitzsch

### References

Asparouhov, T., & Muthen, B. (2010). Plausible values for latent variables using Mplus. Technical Report. https://www.statmodel.com/papers.shtml

Blackwell, M., Honaker, J., & King, G. (2011). Multiple overimputation: A unified approach to measurement error and missing data. Technical Report.

Blackwell, M., Honaker, J., & King, G. (2016a). A unified approach to measurement error and missing data: Overview and applications. Sociological Methods & Research, xx, xxx-xxx.

Blackwell, M., Honaker, J., & King, G. (2016b). A unified approach to measurement error and missing data: Details and extensions. Sociological Methods & Research, xx, xxx-xxx.

Mislevy, R. J. (1991). Randomization-based inference about latent variables from complex samples. Psychometrika, 56, 177-196.

### See Also

See TAM::tam.latreg for fitting latent regression models.

### Examples

    ## Not run:
    #############################################################################
    # EXAMPLE 1: Plausible value imputation for data.ma04 | 2 scales
    #############################################################################

    library(miceadds)   # also attaches mice and provides data.ma04

    data(data.ma04)
    dat <- data.ma04

    # Scale 1 consists of items A1,...,A4
    # Scale 2 consists of items B1,...,B5
    dat$scale1 <- NA
    dat$scale2 <- NA

    # empty imputation
    imp <- mice::mice( dat , m=0 , maxit=0 )
    summary(imp)

    # define predictors
    predM <- imp$pred
    # define imputation methods
    impMethod <- imp$method
    impMethod <- rep( "norm" , length(impMethod) )
    names(impMethod) <- names( imp$method )

    # look at missing proportions
    colSums( is.na(dat) )

    # redefine imputation methods for plausible value imputation
    impMethod[ "scale1" ] <- "plausible.values"
    predM[ "scale1" , ] <- 1
    # items corresponding to a scale should be declared by a 3 in the predictor matrix
    predM[ "scale1" , c("A1" , "A2" , "A3" , "A4" ) ] <- 3
    impMethod[ "scale2" ] <- "plausible.values"
    predM[ ,"scale2" ] <- 0
    predM[ "scale2" , c("A2","A3","A4","V6","V7") ] <- 1
    diag(predM) <- 0

    # use imputed scale values as predictors for V5, V6 and V7
    predM[ c("V5","V6","V7") , c("scale1","scale2" ) ] <- 1
    # exclude for V5, V6 and V7 the items of scales A and B as predictors
    predM[ c("V5","V6","V7") , c( paste0("A",2:4) , paste0("B",1:5) ) ] <- 0
    # exclude 'group' as a predictor
    predM[,"group"] <- 0

    # look at imputation method and predictor matrix
    impMethod
    predM

    #-------------------------------
    # Parameter for imputation

    #***
    # scale 1 (A1,...,A4)
    # known Cronbach's Alpha
    alpha <- NULL
    alpha <- list( "scale1" = .8 )
    alpha.se <- list( "scale1" = .05 )  # sample alpha with a standard deviation of .05

    #***
    # scale 2 (B1,...,B5)
    # means and SE's of scale scores are assumed to be known
    M.scale2 <- rowMeans( dat[ , paste("B",1:5,sep="") ] )
    # M.scale2[ is.na( m1) ] <- mean( M.scale2 , na.rm=TRUE )
    SE.scale2 <- rep( sqrt( stats::var(M.scale2,na.rm=TRUE)*(1-.8) ) , nrow(dat) )
    #=> heterogeneous measurement errors are allowed
    scale.values <- list( "scale2" = list( "M" = M.scale2 , "SE" = SE.scale2 ) )

    #*** Imputation Model 1: Imputation using four parallel chains
    imp1 <- mice::mice( dat , predictorMatrix = predM , m = 4, maxit = 5 ,
                alpha.se = alpha.se , imputationMethod = impMethod , allow.na = TRUE ,
                alpha = alpha, scale.values = scale.values )
    summary(imp1)

    # extract first imputed dataset (from imp1, not from the empty imputation imp)
    dat11 <- mice::complete( imp1 , 1 )

    #*** Imputation Model 2: Imputation using one long chain
    imp2 <- miceadds::mice.1chain( dat , predictorMatrix = predM , burnin=10 , iter=20 ,
                Nimp=4 , alpha.se = alpha.se , imputationMethod = impMethod ,
                allow.na = TRUE , alpha = alpha, scale.values = scale.values )
    summary(imp2)

    #-------------
    #*** Imputation Model 3: Imputation including group level variables

    # use group indicator for plausible value estimation
    predM[ "scale1" , "group" ] <- -2
    # V7 and B1 should be aggregated at the group level
    predM[ "scale1" , c("V7","B1") ] <- 2
    predM[ "scale2" , "group" ] <- -2
    predM[ "scale2" , c("V7","A1") ] <- 2

    # perform single imputation (m=1)
    imp <- mice::mice( dat , predictorMatrix = predM , m = 1 , maxit=10 ,
                imputationMethod = impMethod , allow.na = TRUE ,
                alpha = alpha, scale.values = scale.values )
    dat10 <- mice::complete(imp)

    # multilevel model
    library(lme4)
    mod <- lme4::lmer( scale1 ~ ( 1 | group) , data = dat11 )
    summary(mod)
    mod <- lme4::lmer( scale1 ~ ( 1 | group) , data = dat10)
    summary(mod)

    #############################################################################
    # SIMULATED EXAMPLE 2: Plausible value imputation with chained equations
    #############################################################################

    # - simulate a latent variable theta and dichotomous item responses
    # - two covariates X in which the second covariate has measurement error

    library(sirt)
    library(TAM)
    library(lavaan)

    set.seed(7756)
    N <- 2000    # number of persons
    I <- 10      # number of items

    # simulate covariates
    X <- MASS::mvrnorm( N , mu=c(0,0) , Sigma = matrix( c(1,.5,.5,1) ,2 ,2 ) )
    colnames(X) <- paste0("X",1:2)
    # second covariate with measurement error with variance var.err
    var.err <- .3
    X.err <- X
    X.err[,2] <- X[,2] + stats::rnorm(N, sd = sqrt(var.err) )
    # simulate theta
    theta <- .5*X[,1] + .4*X[,2] + stats::rnorm( N , sd = .5 )
    # simulate item responses
    itemdiff <- seq( -2 , 2 , length=I)  # item difficulties
    dat <- sirt::sim.raschtype( theta , b = itemdiff )

    #***********************
    #*** Model 0: Regression model with true variables
    mod0 <- stats::lm( theta ~ X )
    summary(mod0)

    #**********************
    # plausible value imputation for abilities and error-prone
    # covariates using the mice package

    # creating the likelihood for plausible value for abilities
    mod11 <- TAM::tam.mml( dat )
    likePV <- IRT.likelihood(mod11)
    # creating the likelihood for error-prone covariate X2
    # The known measurement error variance is 0.3.
    lavmodel <- "
      X2true =~ 1*X2
      X2 ~~ 0.3*X2
        "
    mod12 <- lavaan::cfa( lavmodel , data = as.data.frame(X.err) )
    summary(mod12)
    likeX2 <- TAM::IRTLikelihood.cfa( data= X.err , cfaobj=mod12)
    str(likeX2)

    #-- create data input for mice package
    data <- data.frame( "PVA" = NA , "X1" = X[,1] , "X2" = NA )
    vars <- colnames(data)
    V <- length(vars)
    predictorMatrix <- 1 - diag(V)
    rownames(predictorMatrix) <- colnames(predictorMatrix) <- vars
    imputationMethod <- rep("norm" , V )
    names(imputationMethod) <- vars
    imputationMethod[c("PVA","X2")] <- "plausible.values"

    #-- create argument lists for plausible value imputation
    # likelihood and theta grid of plausible value derived from IRT model
    like <- list( "PVA" = likePV , "X2" = likeX2 )
    theta <- list( "PVA" = attr(likePV,"theta") , "X2" = attr(likeX2 , "theta") )

    #-- initial imputations
    data.init <- data
    data.init$PVA <- mod11$person$EAP
    data.init$X2 <- X.err[,"X2"]

    #-- imputation using the mice and miceadds package
    imp1 <- mice::mice( as.matrix(data) , predictorMatrix = predictorMatrix , m = 4,
                maxit = 6 , imputationMethod = imputationMethod , allow.na = TRUE ,
                theta=theta , like=like , data.init=data.init )
    summary(imp1)

    # compute linear regression
    mod4a <- with( imp1 , stats::lm( PVA ~ X1 + X2 ) )
    summary( mice::pool(mod4a) )

    ## End(Not run)
