Imputation models for Missing Not At Random (MNAR) binary or continuous outcomes developed in this package use sample selection models. The imputation model requires a selection equation (i.e. the missing data mechanism) and an outcome equation. The outcome equation can be the model of interest (i.e. the post-imputation analysis model).
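For orientation, a sample selection (Heckman-type) model for a continuous outcome is commonly written as below. The notation ($R_i$, $Z_i$, $X_i$, $\gamma$, $\beta$, $\rho$, $\sigma$) is introduced here for illustration only and is not taken from the package documentation.

```latex
\begin{aligned}
R_i^{*} &= Z_i^{\top}\gamma + \varepsilon_i^{s}, \qquad R_i = \mathbb{1}\{R_i^{*} > 0\}
  && \text{(selection equation)}\\
Y_i &= X_i^{\top}\beta + \varepsilon_i^{o}, \qquad Y_i \text{ observed only if } R_i = 1
  && \text{(outcome equation)}\\
\begin{pmatrix}\varepsilon_i^{s}\\ \varepsilon_i^{o}\end{pmatrix}
  &\sim \mathcal{N}\!\left(
    \begin{pmatrix}0\\ 0\end{pmatrix},
    \begin{pmatrix}1 & \rho\sigma\\ \rho\sigma & \sigma^{2}\end{pmatrix}
  \right)
\end{aligned}
```

In this formulation, $\rho \neq 0$ means that missingness still depends on the outcome after conditioning on the covariates, which is the MNAR case. For a binary outcome, the outcome equation is typically replaced by a probit model on a latent $Y_i^{*}$ with $\sigma = 1$.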
MNARargument() adapts the following mice() arguments:
data: an indicator of missingness for the MNAR outcome is added.
method: the MNAR imputation model is specified for the MNAR outcome (varMNAR).
predictorMatrix: modified to include the MNAR indicator of missingness in the imputation models of the other variables.
Finally, two new arguments are provided: JointModelEq, defining the selection and outcome equations of the sample selection model; and control, which is for internal use only.
The procedure is as follows:
1. Use generate_JointModelEq() to construct an empty matrix of variable names in which the selection and outcome equations can be specified.
2. Fill in this matrix according to the selection and outcome equations of the sample selection model.
3. Generate an object using the MNARargument() function.
4. Pass the five elements of the object returned by MNARargument() as arguments to the mice() function.
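As a rough illustration of the first two steps of the procedure, the sketch below builds by hand a matrix with the layout suggested by the Examples section: one row per variable and, for an MNAR outcome `hiv`, one column for the selection equation (`hiv_var_sel`) and one for the outcome equation (`hiv_var_out`). It uses base R only. The row naming and the exact structure returned by generate_JointModelEq() are assumptions here, so this is a mock-up rather than the package's actual output.

```r
# Hypothetical mock-up of an "empty" equation matrix (base R only).
# Assumption: one row per variable of the dataset, two columns per MNAR outcome.
vars <- c("hiv", "age", "marital", "condom", "smoke")
mock_eq <- matrix(0, nrow = length(vars), ncol = 2,
                  dimnames = list(vars, c("hiv_var_sel", "hiv_var_out")))

# Flag with 1 the variables entering each equation.
# "smoke" enters the selection equation only.
mock_eq[, "hiv_var_sel"] <- c(0, 1, 1, 1, 1)
mock_eq[, "hiv_var_out"] <- c(0, 1, 1, 1, 0)

# Variables used in each equation
sel_vars <- rownames(mock_eq)[mock_eq[, "hiv_var_sel"] == 1]
out_vars <- rownames(mock_eq)[mock_eq[, "hiv_var_out"] == 1]
print(mock_eq)
```

Reading the columns back as variable names, as done for `sel_vars` and `out_vars`, is a quick way to confirm that the 0/1 vectors were entered in the intended order.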
MNARargument(data, method = NULL, predictorMatrix = NULL, varMNAR, JointModelEq = NULL)
data |
The dataset used for classical |
method |
The |
predictorMatrix |
The |
varMNAR |
The name of MNAR outcome to be imputed. |
JointModelEq |
Matrix indicating variables included in selection and outcome equations of MNAR outcome imputation models. |
Take care not to specify identical selection and outcome equations in the MNAR imputation model. The sample selection model requires the two equations to contain different sets of covariates, which may or may not be nested, in order to avoid collinearity issues. It is recommended to include at least one additional variable in the selection equation; this variable should be known not to be directly related to the outcome.
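The constraint described above can be checked before running the imputation. The following is a minimal sketch in base R, assuming an equation matrix with variables as row names and 0/1 entries; `check_exclusion()` is a hypothetical helper written for this illustration and is not provided by the package.

```r
# Hypothetical helper (not part of the package): checks that, for one MNAR
# outcome, the selection equation contains at least one variable that is
# absent from the outcome equation.
check_exclusion <- function(eq, sel_col, out_col) {
  sel_vars <- rownames(eq)[eq[, sel_col] == 1]
  out_vars <- rownames(eq)[eq[, out_col] == 1]
  extra <- setdiff(sel_vars, out_vars)
  list(ok = length(extra) > 0, selection_only = extra)
}

# Assumed layout: one row per variable, one column per equation
vars <- c("Y", "X1", "X2", "X3")
eq <- matrix(0, nrow = length(vars), ncol = 2,
             dimnames = list(vars, c("Y_var_sel", "Y_var_out")))
eq[, "Y_var_sel"] <- c(0, 1, 1, 1)
eq[, "Y_var_out"] <- c(0, 1, 1, 0)

# X3 appears only in the selection equation, so the check passes
res_ok <- check_exclusion(eq, "Y_var_sel", "Y_var_out")

# Identical equations: the check fails
eq_bad <- eq
eq_bad[, "Y_var_out"] <- eq_bad[, "Y_var_sel"]
res_bad <- check_exclusion(eq_bad, "Y_var_sel", "Y_var_out")
```

The check only verifies that the two variable sets differ in the required direction. Whether the extra selection variable is truly unrelated to the outcome remains a subject-matter judgement.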
data_mod |
Modified dataset including indicator of missingness for MNAR outcomes. Indicators of missingness are coded as "ind_" adding the name of MNAR outcomes. |
method |
Modified |
predictorMatrix |
Modified |
JointModelEq |
For internal use: Modified |
control |
For internal use: MNAR outcomes. |
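To make the naming of the missingness indicator concrete, here is a small base R sketch that adds an "ind_" column to a toy data frame. The toy data and the coding (1 when the outcome is observed, 0 when it is missing) are assumptions made for illustration; the package creates this column itself and its coding should be checked on the returned data_mod.

```r
# Hypothetical illustration in base R; MNARargument() builds this column itself.
# Assumption: the indicator equals 1 when the outcome is observed, 0 when missing.
toy <- data.frame(hiv = factor(c(0, 1, NA, 0, NA)),
                  age = c(23, 35, 41, 29, 52))

var_mnar <- "hiv"
ind_name <- paste0("ind_", var_mnar)   # "ind_" followed by the outcome name
toy[[ind_name]] <- as.integer(!is.na(toy[[var_mnar]]))
print(toy)
```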
This package has only been validated for the imputation of a single MNAR outcome. It is nevertheless implemented so that several MNAR variables can be imputed in the same process; such use should be undertaken with care.
For a continuous MNAR outcome, Heckman's one-step estimator is selected by default. The two-step estimator is also available through mice.impute.hecknorm2step(). To use it, modify the method argument before passing it to the mice() function.
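For readers unfamiliar with the two-step idea, the sketch below applies the textbook Heckman two-step estimator to simulated data using base R only: a probit model for selection, then least squares on the observed outcomes with the inverse Mills ratio as an extra regressor. It illustrates the estimator in general and is not a description of how mice.impute.hecknorm2step() is implemented; the simulation settings and all object names are assumptions chosen for this example.

```r
# Simulated data with an MNAR mechanism on a continuous outcome
set.seed(123)
n  <- 2000
X1 <- rnorm(n)
X2 <- rbinom(n, 1, 0.5)
X3 <- rnorm(n, 1, 0.5)          # enters the selection equation only
rho   <- 0.3                    # correlation between the two error terms
e_sel <- rnorm(n)
e_out <- rho * e_sel + sqrt(1 - rho^2) * rnorm(n)
Y  <- X1 + X2 + e_out
Ry <- as.integer(0.66 + X1 - 0.5 * X2 + X3 + e_sel > 0)
Y[Ry == 0] <- NA

# Step 1: probit model for the probability of being observed
sel_fit <- glm(Ry ~ X1 + X2 + X3, family = binomial(link = "probit"))
lp  <- predict(sel_fit, type = "link")
imr <- dnorm(lp) / pnorm(lp)    # inverse Mills ratio

# Step 2: least squares on observed outcomes, adding the inverse Mills ratio
d <- data.frame(Y, X1, X2, imr, Ry)
out_fit <- lm(Y ~ X1 + X2 + imr, data = d[d$Ry == 1, ])
est <- coef(out_fit)
print(est)
```

The coefficient on `imr` estimates the product of the error correlation and the outcome error standard deviation. The standard errors reported by lm() in the second step ignore the first-step estimation and would need correction in a real analysis.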
Jacques-Emmanuel Galimard
Galimard, J.-E., Chevret, S., Curis, E., and Resche-Rigon, M. (2018). Heckman imputation models for binary or continuous MNAR missing outcomes and MAR missing predictors. BMC Medical Research Methodology (In press).
Galimard, J.-E., Chevret, S., Protopopescu, C., and Resche-Rigon, M. (2016). A multiple imputation approach for MNAR mechanisms compatible with Heckman's model. Statistics in Medicine, 35: 2907-2920. doi:10.1002/sim.6902.
mice
copulaSampleSel
SemiParBIV
hiv
selection
generate_JointModelEq
require(mice)      # mice(), with() and pool()
require(miceMNAR)  # MNARargument() and generate_JointModelEq()
require(GJRM)
require(mvtnorm)
require(pbivnorm)
require(sampleSelection)
# Import dataset with a suspected MNAR mechanism
data("hiv")
# We select only one region (Lusaka) and 5 variables
lusuka <- hiv[hiv$region==5,c("hiv", "age", "marital", "condom", "smoke")]
# Categorical variables have to be recoded as factor
lusuka$hiv <- as.factor(lusuka$hiv)
#############################################
#### Missing data only on a binary outcome ##
#############################################
# Specify a selection (missing data mechanism) and an outcome equation (analysis model)
# Generate an empty matrix
JointModelEq <- generate_JointModelEq(data=lusuka,varMNAR = "hiv")
# Fill in with 1 for variable included in equations
JointModelEq[,"hiv_var_sel"] <- c(0,1,1,1,1)
JointModelEq[,"hiv_var_out"] <- c(0,1,1,1,0)
# Generation of argument for MNAR imputation model in "mice()" function
arg <- MNARargument(data=lusuka,varMNAR="hiv",JointModelEq=JointModelEq)
# Imputation using mice() function
# Values returned have to be included in the "mice()" function as argument:
imputation <- mice(data = arg$data_mod,
method = arg$method,
predictorMatrix = arg$predictorMatrix,
JointModelEq=arg$JointModelEq,
control=arg$control,
maxit=1,m=5)
# Missing data occur on one variable only, so maxit=1 is sufficient
# Estimation on each imputed dataset and pooling
analysis <- with(imputation,glm(hiv~age+condom+marital,family=binomial(link="probit")))
result <- pool(analysis)
summary(result)
##########################################################
#### Missing data on a binary outcome and one covariate ##
##########################################################
# Generate missing values on the variable "condom"
# According to a MAR mechanism using a probit model
prob <- pnorm((35.5-lusuka$age)/10.74) # Depending on "age"
lusuka$condom[rbinom(nrow(lusuka),size=1, prob=prob)==0] <- NA
JointModelEq <- generate_JointModelEq(data=lusuka,varMNAR = c("hiv"))
JointModelEq[,"hiv_var_sel"] <- c(0,1,1,1,1)
JointModelEq[,"hiv_var_out"] <- c(0,1,1,1,0)
arg <- MNARargument(data=lusuka,varMNAR=c("hiv"),JointModelEq=JointModelEq)
## Not run: # Imputation using mice function
imputation <- mice(data = arg$data_mod,
method = arg$method,
predictorMatrix = arg$predictorMatrix,
JointModelEq=arg$JointModelEq,
control=arg$control,
maxit=5,m=5)
# As usual, estimation on each imputed dataset and pooling
analysis <- with(imputation,glm(hiv~age+condom+marital,family=binomial(link="probit")))
result <- pool(analysis)
summary(result)
## End(Not run)
#################################################
#### Missing data only on a continuous outcome ##
#################################################
# Generation of a simulated dataset with MNAR mechanism on a continuous outcome
X1 <- rnorm(500,0,1)
X2 <- rbinom(500,1,0.5)
X3 <- rnorm(500,1,0.5)
errors <- rmvnorm(500,mean=c(0,0),sigma=matrix(c(1,0.3,0.3,1),nrow=2,byrow=TRUE))
Y <- X1+X2+errors[,1]
Ry <- ifelse(0.66+1*X1-0.5*X2+X3+errors[,2]>0,1,0)
Y[Ry==0] <- NA
simul_data <- data.frame(Y,X1,X2,X3)
JointModelEq <- generate_JointModelEq(data=simul_data,varMNAR = "Y")
JointModelEq[,"Y_var_sel"] <- c(0,1,1,1)
JointModelEq[,"Y_var_out"] <- c(0,1,1,0)
arg <- MNARargument(data=simul_data,varMNAR="Y",JointModelEq=JointModelEq)
imputation2 <- mice(data = arg$data_mod,
method = arg$method,
predictorMatrix = arg$predictorMatrix,
JointModelEq=arg$JointModelEq,
control=arg$control,
maxit=1,m=5)
analysis2 <- with(imputation2,lm(Y~X1+X2+X3))
result2 <- pool(analysis2)
summary(result2)
#############################
## Using 2-step estimation ##
#############################
arg <- MNARargument(data=simul_data,varMNAR="Y",JointModelEq=JointModelEq)
arg$method["Y"] <- "hecknorm2step"
## Not run:
imputation3 <- mice(data = arg$data_mod,
method = arg$method,
predictorMatrix = arg$predictorMatrix,
JointModelEq=arg$JointModelEq,
control=arg$control,
maxit=1,m=5)
analysis3 <- with(imputation3,lm(Y~X1+X2+X3))
result3 <- pool(analysis3)
summary(result3)
## End(Not run)