MNARargument: Function providing modified arguments for imputation of...


View source: R/MNARargument.R

Description

Imputation models for Missing Not At Random (MNAR) binary or continuous outcomes developed in this package use sample selection models. Inside the imputation model, it is necessary to specify a selection equation (i.e. the missing data mechanism) and an outcome equation. The latter could be the model of interest (i.e. the post-imputation analysis model).

MNARargument() adapts the mice() arguments:

  1. data: an indicator of missingness for the MNAR outcome is added

  2. method: the MNAR imputation model is specified for the MNAR outcome (varMNAR)

  3. predictorMatrix: modified to include the MNAR indicator of missingness in the imputation models of the other variables

Finally, two new arguments are provided: JointModelEq, which defines the selection and outcome equations of the sample selection model, and control, which is for internal use only.

The procedure is the following:

  1. Use generate_JointModelEq() to construct an empty matrix of variable names in which the selection and outcome equations can be specified

  2. Fill in this matrix according to the selection and outcome equations of the sample selection model

  3. Generate an object using the MNARargument() function

  4. Pass the five elements of the object returned by MNARargument() to the mice() function as arguments
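
The matrix filled in at step 2 can be pictured with base R alone. The sketch below is an assumption about its layout, inferred from the examples on this page (one row per variable of data, plus one "<varMNAR>_var_sel" and one "<varMNAR>_var_out" column). In practice generate_JointModelEq() builds the matrix; the toy data frame and the object names here are hypothetical.

```r
# Hypothetical toy data with an outcome Y and three covariates
toy <- data.frame(Y  = c(1.2, NA, 0.4),
                  X1 = c(0, 1, 0),
                  X2 = c(3, 2, 5),
                  X3 = c(1, 1, 0))

# Assumed layout: one row per variable, one column per equation, all zeros
eq <- matrix(0, nrow = ncol(toy), ncol = 2,
             dimnames = list(names(toy), c("Y_var_sel", "Y_var_out")))

# A 1 marks a variable as included in the corresponding equation
eq[c("X1", "X2", "X3"), "Y_var_sel"] <- 1  # selection equation
eq[c("X1", "X2"), "Y_var_out"] <- 1        # outcome equation

print(eq)
```

Here X3 enters only the selection equation, so the two equations differ.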

Usage

MNARargument(data, method = NULL, predictorMatrix = NULL, varMNAR, JointModelEq = NULL)

Arguments

data

The dataset as used in a classical mice() call, plus any additional variables needed by the MNAR imputation models.

method

The mice() method argument.

predictorMatrix

The mice() predictorMatrix argument.

varMNAR

The name of the MNAR outcome to be imputed.

JointModelEq

Matrix indicating which variables are included in the selection and outcome equations of the MNAR outcome imputation models.

Details

Be careful not to define identical selection and outcome equations for MNAR imputation models. To avoid collinearity issues, the sample selection model requires different sets of covariates in the selection equation and the outcome equation; these sets may or may not be nested. It has been recommended to include at least one additional variable in the selection equation. This variable should be known not to be directly linked to the outcome.
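
As a minimal sketch of this recommendation, the base R code below lists the variables that appear in the selection equation but not in the outcome equation. The helper exclusion_vars() is hypothetical and not part of the package; the matrix reproduces the 0/1 pattern used for the hiv example on this page.

```r
# Hypothetical helper: variables in the selection equation only
exclusion_vars <- function(eq, sel_col, out_col) {
  rownames(eq)[eq[, sel_col] == 1 & eq[, out_col] == 0]
}

# Same 0/1 pattern as the hiv example (matrix() fills column by column)
eq <- matrix(c(0, 1, 1, 1, 1,
               0, 1, 1, 1, 0),
             ncol = 2,
             dimnames = list(c("hiv", "age", "marital", "condom", "smoke"),
                             c("hiv_var_sel", "hiv_var_out")))

excl <- exclusion_vars(eq, "hiv_var_sel", "hiv_var_out")
print(excl)

# Warn when no variable is specific to the selection equation
if (length(excl) == 0) {
  warning("no variable appears only in the selection equation")
}
```

Such a check only confirms that the two equations differ. Whether the extra variable is truly unrelated to the outcome remains a subject-matter judgement.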

Value

data_mod

Modified dataset including indicators of missingness for MNAR outcomes. Each indicator is named by prefixing "ind_" to the name of the MNAR outcome.

method

Modified mice() method argument using mice.impute.hecknorm() and mice.impute.heckprob() as imputation methods for continuous and binary outcomes, respectively.

predictorMatrix

Modified mice() predictorMatrix argument including the missingness indicators of MNAR outcomes as predictors for MAR covariates.

JointModelEq

For internal use: Modified JointModelEq entry argument.

control

For internal use: MNAR outcomes.
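
The naming of the indicator column in data_mod can be illustrated in base R. This is only a sketch: the "ind_" prefix comes from the description above, but the coding (1 = observed, 0 = missing) is an assumption and should be checked against the object actually returned by MNARargument(). The toy data frame is hypothetical.

```r
# Hypothetical toy data with missing values on the MNAR outcome
toy <- data.frame(hiv = c(0, NA, 1, NA),
                  age = c(23, 31, 45, 28))

varMNAR <- "hiv"
ind_name <- paste0("ind_", varMNAR)  # gives "ind_hiv"

# Assumed coding: 1 when the outcome is observed, 0 when it is missing
toy[[ind_name]] <- as.integer(!is.na(toy[[varMNAR]]))

print(toy)
```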

Warning

This package has only been validated for the imputation of an MNAR outcome. However, it is implemented to allow imputing several MNAR variables in the same process. Such use requires care.

Note

For an MNAR continuous outcome, Heckman's one-step estimator is selected by default. However, the two-step estimator is available through mice.impute.hecknorm2step(). To use it, modify the method argument before passing it to the mice() function.

Author(s)

Jacques-Emmanuel Galimard

References

Galimard, J.-E., Chevret, S., Curis, E., and Resche-Rigon, M. (2018). Heckman imputation models for binary or continuous MNAR missing outcomes and MAR missing predictors. BMC Medical Research Methodology (In press).

Galimard, J.-E., Chevret, S., Protopopescu, C., and Resche-Rigon, M. (2016). A multiple imputation approach for MNAR mechanisms compatible with Heckman's model. Statistics in Medicine, 35: 2907-2920. doi:10.1002/sim.6902.

See Also

mice, copulaSampleSel, SemiParBIV, hiv, selection, generate_JointModelEq

Examples

require(mice)      # mice(), with(), pool()
require(miceMNAR)  # generate_JointModelEq(), MNARargument()
require(GJRM)
require(mvtnorm)
require(pbivnorm)
require(sampleSelection)

# Import dataset with a suspected MNAR mechanism
data("hiv") 

# Select only one region (Lusaka, region 5) and 5 variables
lusuka <- hiv[hiv$region==5,c("hiv", "age", "marital", "condom", "smoke")]

# Categorical variables have to be recoded as factors
lusuka$hiv <- as.factor(lusuka$hiv)

#############################################
#### Missing data only on a binary outcome ##
#############################################

# Specify a selection equation (missing data mechanism) and an outcome equation (analysis model)

# Generate an empty matrix

JointModelEq <- generate_JointModelEq(data=lusuka,varMNAR = "hiv")

# Fill in with 1 for the variables included in each equation
JointModelEq[,"hiv_var_sel"] <- c(0,1,1,1,1)
JointModelEq[,"hiv_var_out"] <- c(0,1,1,1,0)

# Generate the arguments for the MNAR imputation model in the "mice()" function
arg <- MNARargument(data=lusuka,varMNAR="hiv",JointModelEq=JointModelEq)

# Imputation using mice() function
# The returned values have to be passed to the "mice()" function as arguments:

imputation <- mice(data = arg$data_mod,
                 method = arg$method,
                 predictorMatrix = arg$predictorMatrix,
                 JointModelEq=arg$JointModelEq,
                 control=arg$control,
                 maxit=1,m=5)

# Because data are missing on only one variable, maxit=1 is sufficient

# Estimation on each imputed dataset and pooling               
analysis <- with(imputation,glm(hiv~age+condom+marital,family=binomial(link="probit")))
result <- pool(analysis)
summary(result)

##########################################################
#### Missing data on a binary outcome and one covariate ##
##########################################################

# Generate missing values on the variable "condom" 
# According to a MAR mechanism using a probit model
prob <- pnorm((35.5-lusuka$age)/10.74) # Depending on "age"
lusuka$condom[rbinom(nrow(lusuka),size=1, prob=prob)==0] <- NA

JointModelEq <- generate_JointModelEq(data=lusuka,varMNAR = c("hiv"))
JointModelEq[,"hiv_var_sel"] <- c(0,1,1,1,1)
JointModelEq[,"hiv_var_out"] <- c(0,1,1,1,0)

arg <- MNARargument(data=lusuka,varMNAR=c("hiv"),JointModelEq=JointModelEq)

## Not run:
# Imputation using mice function
imputation <- mice(data = arg$data_mod,
                 method = arg$method,
                 predictorMatrix = arg$predictorMatrix,
                 JointModelEq=arg$JointModelEq,
                 control=arg$control, 
                 maxit=5,m=5)

# As usual, estimation on each imputed dataset and pooling
analysis <- with(imputation,glm(hiv~age+condom+marital,family=binomial(link="probit")))
result <- pool(analysis)
summary(result)
## End(Not run)

#################################################
#### Missing data only on a continuous outcome ##
#################################################

# Generation of a simulated dataset with MNAR mechanism on a continuous outcome

X1 <- rnorm(500,0,1)
X2 <- rbinom(500,1,0.5)
X3 <- rnorm(500,1,0.5)
  
errors <- rmvnorm(500,mean=c(0,0),sigma=matrix(c(1,0.3,0.3,1),nrow=2,byrow=TRUE))

Y <- X1+X2+errors[,1]
Ry <- ifelse(0.66+1*X1-0.5*X2+X3+errors[,2]>0,1,0)

Y[Ry==0] <- NA
  
simul_data <- data.frame(Y,X1,X2,X3)

JointModelEq <- generate_JointModelEq(data=simul_data,varMNAR = "Y")

JointModelEq[,"Y_var_sel"] <- c(0,1,1,1)
JointModelEq[,"Y_var_out"] <- c(0,1,1,0)

arg <- MNARargument(data=simul_data,varMNAR="Y",JointModelEq=JointModelEq)

imputation2 <- mice(data = arg$data_mod,
                 method = arg$method,
                 predictorMatrix = arg$predictorMatrix,
                 JointModelEq=arg$JointModelEq,
                 control=arg$control,
                 maxit=1,m=5)

analysis2 <- with(imputation2,lm(Y~X1+X2+X3))  # use the imputations of the simulated data
result2 <- pool(analysis2)
summary(result2)

#############################
## Using 2-step estimation ##
#############################

arg <- MNARargument(data=simul_data,varMNAR="Y",JointModelEq=JointModelEq)
arg$method["Y"] <- "hecknorm2step"

## Not run:
imputation3 <- mice(data = arg$data_mod,
                 method = arg$method,
                 predictorMatrix = arg$predictorMatrix,
                 JointModelEq=arg$JointModelEq,
                 control=arg$control,
                 maxit=1,m=5)

analysis3 <- with(imputation3,lm(Y~X1+X2+X3))
result3 <- pool(analysis3)
summary(result3)
## End(Not run)

miceMNAR documentation built on May 2, 2019, 8:31 a.m.