simulate_missing: Simulate different mechanisms of missingness

View source: R/prepareData.R

simulate_missing    R Documentation

Simulate different mechanisms of missingness

Description

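Impose missingness on selected columns of a data frame under an MCAR, MAR, or MNAR mechanism, using a logistic regression model for the probability that each entry is missing.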

Usage

simulate_missing(
  data,
  miss_cols,
  ref_cols,
  pi,
  phis,
  phi_z,
  scheme,
  mechanism,
  sim_index = 1,
  fmodel = "S",
  Z = NULL
)

Arguments

data

Data frame to impose missingness on (N rows x P columns)

miss_cols

Columns to impose missingness on

ref_cols

Column(s) to use as covariates of missingness for MAR or MNAR. If scheme="UV" and mechanism="MAR", each element of ref_cols is used as the covariate for the corresponding element of miss_cols (the length of miss_cols must be smaller than the length of ref_cols).

pi

Proportion of entries that are missing (for all miss_cols)
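
One way a target proportion such as pi can be reached in a logistic missingness model is to solve numerically for the intercept. The following is a minimal sketch of that general technique in base R, not the package's own code; `calibrate_intercept`, `x`, and the standardized covariate are all hypothetical.

```r
# Hypothetical helper: find the intercept phi0 such that the average
# missingness probability under a logistic model equals `target`.
calibrate_intercept <- function(x, phi, target) {
  f <- function(phi0) mean(plogis(phi0 + phi * x)) - target
  uniroot(f, interval = c(-100, 100), tol = 1e-10)$root
}

set.seed(1)
x <- as.numeric(scale(rnorm(1000)))  # standardized covariate (assumption)
phi0 <- calibrate_intercept(x, phi = 5, target = 0.5)
probs <- plogis(phi0 + 5 * x)        # per-entry missingness probabilities
mean(probs)                          # close to the 0.5 target
```

The average probability matches the target, while individual entries can still have probabilities near 0 or 1 when the coefficient is large.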

phis

Coefficients of the covariates in the missingness model. For scheme="UV", each element is the coefficient of the corresponding element of miss_cols if mechanism="MNAR", and of the corresponding element of ref_cols if mechanism="MAR".

phi_z

Coefficient of Z in missingness model, only used if fmodel="PM"

scheme

"UV" or "MV". "UV" uses one covariate in the missingness model of each miss_cols element: with mechanism="MAR", the corresponding element of ref_cols; with mechanism="MNAR", the miss_cols element itself. "MV" uses all ref_cols in each missingness model.
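
The difference between the two schemes can be illustrated with the linear predictors of the missingness models. This is a sketch under assumptions, not the package implementation: the toy matrix `X`, the omitted intercept, and the use of one coefficient per reference column in the "MV" case are all illustrative choices.

```r
set.seed(7)
X <- matrix(rnorm(20 * 4), nrow = 20, ncol = 4)  # toy data, 20 rows x 4 columns
miss_cols <- c(1, 2)
ref_cols  <- c(3, 4)
phis <- c(5, 5)

# "UV" with MAR: the j-th missingness model uses only the j-th reference column
eta_uv <- sapply(seq_along(miss_cols), function(j) phis[j] * X[, ref_cols[j]])

# "MV": every missingness model uses all reference columns at once
eta_mv <- sapply(seq_along(miss_cols), function(j) as.numeric(X[, ref_cols] %*% phis))

dim(eta_uv)  # one column of linear predictors per element of miss_cols
```

In this sketch the "MV" predictors are the same for every element of miss_cols, since each model sees the same covariates and coefficients.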

mechanism

"MCAR", "MAR", or "MNAR" missingness. Only pertinent for fmodel="S".
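
The three mechanisms differ in what the missingness probability depends on. The base R sketch below shows the distinction for a single column; the variable names, the zero intercept, and the coding of 1 as "missing" are assumptions made for illustration only.

```r
set.seed(42)
n <- 500
x_miss <- rnorm(n)  # column that will receive missing values
x_ref  <- rnorm(n)  # fully observed reference column
phi <- 5

# Linear predictor (intercept fixed at 0 here) under each mechanism
eta <- list(
  MCAR = rep(0, n),     # no dependence on the data
  MAR  = phi * x_ref,   # depends on an observed column
  MNAR = phi * x_miss   # depends on the value that goes missing
)
probs <- lapply(eta, plogis)
masks <- lapply(probs, function(p) rbinom(n, 1, p))  # 1 = missing (assumed coding)
sapply(masks, mean)  # realized missing proportion per mechanism
```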

sim_index

Index of simulation run: varies seed based on sim_index and column index for each element of miss_cols for reproducibility.

fmodel

"S" or "PM" for selection model or pattern-mixture model. If "PM", mechanism will be MNAR regardless of specification, and latent matrix Z must be specified and will be used as the only covariate of all missingness models

Z

Matrix of simulated values of the latent variable

Value

A list of objects: Missing (N x P mask matrix), probs (N x P matrix of probabilities of each entry being missing), params (pertaining to simulations), mechanism (of missingness), scheme (univariate or multivariate logistic regression model), phi0s (intercepts for each missingness model), phis (coefficients for each covariate of each missingness model), phi_z (coefficient of Z, if fmodel="PM").
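
A mask matrix like Missing is typically applied to the data to blank out the affected entries. The sketch below uses a hand-built stand-in for the returned list rather than real output, and it assumes that 1 means observed and 0 means missing; check the actual coding before relying on it.

```r
# Toy stand-in for the returned list (not produced by simulate_missing)
set.seed(5)
data <- as.data.frame(matrix(rnorm(12), nrow = 4, ncol = 3))
res <- list(Missing = matrix(c(1, 0, 1, 1,
                               1, 1, 1, 1,
                               0, 1, 1, 0), nrow = 4))  # filled column by column

# Assumed coding: 1 = observed, 0 = missing. Flip the comparison if reversed.
data_miss <- data
data_miss[res$Missing == 0] <- NA

colMeans(is.na(data_miss))  # realized missing proportion per column
```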

Examples

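# Load the BANKNOTE dataset
data <- read_data("BANKNOTE", NULL, 1)$data
set.seed(111)
# Randomly pick half of the columns as covariates; impose missingness on the rest
ref_cols <- sample(seq_len(ncol(data)), ceiling(ncol(data) / 2), replace = FALSE)
miss_cols <- seq_len(ncol(data))[-ref_cols]
# MNAR with the univariate scheme: pi = 0.5, coefficient 5 for each miss_cols element
simulate_missing(data, miss_cols, ref_cols, 0.5, rep(5, length(miss_cols)), NULL, "UV", "MNAR")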

DavidKLim/NIMIWAE documentation built on Jan. 19, 2024, 11:18 p.m.