process_missing: Preprocess Data to Handle Missing Variables
In jeremyrcoyle/tmle3: The Extensible TMLE Framework

process_missing

R Documentation

Preprocess Data to Handle Missing Variables

Description

Process data to account for missingness in preparation for TMLE.

Usage

process_missing(
  data,
  node_list,
  complete_nodes = c("A", "Y"),
  impute_nodes = NULL,
  max_p_missing = 0.5
)

Arguments

`data`	`data.table`, containing the variables with missing values
`node_list`	`list`, which variables comprise each node
`complete_nodes`	`character vector`, nodes that must be observed
`impute_nodes`	`character vector`, nodes that will be imputed
`max_p_missing`	`numeric`, the largest tolerable proportion of missing values; a variable missing more often than this will be dropped from the analysis
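
As a rough illustration of the `max_p_missing` threshold, the base R sketch below computes the proportion of missing values per column and flags the columns that exceed it. This is not the package's implementation: the toy data frame, the object names (`p_missing`, `too_sparse`, `retained`), and the strict `>` comparison are all assumptions made for the example.

```r
# Hypothetical toy data; column names are illustrative only.
df <- data.frame(
  W1 = c(1, NA, 3, 4),
  W2 = c(NA, NA, NA, 8),
  W3 = c(1, 2, 3, 4)
)
max_p_missing <- 0.5

# Proportion of missing values in each column.
p_missing <- colMeans(is.na(df))

# Columns missing more often than the threshold (strict ">" is an assumption).
too_sparse <- names(p_missing)[p_missing > max_p_missing]

# Keep only the columns that pass the threshold.
retained <- df[, setdiff(names(df), too_sparse), drop = FALSE]

print(p_missing)
print(too_sparse)
```

Here only `W2`, with three of four values missing, exceeds the default threshold of 0.5.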

Details

Rows with a missing value in any of the `complete_nodes` will be dropped. Missing values in the `impute_nodes` variables will then be median-imputed, and indicator variables of missingness will be generated for these nodes.
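
The steps described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the code `process_missing` runs: the toy data, the choice of `"W"` as the node to impute, and the `_missing` suffix for indicator columns (coded 1 where the value was missing) are all hypothetical.

```r
# Hypothetical toy data; column names are illustrative only.
df <- data.frame(
  W1 = c(1, NA, 3, 4, 5),
  W2 = c(NA, 2, 2, 8, 10),
  A  = c(0, 1, NA, 1, 0),
  Y  = c(1.5, 2.0, 2.5, NA, 3.5)
)
node_list <- list(W = c("W1", "W2"), A = "A", Y = "Y")
complete_nodes <- c("A", "Y")
impute_nodes <- "W"  # assumption for this example

# Step 1: drop rows with a missing value in any complete-node variable.
complete_vars <- unlist(node_list[complete_nodes], use.names = FALSE)
keep <- stats::complete.cases(df[, complete_vars, drop = FALSE])
processed <- df[keep, , drop = FALSE]

# Steps 2 and 3: median-impute each impute-node variable and record
# where values were missing (indicator naming is hypothetical).
impute_vars <- unlist(node_list[impute_nodes], use.names = FALSE)
for (v in impute_vars) {
  is_missing <- is.na(processed[[v]])
  med <- stats::median(processed[[v]], na.rm = TRUE)
  processed[[paste0(v, "_missing")]] <- as.numeric(is_missing)
  processed[[v]][is_missing] <- med
}

print(processed)
```

In this toy example, rows 3 and 4 are dropped because `A` or `Y` is missing, and the medians are computed on the remaining rows only.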

Then covariates will be processed as follows: