View source: R/sequential_imputation.R
| sequential_imputation | R Documentation |
Implements sequential imputation for missing covariates and outcomes in longitudinal data. The function uses a Bayesian non-parametric framework with mixed-effects models to handle both normal and non-normal random effects and errors. It sequentially imputes missing values by constructing univariate models in a fixed order, initializing with LOCF/NOCB, and ensuring consistency with a valid joint distribution.
sequential_imputation(
X,
Y,
Z = NULL,
subject_id,
type,
binary_outcome = FALSE,
model = c("BMTrees", "BMTrees_R", "BMTrees_RE", "mixedBART"),
outcome_model = c("BMTrees", "BMLM"),
nburn = 0L,
npost = 3L,
skip = 1L,
verbose = TRUE,
seed = NULL,
tol = 1e-20,
k = 2,
ntrees = 200,
reordering = TRUE,
pi_DP = 0.99
)
X |
A matrix of missing covariates. |
Y |
A vector of missing outcomes (numeric or logical). |
Z |
A matrix of complete random predictors. Default: NULL. |
subject_id |
A vector of subject IDs corresponding to the rows of |
type |
A vector indicating whether each covariate in |
binary_outcome |
A logical value indicating whether the outcome is binary. Default: FALSE. |
model |
A character vector specifying the imputation model for the covariates. Options are "BMTrees", "BMTrees_R", "BMTrees_RE", and "mixedBART". |
outcome_model |
A character vector specifying the model used for the outcome. Options are "BMTrees" and "BMLM". |
nburn |
An integer specifying the number of burn-in iterations. Default: 0L. |
npost |
An integer specifying the number of sampling iterations. Default: 3L. |
skip |
An integer specifying the interval for keeping samples in the sampling phase. Default: 1L. |
verbose |
A logical value indicating whether to display progress and MCMC information. Default: TRUE. |
seed |
A random seed for reproducibility. Default: NULL. |
tol |
A small numerical tolerance to prevent numerical overflow or underflow in the model. Default: 1e-20. |
k |
A numeric value for the BART prior parameter controlling the standard deviation of the terminal node values. Default: 2. |
ntrees |
An integer specifying the number of trees in BART. Default: 200. |
reordering |
A logical value indicating whether to apply a reordering strategy for sorting covariates based on missingness. Default: TRUE. |
pi_DP |
A value between 0 and 1 for calculating the empirical prior in the DP prior. Default: 0.99. |
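As a rough illustration of what sorting covariates by missingness can look like, the sketch below orders the columns of a toy matrix from least to most missing. It only shows the general idea: the strategy actually applied when reordering = TRUE may differ, and the helper name order_by_missingness is hypothetical.

```r
# Hypothetical helper (not part of the package): order columns by the
# proportion of missing values, least missing first.
order_by_missingness <- function(X) {
  miss_rate <- colMeans(is.na(X))        # share of NA values per column
  X[, order(miss_rate), drop = FALSE]    # reorder columns, keep matrix shape
}

# Toy covariate matrix: column b is complete, c has one NA, a has two.
X_toy <- cbind(
  a = c(1, NA, NA, 4),
  b = c(1, 2, 3, 4),
  c = c(NA, 2, 3, 4)
)

X_sorted <- order_by_missingness(X_toy)
colnames(X_sorted)
```

Placing the most complete covariates first means that each later univariate model can condition on variables that needed less imputation.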
The function builds on the Bayesian Trees Mixed-Effects Model (BMTrees), which extends Mixed-Effects BART by using centralized Dirichlet Process Normal Mixture priors. This framework handles non-normal random effects and errors, addresses model misspecification, and captures complex relationships.
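In generic notation (the symbols are assumed here for illustration and are not taken from the package documentation), a mixed-effects tree model of this kind for subject i at visit j can be sketched as:

```latex
y_{ij} = f(x_{ij}) + z_{ij}^{\top} b_i + \varepsilon_{ij},
\qquad
f(x) = \sum_{t=1}^{T} g(x;\, \mathcal{T}_t, \mathcal{M}_t)
```

Here f is a sum of T regression trees as in BART, b_i is the random effect of subject i, and \varepsilon_{ij} is the error term. Following the description above, b_i and \varepsilon_{ij} would each receive a Dirichlet Process Normal Mixture prior centered at zero instead of a single normal distribution.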
The algorithm initializes missing values using Last Observation Carried Forward (LOCF) and Next Observation Carried Backward (NOCB) before starting the MCMC sequential imputation process.
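The initialization step can be sketched in base R as follows. This is a minimal illustration rather than the package's internal code: the helper names locf_nocb and init_by_subject are hypothetical, and filling within each subject separately is an assumption.

```r
# Hypothetical helper (not part of the package): fill NAs in one series by
# carrying the last observation forward, then the next observation backward.
locf_nocb <- function(x) {
  locf <- function(v) {
    idx <- cumsum(!is.na(v))             # how many values observed so far
    c(NA, v[!is.na(v)])[idx + 1]         # position 1 (NA) means none yet
  }
  x <- locf(x)                           # LOCF: interior and trailing gaps
  rev(locf(rev(x)))                      # NOCB: remaining leading gaps
}

# ASSUMPTION: values are filled within each subject, never across subjects.
init_by_subject <- function(x, subject_id) {
  ave(x, subject_id, FUN = locf_nocb)
}

x  <- c(NA, 2, NA, 4, 5, NA, NA, 7)
id <- c(1, 1, 1, 1, 2, 2, 2, 2)
init_by_subject(x, id)
```

These filled values are only starting points; the MCMC iterations then replace them with draws from the fitted models.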
A list containing:
imputed_data |
A three-dimensional array of imputed data with dimensions |
posterior_sigma |
(Only if |
posterior_beta |
(Only if |
This function utilizes modified C++ code originally derived from the BART3 package (Bayesian Additive Regression Trees). The original package was developed by Rodney Sparapani and is licensed under GPL-2. Modifications were made by Jungang Zou, 2024.
For more information about the original BART3 package, see: https://github.com/rsparapa/bnptools/tree/master/BART3
# Simulate example data for 10 subjects
data <- simulation_imputation(NNY = TRUE, NNX = TRUE, n_subject = 10, seed = 123)

# Run sequential imputation with no burn-in and one sampling iteration
BMTrees <- sequential_imputation(
  X = data$data_M[, 3:5], Y = data$data_M$Y, Z = data$Z,
  subject_id = data$data_M$subject_id, type = c(0, 0, 0),
  outcome_model = "BMLM", binary_outcome = FALSE, model = "BMTrees",
  nburn = 0, npost = 1, skip = 1, verbose = FALSE, seed = 123
)
# Access imputed data
dim(BMTrees$imputed_data)
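The layout of imputed_data is not spelled out above, so the following sketch uses a toy array and assumes, purely for illustration, that the first dimension indexes posterior draws, the second indexes rows, and the third indexes variables. Check dim() on the real output before reusing any of it.

```r
# Toy stand-in for an imputed array: 3 draws x 4 rows x 2 variables.
# ASSUMPTION: draws sit on the first dimension; verify with dim() on real output.
set.seed(1)
toy_imputed <- array(rnorm(3 * 4 * 2), dim = c(3, 4, 2))

# One completed data set: a single draw as a rows-by-variables matrix.
first_draw <- toy_imputed[1, , ]

# Average over draws for each row/variable cell.
cell_means <- apply(toy_imputed, c(2, 3), mean)

dim(first_draw)
dim(cell_means)
```

For multiple-imputation analyses, each draw is usually analyzed separately and the results pooled, rather than analyzing the averaged values.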