BMLMM_prediction: Bayesian Mixed Linear Models for Predicting Longitudinal Outcomes with DP Priors

View source: R/BMTrees_prediction.R

Bayesian Mixed Linear Models for Predicting Longitudinal Outcomes with DP Priors

Description

Provides predictions for outcomes in longitudinal data using Bayesian Mixed Linear Models (BMLMM). Unlike the tree-based variant, this function assumes a linear relationship for fixed effects while maintaining the flexible centralized Dirichlet Process (DP) framework for random effects and residuals. It predicts values for test data while accounting for complex error structures.
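
As a schematic of the model being fit (a sketch in our own notation, inferred from the description above and the model options listed under Arguments): for subject i at occasion j,

   Y_ij = X_ij' β + Z_ij' b_i + ε_ij,

where, under the default "BMTrees" option, both the random effects b_i and the residuals ε_ij receive centralized DP normal-mixture priors; the other model options replace one or both of these with normal priors.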

Usage

BMLMM_prediction(
  X_train,
  Y_train,
  Z_train,
  subject_id_train,
  X_test,
  Z_test,
  subject_id_test,
  model = c("BMTrees", "BMTrees_R", "BMTrees_RE", "mixedBART"),
  binary = FALSE,
  nburn = 3000L,
  npost = 4000L,
  skip = 1L,
  verbose = TRUE,
  seed = NULL,
  tol = 1e-20,
  add_intercept = TRUE
)

Arguments

X_train

A matrix of covariates in the training set.

Y_train

A numeric or logical vector of outcomes in the training set.

Z_train

A matrix of random-effects predictors in the training set.

subject_id_train

A character vector of subject IDs in the training set.

X_test

A matrix of covariates in the testing set.

Z_test

A matrix of random-effects predictors in the testing set.

subject_id_test

A character vector of subject IDs in the testing set.

model

A character string specifying the distribution assumptions for residuals and random effects. Options are:

  • "BMTrees" (default): DP priors for both residuals and random effects.

  • "BMTrees_R": DP prior for residuals, Normal prior for random effects.

  • "BMTrees_RE": Normal prior for residuals, DP prior for random effects.

  • "mixedBART": Normal priors for both residuals and random effects.

binary

Logical. Indicates whether the outcome is binary (TRUE) or continuous (FALSE). Default: FALSE.

nburn

An integer specifying the number of burn-in iterations for the Gibbs sampler. Default: 3000L.

npost

An integer specifying the number of posterior samples to collect. Default: 4000L.

skip

An integer indicating the thinning interval for MCMC samples. Default: 1L.

verbose

Logical. If TRUE, displays MCMC progress. If FALSE, shows a progress bar. Default: TRUE.

seed

An optional integer for setting the random seed to ensure reproducibility. Default: NULL.

tol

A numeric tolerance value to prevent numerical overflow and underflow in the model. Default: 1e-20.

add_intercept

Logical. If TRUE, adds a column of ones (intercept) to the covariate matrices X_train and X_test. Default: TRUE.
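
A hypothetical illustration (not taken from the package documentation) of how long-format longitudinal data might be mapped to these inputs; the data-frame column names used here are invented for the example.

df <- data.frame(
   id   = rep(sprintf("subj_%02d", 1:5), each = 4),  # 5 subjects, 4 visits each
   time = rep(0:3, times = 5),
   x1   = rnorm(20),
   y    = rnorm(20)
)
X_train          <- as.matrix(df[, c("time", "x1")])  # fixed-effect covariates
Z_train          <- cbind(1, df$time)                 # random intercept and slope
subject_id_train <- as.character(df$id)
Y_train          <- df$y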

Value

A list containing posterior samples and predictions:

post_beta

Posterior samples of the regression coefficients (fixed effects).

post_lmm_train

Posterior samples of the fixed-effects predictions (Xβ) on the training data.

post_Sigma

Posterior samples of the covariance matrices of the random effects.

post_lambda_G

Posterior samples of the lambda parameter in the DP normal mixture on the random errors (residuals).

post_lambda_F

Posterior samples of the lambda parameter in the DP normal mixture on the random effects.

post_B

Posterior samples of the random-effects coefficients.

post_random_effect_train

Posterior samples of random effects for training data.

post_sigma

Posterior samples of the error (residual) standard deviation.

post_expectation_y_train

Posterior expectations of the training outcomes, equal to the fixed effects plus the random effects.

post_expectation_y_test

Posterior expectations of the testing outcomes, equal to the fixed effects plus the random effects.

post_predictive_y_train

Posterior predictive distributions for the training outcomes, equal to the fixed effects plus the random effects plus a predictive residual.

post_predictive_y_test

Posterior predictive distributions for the testing outcomes, equal to the fixed effects plus the random effects plus a predictive residual.

post_eta

Posterior samples of the location parameters in the DP normal mixture on the random errors (residuals).

post_mu

Posterior samples of the location parameters in the DP normal mixture on the random effects.
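
A minimal post-processing sketch (not from the package documentation; the layout of the returned object is an assumption): if post_predictive_y_test of a fitted object (called model in the Examples below) stores one posterior draw per row and one test observation per column, point predictions and 95% credible intervals can be obtained as follows.

draws      <- model$post_predictive_y_test
point_pred <- colMeans(draws)                                     # posterior mean per test observation
cred_int   <- apply(draws, 2, quantile, probs = c(0.025, 0.975))  # 95% credible interval bounds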

Note

This function utilizes modified C++ code originally derived from the BART3 package (Bayesian Additive Regression Trees). The original package was developed by Rodney Sparapani and is licensed under GPL-2. Modifications were made by Jungang Zou, 2024.

References

For more information about the original BART3 package, see: https://github.com/rsparapa/bnptools/tree/master/BART3

Examples

# Simulate a continuous longitudinal dataset split into training and test sets
data <- simulation_prediction_conti(
   train_prop = 0.7,
   n_subject = 20,
   seed = 1,
   nonlinear = FALSE,
   residual = "normal",
   randeff = "MVN"
)
# Fit the model; nburn = 0L and npost = 1L keep this example fast (use larger values in practice)
model <- BMLMM_prediction(
   X_train = data$X_train,
   Y_train = data$Y_train,
   Z_train = data$Z_train,
   subject_id_train = data$subject_id_train,
   X_test = data$X_test,
   Z_test = data$Z_test,
   subject_id_test = data$subject_id_test,
   model = "BMTrees",
   binary = FALSE,
   nburn = 0L, npost = 1L, skip = 1L, verbose = FALSE, seed = 1
)
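
A hedged follow-up (not part of the original example): with realistic settings such as the defaults nburn = 3000L and npost = 4000L, the fixed-effect coefficients could be summarized from post_beta. The draw-per-row matrix layout assumed below may differ from the actual return structure.

# Posterior means and 95% credible intervals for each fixed-effect coefficient
beta_draws <- model$post_beta
round(cbind(mean  = colMeans(beta_draws),
            lower = apply(beta_draws, 2, quantile, probs = 0.025),
            upper = apply(beta_draws, 2, quantile, probs = 0.975)),
      3)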
