| PSPI_generalizability | R Documentation |
This is the main function of the PSPI package. It runs Bayesian models that generalize findings from a clinical trial to a target population, estimating average treatment effects and potential outcomes. Propensity scores of trial participation play a central role in generalizability analysis. When covariate shift is an issue, we recommend PSPI-SplineBART and PSPI-DSplineBART, which leverage Bayesian Additive Regression Trees (BART) to model high-dimensional covariates and propensity-score-based splines to extrapolate smoothly.
Users provide trial data (covariates, outcomes, treatment assignments, and propensity scores) along with population-level covariates and propensity scores. Propensity scores can be the true values or estimates from a model. The function then runs Markov chain Monte Carlo (MCMC) for posterior inference.
PSPI_generalizability(
X,
Y,
A,
pi,
X_pop,
pi_pop,
model,
transformation = "InvGumbel",
nburn = 4000,
npost = 4000,
n_knots_main = NULL,
n_knots_inter = NULL,
order_main = 3,
order_inter = 3,
ntrees_s = 200,
verbose = FALSE,
seed = NULL
)
X |
Matrix of covariates for the trial data. |
Y |
Numeric vector of observed outcomes in the trial. |
A |
Binary vector of treatment assignments (0 = control, 1 = intervention). |
pi |
Numeric vector of trial propensity scores (probability of trial participation). |
X_pop |
Matrix of covariates for the target population data. |
pi_pop |
Numeric vector of the target population propensity scores. |
model |
Character string specifying which PSPI model to use (see Details). |
transformation |
Character string indicating the transformation applied to the
propensity scores. Options are "Identity", "Logit", "Cloglog", and "InvGumbel" (default); see Details. |
nburn |
Number of burn-in iterations (default = 4000). |
npost |
Number of posterior iterations saved after burn-in (default = 4000). |
n_knots_main, n_knots_inter |
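Number of spline knots for main and interaction effects.
If NULL, default values are chosen automatically based on the cube root of the sample size (see Details). |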
order_main, order_inter |
Order of spline basis functions (default = 3). |
ntrees_s |
Number of trees used for the BART component (default = 200). |
verbose |
Logical; if TRUE, prints progress messages. |
seed |
Optional random seed for reproducibility. |
Model choices
The model argument selects the type of PSPI model to be fitted:
"BCF" – Bayesian Causal Forests (Hahn et al., 2020).
"BCF_P" – BCF with the propensity score as an additional predictor.
"FullBART" – Uses three BARTs to estimate treatment effects.
"SplineBART" – Incorporates a natural cubic spline for heterogeneous treatment effects.
"DSplineBART" – Adds another natural cubic spline for the prognostic score.
Propensity score transformations
Because splines are sensitive to the scale of their predictor, a robust transformation of the propensity scores is needed.
The propensity scores (pi for trial, pi_pop for population) can be
optionally transformed before modeling using one of the following:
"Identity" – uses the raw propensity scores directly (no transformation).
"Logit" – applies the logit transform: g(p) = \log(p / (1 - p)).
"Cloglog" – complementary log–log transform: g(p) = \log(-\log(1 - p)).
"InvGumbel" – inverse Gumbel transform: g(p) = -\log(-\log(p)). Default choice.
Users can experiment with different transformations to assess model sensitivity.
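The four transformations listed above can be written out directly from their formulas. The following is a minimal sketch for illustration only: `ps_transform` is a hypothetical helper, not a function exported by PSPI, and the package's internal implementation may differ.

```r
# Hypothetical helper (not part of PSPI): applies one of the four
# documented transformations to a vector of propensity scores in (0, 1).
ps_transform <- function(p,
                         transformation = c("InvGumbel", "Identity",
                                            "Logit", "Cloglog")) {
  transformation <- match.arg(transformation)
  switch(transformation,
    Identity  = p,                    # raw scores, no transformation
    Logit     = log(p / (1 - p)),     # g(p) = log(p / (1 - p))
    Cloglog   = log(-log(1 - p)),     # g(p) = log(-log(1 - p))
    InvGumbel = -log(-log(p))         # g(p) = -log(-log(p))
  )
}

# Compare the transformations on a few probabilities, including values
# near 0 and 1 where the scales differ the most.
p <- c(0.01, 0.1, 0.5, 0.9, 0.99)
out <- sapply(c("Identity", "Logit", "Cloglog", "InvGumbel"),
              function(tr) ps_transform(p, tr))
rownames(out) <- paste0("p=", p)
print(round(out, 3))
```

Printing the table side by side shows how differently each transformation stretches scores close to 0 or 1, which is one way to judge how sensitive a spline fit might be to the choice.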
Spline settings
Spline-based models ("SplineBART" and "DSplineBART") allow flexible
extrapolation to address covariate shift. The number and order of spline basis functions can be
customized through the following parameters:
n_knots_inter, order_inter: number of knots and order of the spline for
treatment-interaction effects. Available for both SplineBART and
DSplineBART.
n_knots_main, order_main: number of knots and order of the spline for
main effects. Available only for DSplineBART.
If n_knots_main or n_knots_inter is left as NULL, a default value is chosen automatically based
on the cube root of the sample size (ensuring a reasonable smoothness level).
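To illustrate the idea behind these settings, the sketch below builds a natural cubic spline basis on transformed propensity scores with the base R `splines` package. It is not the package's internal code: the rule `ceiling(n^(1/3))` and the quantile-based knot placement are assumptions chosen only to mirror the "cube root of the sample size" description, and the simulated scores are made up.

```r
library(splines)  # ships with base R

set.seed(1)
p <- runif(60, 0.05, 0.95)   # simulated trial propensity scores
z <- -log(-log(p))           # inverse Gumbel transform, as in the default

# Assumed rule of thumb: number of interior knots ~ cube root of n.
n_knots <- ceiling(length(z)^(1 / 3))
# Assumed placement: equally spaced quantiles of the transformed scores.
knots <- quantile(z, probs = seq_len(n_knots) / (n_knots + 1))

# Natural cubic spline basis evaluated at the trial scores.
B <- ns(z, knots = knots)
dim(B)

# Evaluate the same basis beyond the observed range, as would be needed
# for population units whose scores fall outside the trial's support.
z_out <- max(z) + c(1, 2, 3)
B_out <- predict(B, z_out)
```

A natural cubic spline is linear beyond its boundary knots, so the basis values at `z_out` change by a constant amount per unit step. This is the property that makes extrapolation smooth rather than erratic.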
A list containing posterior samples and model summaries produced by the C++ sampler. Typical elements include:
Each row is a posterior draw of the individual potential outcomes under treatment.
Each row is a posterior draw of the individual potential outcomes under control.
Each row is a posterior draw of the individual treatment effects.
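Because each row of these matrices is one posterior draw, a population-level summary can be obtained by averaging within each row and then summarizing across rows. The sketch below uses a simulated matrix, `te_draws`, as a stand-in for the treatment-effect draws; the name is hypothetical, and the assumption that columns index individuals in the target population should be checked against the actual output (for example with `str(fit)`).

```r
set.seed(7)
# Stand-in for posterior draws of individual treatment effects:
# 200 draws (rows) for 50 individuals (columns). Simulated, not PSPI output.
te_draws <- matrix(rnorm(200 * 50, mean = 1, sd = 0.5),
                   nrow = 200, ncol = 50)

# One average treatment effect per posterior draw.
ate_draws <- rowMeans(te_draws)

# Posterior mean and a 95% equal-tailed credible interval.
ate_summary <- c(mean = mean(ate_draws),
                 quantile(ate_draws, probs = c(0.025, 0.975)))
print(ate_summary)
```

The same pattern applies to the potential-outcome matrices, giving posterior summaries of the average outcome under treatment and under control.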
This function utilizes modified C++ code originally derived from the BART3 package (Bayesian Additive Regression Trees). The original package was developed by Rodney Sparapani and is licensed under GPL-2. Modifications were made by Jungang Zou, 2024. For more information about the original BART3 package, see: https://github.com/rsparapa/bnptools/tree/master/BART3
# Example with simulated data
sim <- sim_data(scenario = "linear", n_trial = 60)
fit <- PSPI_generalizability(
  X = as.matrix(sim$trials[, paste0("X", 1:10)]),
  Y = sim$trials$Y,
  A = sim$trials$A,
  # Propensity scores of the population units selected into the trial
  pi = sim$population$ps[sim$population$selected],
  X_pop = as.matrix(sim$population[, paste0("X", 1:10)]),
  pi_pop = sim$population$ps,
  model = "SplineBART",
  transformation = "InvGumbel",
  verbose = FALSE,
  # Very short chains for a quick demonstration;
  # the defaults are nburn = 4000 and npost = 4000
  nburn = 1, npost = 1
)
str(fit)
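When the true propensity scores are unknown, they can be estimated from a model, as noted in the description. The sketch below shows one common option, a logistic regression of the trial-selection indicator on covariates using base R `glm`. It is self-contained and does not use PSPI; the simulated data, the variable names, and the choice of logistic regression are assumptions for illustration, and any suitable participation model could be used instead.

```r
set.seed(42)
n_pop <- 500
# Simulated target population with two covariates.
X_pop <- matrix(rnorm(n_pop * 2), ncol = 2,
                dimnames = list(NULL, c("X1", "X2")))

# Simulated trial participation that depends on X1.
true_ps  <- plogis(-2 + 0.8 * X_pop[, "X1"])
selected <- rbinom(n_pop, 1, true_ps) == 1

# Logistic regression of the selection indicator on the covariates.
dat    <- data.frame(X_pop, S = as.integer(selected))
ps_fit <- glm(S ~ X1 + X2, data = dat, family = binomial())

# Estimated propensity scores for the population and for the trial subset,
# in the roles played by pi_pop and pi in the call above.
pi_pop <- as.numeric(fitted(ps_fit))
pi     <- pi_pop[selected]
summary(pi_pop)
```

The resulting `pi_pop` and `pi` vectors line up with the rows of the population and trial covariate matrices respectively, which is the alignment the function's arguments expect.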