knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 4 )
PSsurvival implements propensity score methods for causal inference with time-to-event outcomes. The package provides two main functions:
surveff(): Estimates counterfactual survival functions and survival differencesmarCoxph(): Estimates marginal hazard ratios via weighted Cox regressionBoth functions support binary and multiple treatment groups, multiple weighting schemes, and propensity score trimming.
library(PSsurvival)
The package includes two simulated datasets:
data(simdata_bin) # Binary treatment (A vs B) data(simdata_multi) # Four treatment groups (A, B, C, D) # Binary treatment data head(simdata_bin) table(simdata_bin$Z) # Multiple treatment data table(simdata_multi$Z)
Both datasets contain:
X1, X2, X3: Continuous covariatesB1, B2: Binary covariatesZ: Treatment grouptime: Follow-up time (range 0-20)event: Event indicator (1 = event, 0 = censored)surveff()Data and model specification:
data: Data frame containing treatment, outcome, and covariatesps_formula: Propensity score model (treatment ~ covariates)censoring_formula: Censoring model (Surv(time, event) ~ covariates)eval_times: Time points for evaluation (default: all unique event times)contrast_matrix: Matrix for pairwise comparisons (multiple groups only; column names must match treatment levels)Weighting and trimming:
estimand: Target estimand ("ATE", "ATT", or "overlap")att_group: Target group for ATT (required if estimand = "ATT")trim: Trimming method ("symmetric" or "asymmetric", default: no trimming)delta: Threshold for symmetric trimmingalpha: Percentile for asymmetric trimmingVariance estimation:
censoring_method: Method for censoring adjustment ("weibull" or "cox")variance_method: Variance estimation ("analytical" or "bootstrap")boot_level: Bootstrap sampling level ("full" or "strata")B: Number of bootstrap iterationsparallel, mc.cores, seed: Parallel computing and reproducibilityresult_bin <- surveff( data = simdata_bin, ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, censoring_formula = survival::Surv(time, event) ~ X1 + B1, estimand = "overlap", censoring_method = "weibull" ) print(result_bin, max.len = 3)
The ps_formula specifies the propensity score model (treatment ~ covariates). The censoring_formula models the censoring distribution within each treatment group.
For multiple groups, survival differences require a contrast_matrix specifying which comparisons to make. Each row represents one contrast with exactly two non-zero elements: 1 and -1. Column names must match treatment levels exactly.
# Define pairwise comparisons contrast_mat <- matrix( c(1, -1, 0, 0, # A vs B 1, 0, -1, 0, # A vs C 1, 0, 0, -1), # A vs D nrow = 3, byrow = TRUE ) colnames(contrast_mat) <- c("A", "B", "C", "D") # Must match treatment levels rownames(contrast_mat) <- c("A vs B", "A vs C", "A vs D") # row names are optional result_multi <- surveff( data = simdata_multi, ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, censoring_formula = survival::Surv(time, event) ~ X1 + B1, estimand = "ATE", contrast_matrix = contrast_mat, censoring_method = "cox", variance_method = "bootstrap", B = 50, seed = 123 ) print(result_multi, max.len = 3)
Without contrast_matrix, only group-specific survival functions are estimated (no differences in survivals).
Three estimands are supported and estimated by corresponding weighting methods:
att_group) using IPW with target-group specific tilting# ATE (default) surveff(..., estimand = "ATE") # ATT targeting group B surveff(..., estimand = "ATT", att_group = "B") # Overlap weights surveff(..., estimand = "overlap")
Trimming excludes observations with extreme propensity scores. Not available with overlap weights (which already downweight extremes).
delta or above 1-delta for any treatment# Symmetric trimming surveff(..., estimand = "ATE", trim = "symmetric", delta = 0.1) # Asymmetric trimming surveff(..., estimand = "ATE", trim = "asymmetric", alpha = 0.05)
Two methods are available for estimating censoring distributions:
"weibull" (default): Parametric Weibull AFT model. Efficient when censoring follows Weibull distribution. Allows analytical variance for binary treatment."cox": Semi-parametric Cox proportional hazards model. More flexible but requires bootstrap variance.# Weibull censoring model surveff(..., censoring_method = "weibull") # Cox censoring model (requires bootstrap) surveff(..., censoring_method = "cox", variance_method = "bootstrap", B = 200)
No censoring adjustment: To skip censoring adjustment entirely (e.g., when there is no censoring), use censoring_formula = Surv(time, event) ~ 0. In this case, all censoring scores are set to 1.
Two variance estimation methods are available:
"analytical": M-estimation theory-based variance. Only available for binary treatment with Weibull censoring. Fast and efficient when applicable."bootstrap": Resamples the entire estimation pipeline. Required for Cox censoring or multiple treatment groups. More computationally intensive but broadly applicable.# Analytical variance (binary + weibull only) surveff(..., censoring_method = "weibull", variance_method = "analytical") # Bootstrap variance surveff(..., variance_method = "bootstrap", B = 200)
Bootstrap sampling level (boot_level): Controls how bootstrap samples are drawn.
"full" (default): Resamples from entire dataset. Standard choice for observational studies."strata": Resamples within treatment groups, preserving group sizes. Useful when treatment assignment follows a stratified or fixed-ratio design.# Stratified bootstrap surveff(..., variance_method = "bootstrap", boot_level = "strata", B = 200)
The summary() method provides two output styles:
Printed output (style = "prints", default): Displays formatted tables with confidence intervals. Customize with:
conf_level: Confidence level (default 0.95)max.len: Maximum rows to displayround.digits: Number of decimal placesProgrammatic output (style = "returns"): Returns a list of matrices for further processing:
survival_summary: List of survival estimates by groupdifference_summary: List of treatment effect estimatessummary(result_bin, conf_level = 0.95, max.len = 5)
For programmatic access:
summ <- summary(result_bin, style = "returns") names(summ) head(summ$survival_summary$A) head(summ$difference_summary$`B vs A`)
The plot() method produces survival curves (type = "surv") or treatment effect curves (type = "survdiff").
# Survival curves for all groups plot(result_multi, type = "surv", curve_width = 0.5)
# Treatment effect curves plot(result_multi, type = "survdiff", curve_width = 0.5)
Selecting specific strata: Use strata_to_plot to display a subset of groups or contrasts.
# Only groups A and C plot(result_multi, type = "surv", curve_width = 0.5, strata_to_plot = c("A", "C"))
# Only specific contrasts (names must match contrast_matrix rownames exactly) plot(result_multi, type = "survdiff", curve_width = 0.5, strata_to_plot = c("A vs B", "A vs D"))
Customization options:
strata_to_plot: Vector of strata names to displaystrata_colors: Custom colors for each stratum (length must match strata_to_plot)max_time: Maximum time on x-axisinclude_CI: Show confidence interval ribbons (TRUE/FALSE)conf_level: Confidence level for intervalscurve_width: Line width for curvesCI_alpha: Transparency of CI ribbons (0-1)legend_position: "right" or "bottom"legend_title, plot_title: Custom titlesplot(result_multi, type = "surv", curve_width = 0.5, strata_to_plot = c("A", "B"), strata_colors = c("steelblue", "coral"), max_time = 15, include_CI = TRUE, legend_position = "bottom", plot_title = "Survival by Treatment Group")
marCoxph()marCoxph() fits a weighted Cox model to estimate marginal hazard ratios.
Data and model specification:
data: Data frame containing treatment, outcome, and covariatesps_formula: Propensity score model (treatment ~ covariates)time_var: Name of time-to-event variable (character string)event_var: Name of event indicator variable (character string, 1=event, 0=censored)reference_level: Reference treatment level (MANDATORY)Weighting and trimming:
estimand: Target estimand ("ATE", "ATT", or "overlap")att_group: Target group for ATT (required if estimand = "ATT")trim: Trimming method ("symmetric" or "asymmetric", default: no trimming)delta: Threshold for symmetric trimmingalpha: Percentile for asymmetric trimmingVariance estimation:
variance_method: "bootstrap" (default) or "robust"boot_level: Bootstrap sampling level ("full" or "strata")B: Number of bootstrap iterationsparallel, mc.cores, seed: Parallel computing and reproducibilityrobust: Use robust variance in Cox model (default TRUE)hr_result <- marCoxph( data = simdata_bin, ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, time_var = "time", event_var = "event", reference_level = "A", estimand = "overlap" ) print(hr_result)
The reference_level argument is mandatory and specifies the baseline group for hazard ratio comparisons.
hr_multi <- marCoxph( data = simdata_multi, ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, time_var = "time", event_var = "event", reference_level = "A", estimand = "ATE", variance_method = "bootstrap", B = 50, seed = 456 ) print(hr_multi)
Same as surveff(): ATE, ATT, and overlap weights are supported. Trimming options (symmetric/asymmetric) are also available.
# ATE (default) marCoxph(..., estimand = "ATE") # ATT targeting specific group marCoxph(..., estimand = "ATT", att_group = "treated") # Overlap weights marCoxph(..., estimand = "overlap")
Propensity score trimming works identically to surveff():
# Symmetric trimming marCoxph(..., estimand = "ATE", trim = "symmetric", delta = 0.1) # Asymmetric trimming marCoxph(..., estimand = "ATE", trim = "asymmetric", alpha = 0.05)
Two variance estimation methods:
"bootstrap" (default): Resamples the entire analysis pipeline (propensity scores, weights, and Cox model)"robust": Uses sandwich variance from coxph() directly (faster, no resampling)# Bootstrap variance (default) marCoxph(..., variance_method = "bootstrap", B = 200) # Robust sandwich variance (no bootstrap) marCoxph(..., variance_method = "robust")
The boot_level argument ("full" or "strata") also applies here when using bootstrap variance, with the same interpretation as in surveff().
The summary() method provides detailed results with confidence intervals on both log and original scales:
summary(hr_multi, conf_level = 0.95)
Output styles:
style = "prints" (default): Displays formatted tables with log hazard ratios and exponentiated hazard ratiosstyle = "returns": Returns a list containing:logHR, logHR_CI_lower, logHR_CI_upper: Log-scale estimates and CIsHR, HR_CI_lower, HR_CI_upper: Original-scale estimates and CIsSE: Standard errors for log-scale hazard ratio (from specified variance method)n_per_group, events_per_group: Sample sizes and event countssumm_hr <- summary(hr_multi, style = "returns") names(summ_hr) summ_hr$HR # Exponentiated hazard ratios
| Feature | surveff() | marCoxph() |
|---------|-------------|--------------|
| Output | Survival curves, survival differences | Hazard ratios |
| Estimands | ATE, ATT, overlap | ATE, ATT, overlap |
| Censoring adjustment | Weibull or Cox | None (standard Cox) |
| Variance | Analytical or bootstrap | Robust or bootstrap |
| Trimming | Symmetric, asymmetric | Symmetric, asymmetric |
| Multiple groups | Yes (with contrast_matrix) | Yes |
Cheng, C., Li, F., Thomas, L. E., & Li, F. (2022). Addressing extreme propensity scores in estimating counterfactual survival functions via the overlap weights. American Journal of Epidemiology, 191(6), 1140-1151.
Li, F., & Li, F. (2019). Propensity score weighting for causal inference with multiple treatments. The Annals of Applied Statistics, 13(4), 2389-2415.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.