ps_trunc: Truncate (Winsorize) Propensity Scores

View source: R/ps_trunc.R

ps_trunc    R Documentation

Description

ps_trunc() clamps extreme propensity scores to a lower and an upper limit, replacing each out-of-range value with the nearest bound (a form of winsorizing). The limits are either supplied directly or derived from the data, depending on method. The result is a vector or matrix with the same length and dimensions as ps; no observations are removed. This contrasts with ps_trim(), which sets extreme values to NA (effectively removing those observations from analysis).

Usage

ps_trunc(
  ps,
  method = c("ps", "pctl", "cr"),
  lower = NULL,
  upper = NULL,
  .exposure = NULL,
  .focal_level = NULL,
  .reference_level = NULL,
  ...,
  .treated = NULL,
  .untreated = NULL
)

Arguments

ps

A numeric vector of propensity scores between 0 and 1 (binary exposures), or a matrix/data.frame where each column contains propensity scores for one level of a categorical exposure.

method

One of "ps", "pctl", or "cr":

  • "ps" (default): Truncate directly on propensity score values. Values outside [lower, upper] are set to the nearest bound. For categorical exposures, applies symmetric truncation using lower as the threshold (delta) and renormalizes rows to sum to 1.

  • "pctl": Truncate at quantiles of the propensity score distribution. The lower and upper arguments specify quantile probabilities. For categorical exposures, quantiles are computed across all columns.

  • "cr": Truncate to the common range of propensity scores across exposure groups (binary exposures only). Bounds are [min(ps[focal]), max(ps[reference])]. Requires .exposure.
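As a rough illustration of how the three methods arrive at their bounds, the sketch below computes each pair by hand in base R. It does not call the package. The simulated data, the variable names, and the use of quantile() with its default type are assumptions made for the example; the package's internal quantile calculation may differ.

```r
# Base R sketch only; nothing here calls the propensity package.
set.seed(1)
x  <- rnorm(100)
z  <- rbinom(100, 1, plogis(x))   # hypothetical binary exposure (1 = focal)
ps <- plogis(x)                   # hypothetical propensity scores

# method = "ps": bounds are supplied directly as score values
bounds_ps <- c(0.1, 0.9)

# method = "pctl": bounds are quantiles of the observed scores
# (assumption: default quantile type)
bounds_pctl <- quantile(ps, probs = c(0.05, 0.95), names = FALSE)

# method = "cr": bounds come from the exposure groups, as stated above:
# [min(ps[focal]), max(ps[reference])]
bounds_cr <- c(min(ps[z == 1]), max(ps[z == 0]))

rbind(ps = bounds_ps, pctl = bounds_pctl, cr = bounds_cr)
```

Only "cr" needs the exposure vector, which is why .exposure is required for that method.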

lower, upper

Bounds for truncation. Interpretation depends on method:

  • method = "ps": Propensity score values (defaults: 0.1 and 0.9). For categorical exposures, lower is the truncation threshold delta (default: 0.01); upper is ignored.

  • method = "pctl": Quantile probabilities (defaults: 0.05 and 0.95; categorical defaults: 0.01 and 0.99).

  • method = "cr": Ignored; bounds are determined by the data.

.exposure

An exposure vector. Required for method "cr" (binary exposure vector) and for categorical exposures (factor or character vector) with any method.

.focal_level

The value of .exposure representing the focal (treated) group. For binary exposures, defaults to the higher value. Required for wt_att() and wt_atu() with categorical exposures.

.reference_level

The value of .exposure representing the reference (control) group. Automatically detected if not supplied.

...

Additional arguments passed to methods.

.treated

[Deprecated] Use .focal_level instead.

.untreated

[Deprecated] Use .reference_level instead.

Details

Unlike ps_trim(), truncation preserves all observations. No NA values are introduced; out-of-range scores are replaced with the boundary value.
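The difference between truncating and trimming can be seen on a handful of made-up scores. This is a base R sketch of the two ideas, not the package's implementation; the scores and bounds are hypothetical.

```r
# Base R sketch; the scores and bounds are made up for illustration.
ps    <- c(0.02, 0.15, 0.50, 0.85, 0.97)
lower <- 0.1
upper <- 0.9

# Truncation: out-of-range scores are moved to the nearest bound
truncated <- pmin(pmax(ps, lower), upper)

# Trimming: out-of-range scores become NA
trimmed <- ifelse(ps < lower | ps > upper, NA_real_, ps)

truncated       # 0.10 0.15 0.50 0.85 0.90
trimmed         #   NA 0.15 0.50 0.85   NA
anyNA(truncated) # FALSE: every observation is still usable
```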

For binary exposures, each propensity score e_i is bounded:

  • If e_i < l, set e_i = l (the lower bound).

  • If e_i > u, set e_i = u (the upper bound).
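The two rules above can be sketched as a small helper that also records which positions were changed, similar in spirit to the truncated_idx metadata. The helper clamp_scores() is hypothetical and is not part of the package.

```r
# Hypothetical helper for illustration; not a function from the package.
clamp_scores <- function(e, l, u) {
  stopifnot(is.numeric(e), l < u)
  idx <- which(e < l | e > u)        # positions that will be modified
  list(
    ps            = pmin(pmax(e, l), u),  # apply both rules at once
    truncated_idx = idx
  )
}

e   <- c(0.03, 0.40, 0.95, 0.10, 0.90)
res <- clamp_scores(e, l = 0.1, u = 0.9)

res$ps             # 0.10 0.40 0.90 0.10 0.90
res$truncated_idx  # 1 3
```

Scores that sit exactly on a bound (the last two values here) are left alone and are not counted as truncated in this sketch.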

For categorical exposures, values below the threshold are set to the threshold and each row is renormalized to sum to 1.
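A minimal base R sketch of the categorical case follows, assuming a three-level exposure and a threshold of 0.01. The matrix and the name delta are made up for the example, and the package may handle edge cases differently.

```r
# Base R sketch; rows are observations, columns are exposure levels.
ps_mat <- rbind(
  c(0.005, 0.495, 0.500),
  c(0.300, 0.300, 0.400),
  c(0.980, 0.015, 0.005)
)
delta <- 0.01

# Step 1: raise every value below the threshold up to the threshold
floored <- ps_mat
floored[floored < delta] <- delta

# Step 2: renormalize so each row sums to 1 again
# (dividing a matrix by a vector of length nrow divides row-wise)
renorm <- floored / rowSums(floored)

rowSums(renorm)  # 1 1 1
```

In this naive version a floored value can end up slightly below delta after renormalization (0.01 / 1.005 in the first row), and rows with no small values are left unchanged.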

Arithmetic behavior: Arithmetic operations on ps_trunc objects return plain numeric vectors. Once propensity scores are transformed (e.g., into weights), the result is no longer a propensity score.

Combining behavior: Combining ps_trunc objects with c() requires matching truncation parameters. Mismatched parameters produce a warning and return a plain numeric vector.

Value

A ps_trunc object (a numeric vector for binary exposures, or a matrix for categorical exposures). Use ps_trunc_meta() to inspect metadata including method, lower_bound, upper_bound, and truncated_idx (positions of modified values).

References

Crump, R. K., Hotz, V. J., Imbens, G. W., & Mitnik, O. A. (2009). Dealing with limited overlap in estimation of average treatment effects. Biometrika, 96(1), 187–199.

Walker, A. M., Patrick, A. R., Lauer, M. S., et al. (2013). A tool for assessing the feasibility of comparative effectiveness research. Comparative Effectiveness Research, 3, 11–20.

See Also

ps_trim() for removing (rather than bounding) extreme values, ps_refit() for refitting the propensity model after trimming, is_ps_truncated(), is_unit_truncated(), ps_trunc_meta()

Examples

library(propensity)

set.seed(2)
n <- 200
x <- rnorm(n)
z <- rbinom(n, 1, plogis(1.2 * x))
fit <- glm(z ~ x, family = binomial)
ps <- predict(fit, type = "response")

# Truncate to [0.1, 0.9]
ps_t <- ps_trunc(ps, method = "ps", lower = 0.1, upper = 0.9)
ps_t

# Truncate at the 1st and 99th percentiles
ps_trunc(ps, method = "pctl", lower = 0.01, upper = 0.99)

# Use truncated scores to calculate weights
wt_ate(ps_t, .exposure = z)

# Inspect which observations were truncated
is_unit_truncated(ps_t)


propensity documentation built on March 3, 2026, 1:06 a.m.