Introduction to postcard"
In postcard: Estimating Marginal Effects with Prognostic Covariate Adjustment

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Setup-chunk to load the package, set a seed and turn off verbosity for the rendering of the vignette.

library(postcard)
withr::local_seed(1395878)
withr::local_options(list(postcard.verbose = 0))

postcard provides tools for accurately estimating marginal effects using plug-in estimation with GLMs, including increasing precision using prognostic covariate adjustment. See Powering RCTs for marginal effects with GLMs using prognostic score adjustment by Højbjerre-Frandsen et. al (2025).

Plug-in estimation of marginal effects and variance estimation using influence functions

The use of plug-in estimation and influence functions can help us obtain more accurate estimates. Coupled with prognostic covariate adjustment, we can increase the precision of our estimates and obtain a higher power with sacrificing control over the type I error rate.

Introductory examples on the use of rctglm() and rctglm_with_prognosticscore() functions are available here. For more details, see vignette("model-fit").

Simulating data for exploratory analyses

First, we simulate some data to be able to enable showcasing of the functionalities. For this we use the glm_data() function from the package, where the user can specify an expression alongside variables and a family of the response to then simulate a response from a GLM with linear predictor given by the expression provided.

n <- 1000
b0 <- 1
b1 <- 3
b2 <- 2

# Simulate data with a non-linear effect
dat_treat <- glm_data(
  Y ~ b0+b1*sin(W)^2+b2*A,
  W = runif(n, min = -2, max = 2),
  A = rbinom(n, 1, prob = 1/2),
  family = gaussian() # Default value
)

Fitting `rctglm()` without prognostic covariate adjustment

The rctglm() function estimates any specified estimand using plug-in estimation for randomised clinical trials and estimates the variance using the influence function of the marginal effect estimand.

The interface of rctglm() is similar to that of the stats::glm() function but with an added mandatory specification of

The randomisation variable in data, usually being the (name of) the treatment variable
The randomisation ratio - the probability of being allocated to group 1 (rather than 0)
- As a default, a ratio of 1's in data is used
An estimand function
- As a default, the function takes the average treatment effect (ATE) as the estimand

Thus, we can estimate the ATE by simply writing the below:

Note that as a default, verbose = 2, meaning that information about the algorithm is printed to the console. However, here we suppress this behavior. See more in vignette("model-fit").

ate <- rctglm(formula = Y ~ A * W,
              exposure_indicator = A,
              exposure_prob = 1/2,
              data = dat_treat,
              family = "gaussian") # Default value

This creates an rctglm object which prints as

ate

The structure of such an rctglm object is broken down in the Value section of the documentation in rctglm().

Methods available are estimand (or the shorthand est) which prints a data.frame with and estimate of the estimand and its standard error. A method for coef is also available to extract coefficients from the underlying glm fit.

est(ate)

See more info in the documentation page rctglm_methods().

Using prognostic covariate adjustment

The rctglm_with_prognosticscore() function uses the fit_best_learner() function to fit a prognostic model to historical data and then uses the prognostic model to predict \begin{align} \mathbb{E}[Y|X,A=0] \end{align}

for all observations in the current data set. These prognostic scores are then used as a covariate in the GLM when running rctglm().

Allowing the use of complex non-linear models to create such a prognostic score allows utilising information from potentially many variables, "catching" non-linear relationships and then using all this information in the GLM model using a single covariate adjustment.

We simulate some historical data to showcase the use of this function as well:

dat_notreat <- glm_data(
  Y ~ b0+b1*sin(W)^2,
  W = runif(n, min = -2, max = 2),
  family = gaussian # Default value
)

The call to rctglm_with_prognosticscore() is the same as to rctglm() but with an added specification of

(Historical) data to fit the prognostic model using fit_best_learner()
A formula used when fitting the prognostic model
- Default uses all covariates in the data.
(Optionally) number folds in cross validation and a list of learners for fitting the best learner

Thus, a simple call which estimates the average treatment effect, adjusting for a prognostic score, is seen below:

ate_prog <- rctglm_with_prognosticscore(
  formula = Y ~ A * W,
  exposure_indicator = A,
  exposure_prob = 1/2,
  data = dat_treat,
  family = gaussian(link = "identity"), # Default value
  data_hist = dat_notreat)

Quick results of the fit can be seen by printing the object:

ate_prog

It's evident that in this case where there is a non-linear relationship between the covariate we observe and the response, adjusting for the prognostic score reduces the standard error of our estimand approximation by quite a bit.

Investigating the prognostic model

Information on the prognostic model is available in the list element prognostic_info, which the method prog() can be used to extract. A breakdown of what this list includes, see the Value section of the rctglm_with_prognosticscore() documentation.

Power approximation

In cases of seeking to conduct new studies, sample size/power analyses are vital to the successful planning of such studies. Here, we present implementations in this package that take advantage of power approximation formulas to perform such analyses.

See a more detailed walkthrough of a use case in vignette("prospective-power").

For marginal effects

The method proposed in Powering RCTs for marginal effects with GLMs using prognostic score adjustment by Højbjerre-Frandsen et. al (2025), which can be used to estimate the power when estimating any marginal effect, is implemented in the function power_marginaleffect().

According to the conservative approach in the article, if wanting to conduct power analyses to figure out how many participants is needed for an upcoming trial, where you are planning to use prognostic covariate adjustment, predictions should be obtained from a discrete super learner identical to the one planned to use for generating prognostic scores when adjusting in the analysis when estimating the marginal effect.

Here we showcase a simple use of a glm(), but fx. fit_best_learner() could be used to fit a discrete super learner as the prediction model. Could also add steps to get out-of-sample (OOS) predictions (see examples).

pred_mod <- glm(Y ~ W + A, data = dat_treat)
preds <- predict(pred_mod, dat_treat)

power_marginaleffect(
  response = dat_treat$Y,
  predictions = preds,
  target_effect = 0.4,
  exposure_prob = 1/2
)

Specific to linear models

Estimating the assumed variance

Finding the assumed variance to use for your power analysis with an ANCOVA model can be done using the variance_ancova function, which estimates the term $\sigma^2(1-R^2)$ given a formula and data.

vanc <- variance_ancova(Y ~ A + W, data = dat_treat)
vanc

Finding the power (or sample size)

Functions power_gs() and power_nc() exist, which estimate the power given a sample size n using approximation formulas. The functions are the results of two different approximation formulas but behave exactly the same except for a mandatory specification of a df argument for the power_nc function, which gives the degrees of freedom in the t-distribution used.

Details about the formulas are available in the documentation

For the Guenther-Schouten approximation, the formula directly gives us a sample size as a function of the power, so getting the required sample size as a function of the power is available in the function samplesize_gs().

power_gs(variance = vanc, n = 100, ate = 0.8)
power_nc(variance = vanc, n = 100, df = 97, ate = 0.8)
samplesize_gs(variance = vanc, ate = 0.8, power = 0.9)

Any scripts or data that you put into this service are public.

postcard documentation built on April 12, 2025, 1:57 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

postcard
Estimating Marginal Effects with Prognostic Covariate Adjustment

Introduction to postcard"
In postcard: Estimating Marginal Effects with Prognostic Covariate Adjustment

Plug-in estimation of marginal effects and variance estimation using influence functions

Simulating data for exploratory analyses

Fitting `rctglm()` without prognostic covariate adjustment

Using prognostic covariate adjustment

Investigating the prognostic model

Power approximation

For marginal effects

Specific to linear models

Estimating the assumed variance

Finding the power (or sample size)

Try the postcard package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

postcard Estimating Marginal Effects with Prognostic Covariate Adjustment

Introduction to postcard" In postcard: Estimating Marginal Effects with Prognostic Covariate Adjustment

Plug-in estimation of marginal effects and variance estimation using influence functions

Simulating data for exploratory analyses

Fitting rctglm() without prognostic covariate adjustment

Using prognostic covariate adjustment

Investigating the prognostic model

Power approximation

For marginal effects

Specific to linear models

Estimating the assumed variance

Finding the power (or sample size)

Try the postcard package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

postcard
Estimating Marginal Effects with Prognostic Covariate Adjustment

Introduction to postcard"
In postcard: Estimating Marginal Effects with Prognostic Covariate Adjustment

Fitting `rctglm()` without prognostic covariate adjustment