rctglm | R Documentation |
The procedure uses plug-in estimation and influence functions to perform robust inference on any specified estimand in the setting of a randomised clinical trial, even when the effect of covariates is heterogeneous across randomisation groups. See "Powering RCTs for marginal effects with GLMs using prognostic score adjustment" by Højbjerre-Frandsen et al. (2025) for more details on the methodology.
rctglm(
  formula,
  exposure_indicator,
  exposure_prob,
  data,
  family = gaussian,
  estimand_fun = "ate",
  estimand_fun_deriv0 = NULL,
  estimand_fun_deriv1 = NULL,
  cv_variance = FALSE,
  cv_variance_folds = 10,
  verbose = options::opt("verbose"),
  ...
)
formula |
an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’ in the glm documentation. |
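exposure_indicator |
(name of) the binary variable in data that identifies the randomisation groups. |
exposure_prob |
a numeric giving the probability of allocation to the group where exposure_indicator is 1. |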
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which the function is called. |
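family |
a description of the error distribution and link function to be used in the model. As in glm, this can be a character string naming a family function, a family function, or the result of a call to a family function. |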
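estimand_fun |
a function with arguments psi1 and psi0 specifying the estimand, or one of the character strings "ate" and "rate_ratio" to use a default estimand function. |
estimand_fun_deriv0 |
a function specifying the derivative of estimand_fun with respect to psi0. If NULL (the default), the derivative is found by symbolic differentiation. |
estimand_fun_deriv1 |
a function specifying the derivative of estimand_fun with respect to psi1. If NULL (the default), the derivative is found by symbolic differentiation. |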
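cv_variance |
a logical; if TRUE, the variance is estimated using out-of-sample counterfactual predictions obtained by cross validation. |
cv_variance_folds |
a numeric giving the number of folds to use for cross validation when cv_variance is TRUE. |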
verbose |
|
... |
Additional arguments passed to stats::glm. |
The procedure assumes the setup of a randomised clinical trial with observations grouped by a binary exposure_indicator variable, allocated randomly with probability exposure_prob. A GLM is fit and then used to predict the response of all observations in the event that the exposure_indicator is 0 and 1, respectively. Taking the means of these predictions produces the counterfactual means psi0 and psi1, and an estimand r(psi0, psi1) is calculated using any specified estimand_fun.
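The plug-in step described above can be sketched in base R. This is a minimal illustration and not the package's internal code: the simulated data, the model formula and all variable names are assumptions made for the example.

```r
# Minimal base-R sketch of the plug-in step (illustrative, not the package internals)
set.seed(1)
n <- 200
exposure_prob <- 0.5

# Simulated trial data: one covariate X1 and a randomised binary exposure A
dat <- data.frame(
  X1 = rnorm(n),
  A  = rbinom(n, 1, exposure_prob)
)
dat$Y <- 1 + 1.5 * dat$X1 + 2 * dat$A + rnorm(n)

# Fit a GLM on the observed data
fit <- glm(Y ~ X1 + A, family = gaussian, data = dat)

# Predict the response of every observation under A = 0 and under A = 1
pred0 <- predict(fit, newdata = transform(dat, A = 0), type = "response")
pred1 <- predict(fit, newdata = transform(dat, A = 1), type = "response")

# Counterfactual means and the plug-in estimate of the estimand
psi0 <- mean(pred0)
psi1 <- mean(pred1)
estimand_fun <- function(psi1, psi0) psi1 - psi0  # same form as the "ate" default
estimate <- estimand_fun(psi1, psi0)
```

In this particular sketch the model is linear with an identity link and no interaction, so the plug-in estimate coincides with the fitted coefficient of A. With other families or with interactions the two generally differ, which is where averaging counterfactual predictions matters.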
The variance of the estimand is found by taking the variance of the influence function of the estimand.
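For orientation, one standard way to write such an influence function is given below. The notation is an assumption made for this note rather than a transcription of the package source: W denotes the covariates, \mu(a, W) the GLM prediction with the exposure set to a, \pi_1 is taken to be exposure_prob, and \pi_0 = 1 - \pi_1.

```latex
\begin{align*}
\varphi_a(O) &= \frac{\mathbf{1}\{A = a\}}{\pi_a}\,\bigl(Y - \mu(a, W)\bigr) + \mu(a, W) - \psi_a,
  \qquad a \in \{0, 1\}, \\
\varphi(O) &= \frac{\partial r}{\partial \psi_0}\,\varphi_0(O)
  + \frac{\partial r}{\partial \psi_1}\,\varphi_1(O), \\
\widehat{\operatorname{Var}}(\hat{r}) &= \frac{1}{n}\,
  \widehat{\operatorname{Var}}\bigl(\varphi(O_1), \dots, \varphi(O_n)\bigr).
\end{align*}
```

The second line is the delta method applied to r(psi0, psi1), which is why the derivatives of estimand_fun with respect to psi0 and psi1 are needed.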
If cv_variance is TRUE, then the counterfactual predictions for each observation (which are used to calculate the value of the influence function) are obtained as out-of-sample (OOS) predictions using cross validation, with the number of folds specified by cv_variance_folds. The cross validation splits are performed using stratified sampling with exposure_indicator as the strata argument in rsample::vfold_cv.
Read more in vignette("model-fit").
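The cross-validated variance can be sketched in base R as follows. Several things here are assumptions for illustration: the folds are built by hand instead of with rsample::vfold_cv, the influence function uses a common textbook form, exposure_prob is taken to be the probability of A = 1, and the data and variable names are invented. The package's own implementation may differ in detail.

```r
# Base-R sketch: stratified K-fold OOS predictions and an influence-function variance
set.seed(2)
n <- 200
exposure_prob <- 0.5  # assumed to be P(A = 1)
dat <- data.frame(X1 = rnorm(n), A = rbinom(n, 1, exposure_prob))
dat$Y <- 1 + 1.5 * dat$X1 + 2 * dat$A + rnorm(n)

# Stratified fold assignment: fold labels are shuffled within each exposure group
k <- 10
fold <- integer(n)
for (a in 0:1) {
  idx <- which(dat$A == a)
  fold[idx] <- sample(rep_len(seq_len(k), length(idx)))
}

# Out-of-sample counterfactual predictions: fit on k - 1 folds, predict the held-out fold
mu0 <- mu1 <- numeric(n)
for (j in seq_len(k)) {
  held_out <- fold == j
  fit_j <- glm(Y ~ X1 + A, family = gaussian, data = dat[!held_out, ])
  mu0[held_out] <- predict(fit_j, newdata = transform(dat[held_out, ], A = 0),
                           type = "response")
  mu1[held_out] <- predict(fit_j, newdata = transform(dat[held_out, ], A = 1),
                           type = "response")
}

# Plug-in counterfactual means from a fit on the full data
fit <- glm(Y ~ X1 + A, family = gaussian, data = dat)
psi0 <- mean(predict(fit, newdata = transform(dat, A = 0), type = "response"))
psi1 <- mean(predict(fit, newdata = transform(dat, A = 1), type = "response"))

# Influence functions of the counterfactual means (assumed standard form)
if0 <- (dat$A == 0) / (1 - exposure_prob) * (dat$Y - mu0) + mu0 - psi0
if1 <- (dat$A == 1) / exposure_prob * (dat$Y - mu1) + mu1 - psi1

# Delta method for r(psi0, psi1) = psi1 - psi0: the derivatives are -1 and 1
if_ate <- 1 * if1 + (-1) * if0
var_ate <- var(if_ate) / n
se_ate <- sqrt(var_ate)
```

Stratifying on the exposure keeps both groups represented in every fold, so each held-out fit sees a similar allocation ratio to the full sample.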
rctglm returns an object of class inheriting from "rctglm".
An object of class rctglm is a list containing the following components:
estimand: A data.frame with the plug-in estimate of the estimand, the standard error (SE) estimate and the variance estimate of the estimand
estimand_funs: A list with
  f: The estimand_fun used to obtain an estimate of the estimand from counterfactual means
  d0: The derivative with respect to psi0
  d1: The derivative with respect to psi1
means_counterfactual: A data.frame with counterfactual means psi0 and psi1
fitted.values_counterfactual: A data.frame with counterfactual mean values, obtained by transforming the linear predictors for each group by the inverse of the link function.
glm: A glm object returned from running stats::glm within the procedure
call: The matched call
As noted in the description, psi0 and psi1 are the counterfactual means found by prediction using a fitted GLM in the binary groups defined by exposure_indicator.
Default estimand functions can be specified via "ate" (which uses the function function(psi1, psi0) psi1-psi0) and "rate_ratio" (which uses the function function(psi1, psi0) psi1/psi0). See more information on specifying the estimand_fun in vignette("model-fit").
As a default, the Deriv package is used to perform symbolic differentiation to find the derivatives of the estimand_fun.
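To illustrate what symbolic differentiation of an estimand function yields, the sketch below uses base R's D() as a stand-in for the Deriv package, so it shows the idea and not the package's actual mechanism. The helper as_fun is hypothetical and exists only for this example.

```r
# Estimand written as an expression in psi1 and psi0 (same form as "rate_ratio")
rate_ratio <- quote(psi1 / psi0)

# Symbolic partial derivatives with respect to each counterfactual mean
d0_expr <- D(rate_ratio, "psi0")
d1_expr <- D(rate_ratio, "psi1")

# Hypothetical helper: wrap a derivative expression as a function of (psi1, psi0)
as_fun <- function(expr) {
  f <- function(psi1, psi0) NULL
  body(f) <- expr
  f
}
d0 <- as_fun(d0_expr)
d1 <- as_fun(d1_expr)

d0(psi1 = 0.6, psi0 = 0.3)  # evaluates -psi1 / psi0^2
d1(psi1 = 0.6, psi0 = 0.3)  # evaluates 1 / psi0
```

Functions of this shape are what the estimand_fun_deriv0 and estimand_fun_deriv1 arguments accept, which can be useful when an estimand cannot be differentiated symbolically.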
See how to extract information using methods in rctglm_methods.
Use rctglm_with_prognosticscore() to include prognostic covariate adjustment.
See vignettes
library(postcard)

# Generate some data to showcase example
n <- 100
exp_prob <- .5
dat_gaus <- glm_data(
  Y ~ 1+1.5*X1+2*A,
  X1 = rnorm(n),
  A = rbinom(n, 1, exp_prob),
  family = gaussian()
)

# Fit the model
ate <- rctglm(formula = Y ~ .,
              exposure_indicator = A,
              exposure_prob = exp_prob,
              data = dat_gaus,
              family = gaussian)

# Pull information on estimand
estimand(ate)

## Another example with different family and specification of estimand_fun
dat_binom <- glm_data(
  Y ~ 1+1.5*X1+2*A,
  X1 = rnorm(n),
  A = rbinom(n, 1, exp_prob),
  family = binomial()
)

# Use one of the default estimand functions by name
rr <- rctglm(formula = Y ~ .,
             exposure_indicator = A,
             exposure_prob = exp_prob,
             data = dat_binom,
             family = binomial(),
             estimand_fun = "rate_ratio")

# Use a custom estimand function of psi1 and psi0 (here the odds ratio)
odds_ratio <- function(psi1, psi0) (psi1*(1-psi0))/(psi0*(1-psi1))
or <- rctglm(formula = Y ~ .,
             exposure_indicator = A,
             exposure_prob = exp_prob,
             data = dat_binom,
             family = binomial,
             estimand_fun = odds_ratio)