standardize_glm: Get regression standardized estimates from a glm
In stdReg2: Regression Standardization for Causal Inference

Description

Fits a generalized linear model (glm) for the outcome and returns regression standardized estimates at the specified values of the exposure.

Usage

standardize_glm(
  formula,
  data,
  values,
  clusterid,
  matched_density_cases,
  matched_density_controls,
  matching_variable,
  p_population,
  case_control = FALSE,
  ci_level = 0.95,
  ci_type = "plain",
  contrasts = NULL,
  family = "gaussian",
  reference = NULL,
  transforms = NULL
)

Arguments

`formula`	The formula which is used to fit the model for the outcome.
`data`	The data frame containing the variables used in `formula`.
`values`	A named list or data.frame specifying the variables and values at which marginal means of the outcome will be estimated.
`clusterid`	An optional string containing the name of a cluster identification variable when data are clustered.
`matched_density_cases`	A function of the matching variable. The probability (or density) of the matched variable among the cases.
`matched_density_controls`	A function of the matching variable. The probability (or density) of the matched variable among the controls.
`matching_variable`	The matching variable extracted from the data set.
`p_population`	Specifies the incidence in the population when `case_control=TRUE`.
`case_control`	Whether the data comes from a case-control study.
`ci_level`	Coverage probability of confidence intervals.
`ci_type`	A string, indicating the type of confidence intervals. Either "plain", which gives untransformed intervals, or "log", which gives log-transformed intervals.
`contrasts`	A vector of contrasts in the following format: If set to `"difference"` or `"ratio"`, then `\psi(x)-\psi(x_0)` or `\psi(x) / \psi(x_0)` are constructed, where `x_0` is a reference level specified by the `reference` argument. Has to be `NULL` if no references are specified.
`family`	The family argument which is used to fit the glm model for the outcome.
`reference`	A vector of reference levels in the following format: If `contrasts` is not `NULL`, the desired reference level(s). This must be a vector or list the same length as `contrasts`, and if not named, it is assumed that the order is as specified in contrasts.
`transforms`	A vector of transforms in the following format: If set to `"log"`, `"logit"`, or `"odds"`, the standardized mean `\theta(x)` is transformed into `\psi(x)=\log\{\theta(x)\}`, `\psi(x)=\log[\theta(x)/\{1-\theta(x)\}]`, or `\psi(x)=\theta(x)/\{1-\theta(x)\}`, respectively. If the vector is `NULL`, then `\psi(x)=\theta(x)`.
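As a rough illustration of how `transforms`, `contrasts`, and `reference` relate, the sketch below applies the formulas from the argument descriptions to two made-up standardized means. The numbers and the names `theta` and `transform_fns` are hypothetical, and this is base R arithmetic only; it is not how stdReg2 computes its results and it produces no standard errors.

```r
# Hypothetical standardized means theta(x_0) and theta(x); made-up numbers
theta <- c(x0 = 0.20, x = 0.35)

# The three named transforms, written out from the formulas above
transform_fns <- list(
  log   = function(theta) log(theta),
  logit = function(theta) log(theta / (1 - theta)),
  odds  = function(theta) theta / (1 - theta)
)

# psi(x) under transforms = "odds"
psi <- transform_fns$odds(theta)

# contrasts = "difference" and "ratio", with x_0 as the reference level
difference <- psi[["x"]] - psi[["x0"]]  # psi(x) - psi(x_0)
ratio      <- psi[["x"]] / psi[["x0"]]  # psi(x) / psi(x_0)

c(difference = difference, ratio = ratio)
```

With the `"odds"` transform, the ratio contrast is an odds ratio of the standardized means; with no transform, the same contrasts give a risk difference and a risk ratio.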

Details

standardize_glm performs regression standardization in generalized linear models, at specified values of the exposure, over the sample covariate distribution. Let Y, X, and Z be the outcome, the exposure, and a vector of covariates, respectively. standardize_glm uses a fitted generalized linear model to estimate the standardized mean \theta(x)=E\{E(Y|X=x,Z)\}, where x is a specific value of X, and the outer expectation is over the marginal distribution of Z.
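The point estimate described above can be sketched by hand in base R: fit the outcome model, set the exposure to x for every observation, predict, and average over the observed covariates. This is a minimal sketch under the same simulated setup as the first example on this page; the helper `standardized_mean` and the object names are hypothetical, and the sketch gives no standard errors or confidence intervals, which standardize_glm does provide.

```r
# Simulated data with a binary exposure X, one covariate Z, binary outcome Y
set.seed(6)
n <- 100
Z <- rnorm(n)
X <- cut(rnorm(n, mean = Z), breaks = c(-Inf, 0, Inf), labels = c("low", "high"))
Y <- rbinom(n, 1, prob = (1 + exp(as.numeric(X) + Z))^(-1))
dd <- data.frame(Z, X, Y)

# Step 1: fit the outcome model for E(Y | X, Z)
fit <- glm(Y ~ X * Z, family = "binomial", data = dd)

# Step 2: set X = x for everyone, predict on the response scale, and
# average over the observed Z (the sample covariate distribution)
standardized_mean <- function(fit, data, level) {
  counterfactual <- data
  counterfactual$X <- factor(level, levels = levels(data$X))
  mean(predict(fit, newdata = counterfactual, type = "response"))
}

theta_hat <- sapply(levels(dd$X), function(lv) standardized_mean(fit, dd, lv))
theta_hat
```

The averaging step is what distinguishes the standardized mean from a conditional prediction at a single covariate value: every observation contributes its own Z, so the result is marginal over Z.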

Value

An object of class std_glm. Use the tidy.std_glm function to obtain the numeric results as a data frame. The object is a list with the following components:

res_contrast

An unnamed list with one element for each of the requested contrasts. Each element is itself a list with the elements:

estimates: Estimated counterfactual means and standard errors for each exposure level
covariance: Estimated covariance matrix of counterfactual means
fit_outcome: The estimated regression model for the outcome
fit_exposure: The estimated exposure model
exposure_names: A character vector of the exposure variable names
est_table: Data.frame of the estimates of the contrast with inference
transform: The transform argument used for this contrast
contrast: The requested contrast type
reference: The reference level of the exposure
ci_type: Confidence interval type
ci_level: Confidence interval level

res

A named list with the elements:

estimates: Estimated counterfactual means and standard errors for each exposure level
covariance: Estimated covariance matrix of counterfactual means
fit_outcome: The estimated regression model for the outcome
fit_exposure: The estimated exposure model
exposure_names: A character vector of the exposure variable names
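The components listed above can be reached with ordinary list indexing. The sketch below assumes stdReg2 is installed (the whole block is skipped otherwise) and reuses the simulated data and call from the first example on this page; the object name `fit_std` is hypothetical, and the element names are taken from the description above rather than checked against a particular package version.

```r
if (requireNamespace("stdReg2", quietly = TRUE)) {
  set.seed(6)
  n <- 100
  Z <- rnorm(n)
  X <- cut(rnorm(n, mean = Z), breaks = c(-Inf, 0, Inf), labels = c("low", "high"))
  Y <- rbinom(n, 1, prob = (1 + exp(as.numeric(X) + Z))^(-1))
  dd <- data.frame(Z, X, Y)

  fit_std <- stdReg2::standardize_glm(
    formula = Y ~ X * Z,
    family = "binomial",
    data = dd,
    values = list(X = c("low", "high")),
    contrasts = c("difference", "ratio"),
    reference = "low"
  )

  # Standardized means and their covariance (the `res` component)
  print(fit_std$res$estimates)
  print(fit_std$res$covariance)

  # One element per requested contrast (the `res_contrast` component)
  for (contrast_result in fit_std$res_contrast) {
    print(contrast_result$contrast)
    print(contrast_result$est_table)
  }
}
```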

References

Rothman K.J., Greenland S., Lash T.L. (2008). Modern Epidemiology, 3rd edition. Lippincott, Williams & Wilkins.

Sjölander A. (2016). Regression standardization with the R-package stdReg. European Journal of Epidemiology 31(6), 563-574.

Sjölander A. (2018). Estimation of causal effect measures with the R-package stdReg. European Journal of Epidemiology 33(9), 847-858.

Examples


# basic example
# requires a correctly specified outcome model and no unmeasured confounders
# (plus the standard causal assumptions)
set.seed(6)
n <- 100
Z <- rnorm(n)
X <- cut(rnorm(n, mean = Z), breaks = c(-Inf, 0, Inf), labels = c("low", "high"))
Y <- rbinom(n, 1, prob = (1 + exp(as.numeric(X) + Z))^(-1))
dd <- data.frame(Z, X, Y)
x <- standardize_glm(
  formula = Y ~ X * Z,
  family = "binomial",
  data = dd,
  values = list(X = c("low", "high")),
  contrasts = c("difference", "ratio"),
  reference = "low"
)
x
# different transformations of causal effects

# example from Sjölander (2016) with case-control data
# here the matching variable needs to be passed as an argument
singapore <- AF::singapore
Mi <- singapore$Age
m <- mean(Mi)
s <- sd(Mi)
d <- 5
standardize_glm(
  formula = Oesophagealcancer ~ (Everhotbev + Age + Dial + Samsu + Cigs)^2,
  family = binomial, data = singapore,
  values = list(Everhotbev = 0:1), clusterid = "Set",
  case_control = TRUE,
  matched_density_cases = function(x) dnorm(x, m, s),
  matched_density_controls = function(x) dnorm(x, m - d, s),
  matching_variable = Mi,
  p_population = 19.3 / 100000
)

# multiple exposures
set.seed(7)
n <- 100
Z <- rnorm(n)
X1 <- rnorm(n, mean = Z)
X2 <- rnorm(n)
Y <- rbinom(n, 1, prob = (1 + exp(X1 + X2 + Z))^(-1))
dd <- data.frame(Z, X1, X2, Y)
x <- standardize_glm(
  formula = Y ~ X1 + X2 + Z,
  family = "binomial",
  data = dd, values = list(X1 = 0:1, X2 = 0:1),
  contrasts = c("difference", "ratio"),
  reference = c(X1 = 0, X2 = 0)
)
x
tidy(x)

# continuous exposure
set.seed(2)
n <- 100
Z <- rnorm(n)
X <- rnorm(n, mean = Z)
Y <- rnorm(n, mean = X + Z + 0.1 * X^2)
dd <- data.frame(Z, X, Y)
x <- standardize_glm(
  formula = Y ~ X * Z,
  family = "gaussian",
  data = dd,
  values = list(X = seq(-1, 1, 0.1))
)

# plot standardized mean as a function of x
plot(x)
# plot standardized mean - standardized mean at x = 0 as a function of x
plot(x, contrast = "difference", reference = 0)

stdReg2 documentation built on April 13, 2025, 5:12 p.m.