AFglm
estimates the model-based adjusted attributable fraction for data from a logistic regression model in the form of a glm
object. This model is commonly used for data from a cross-sectional or non-matched case-control sampling design.
object |
a fitted logistic regression model object of class " |
data |
an optional data frame, list or environment (or object coercible by |
exposure |
the name of the exposure variable as a string. The exposure must be binary (0/1) where unexposed is coded as 0. |
clusterid |
the name of the cluster identifier variable as a string, if data are clustered. Cluster robust standard errors will be calculated. |
case.control |
can be set to |
AFglm estimates the attributable fraction for a binary outcome Y under the hypothetical scenario where a binary exposure X is eliminated from the population.
The estimate is adjusted for confounders Z by logistic regression using the glm function.
The estimation strategy differs between cross-sectional and case-control sampling designs, even if the underlying logistic regression model is the same.
For cross-sectional sampling designs the AF can be defined as
AF = 1 - Pr(Y0 = 1) / Pr(Y = 1)
where Pr(Y0 = 1) denotes the counterfactual probability of the outcome had the exposure been eliminated from the population, and Pr(Y = 1) denotes the factual probability of the outcome.
If Z is sufficient for confounding control, then Pr(Y0 = 1) can be expressed as
E_z{Pr(Y = 1 | X = 0, Z)}.
The function uses logistic regression to estimate Pr(Y = 1 | X = 0, Z), and the marginal sample distribution of Z to approximate the outer expectation (Sjölander and Vansteelandt, 2011).
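As a rough illustration of the cross-sectional estimator described above, the point estimate can be sketched with base R alone: fit the logistic model, predict each individual's risk with the exposure set to 0, and average over the sample distribution of Z. This is a minimal sketch under assumptions, not the internal implementation of AFglm. The simulated data and the names dat, dat0, P_hat, P0_hat and AF_hat are illustrative only, and no variance is computed.

```r
# Illustrative sketch (not AFglm internals): plug-in AF for a cross-sectional sample
set.seed(1)
expit <- function(x) 1 / (1 + exp(-x))
n <- 5000
Z <- rnorm(n)
X <- rbinom(n, size = 1, prob = expit(Z))
Y <- rbinom(n, size = 1, prob = expit(Z + X))
dat <- data.frame(Y, X, Z)

# Logistic regression of the outcome on exposure and confounder
fit <- glm(Y ~ X + Z, family = binomial, data = dat)

# Factual outcome probability Pr(Y = 1), estimated by the sample proportion
P_hat <- mean(dat$Y)

# Counterfactual probability Pr(Y0 = 1): set X = 0 for everyone, then average
# the predicted risks over the observed distribution of Z
dat0 <- transform(dat, X = 0)
P0_hat <- mean(predict(fit, newdata = dat0, type = "response"))

AF_hat <- 1 - P0_hat / P_hat
AF_hat
```

For standard errors and confidence intervals, the AFglm call shown in the examples is the intended route; this sketch only mirrors the point estimate.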
For case-control sampling designs the outcome prevalence is fixed by the sampling design, so absolute probabilities (P.est and P0.est) cannot be estimated.
Instead, adjusted log odds ratios (log.or) are estimated for each individual.
This is done by setting case.control to TRUE. It is then assumed that the outcome is rare, so that the risk ratio can be approximated by the odds ratio.
For case-control sampling designs the AF can be defined as (Bruzzi et al., 1985)
AF = 1 - Pr(Y0 = 1) / Pr(Y = 1)
where Pr(Y0 = 1) denotes the counterfactual probability of the outcome had the exposure been eliminated from the population. If Z is sufficient for confounding control, then the probability Pr(Y0 = 1) can be expressed as
Pr(Y0 = 1) = E_z{Pr(Y = 1 | X = 0, Z)}.
Using Bayes' theorem this implies that the AF can be expressed as
AF = 1 - E_z{Pr( Y = 1 | X = 0, Z)} / Pr(Y = 1) = 1 - E_z{RR^{-X} (Z) | Y = 1}
where RR(Z) is the risk ratio
Pr(Y = 1 | X = 1,Z)/Pr(Y=1 | X = 0, Z).
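The Bayes' theorem step can be spelled out as follows. This is a sketch of the standard argument for binary X, reading the conditional expectation given Y = 1 as taken over the joint distribution of (X, Z) among cases. The summation notation is ours, and for continuous Z the sums over z become integrals.

```latex
\begin{align*}
E\{RR^{-X}(Z) \mid Y = 1\}
  &= \sum_{x,z} RR(z)^{-x} \Pr(X = x, Z = z \mid Y = 1) \\
  &= \sum_{x,z} RR(z)^{-x}
     \frac{\Pr(Y = 1 \mid X = x, Z = z)\,\Pr(X = x, Z = z)}{\Pr(Y = 1)} \\
  &= \sum_{x,z}
     \frac{\Pr(Y = 1 \mid X = 0, Z = z)\,\Pr(X = x, Z = z)}{\Pr(Y = 1)} \\
  &= \frac{E_z\{\Pr(Y = 1 \mid X = 0, Z)\}}{\Pr(Y = 1)}
\end{align*}
```

The third line uses the identity RR(z)^{-x} Pr(Y = 1 | X = x, Z = z) = Pr(Y = 1 | X = 0, Z = z), which holds for both x = 0 and x = 1 by the definition of RR(Z).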
Moreover, the risk ratio can be approximated by the odds ratio if the outcome is rare. Thus,
AF is approximately 1 - E_z{OR^{-X}(Z) | Y = 1}.
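The rare-outcome approximation above can be sketched in base R by averaging OR^{-X} over the sampled cases. This is a minimal sketch under assumptions, not the internal implementation of AFglm: the simulated population, the intercept of -5, and the names cc, log_or and AF_cc are illustrative, the model has no X:Z interaction (so the odds ratio is constant), and no variance is computed.

```r
# Illustrative sketch (not AFglm internals): rare-outcome AF from case-control data
set.seed(2)
expit <- function(x) 1 / (1 + exp(-x))
NN <- 200000   # source population size (hypothetical)
n <- 500       # number of cases and of controls sampled
Z <- rnorm(NN)
X <- rbinom(NN, size = 1, prob = expit(Z))
Y <- rbinom(NN, size = 1, prob = expit(-5 + X + Z))
population <- data.frame(Z, X, Y)

# Draw n cases and n controls from the population
cc <- population[c(sample(which(population$Y == 1), n),
                   sample(which(population$Y == 0), n)), ]

fit <- glm(Y ~ X + Z, family = binomial, data = cc)

# Without an X:Z interaction, the log odds ratio is the same for everyone
log_or <- rep(coef(fit)[["X"]], nrow(cc))

# AF is approximately 1 - E{OR^(-X) | Y = 1}: average over the cases only
cases <- cc$Y == 1
AF_cc <- 1 - mean(exp(-log_or[cases] * cc$X[cases]))
AF_cc
```

Unexposed cases contribute OR^0 = 1 to the average, so only exposed cases pull the estimate away from zero.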
If clusterid is supplied, then a clustered sandwich formula is used in all variance calculations.
AF.est |
estimated attributable fraction. |
AF.var |
estimated variance of |
P.est |
estimated factual proportion of cases; Pr(Y=1). Returned by default when |
P.var |
estimated variance of |
P0.est |
estimated counterfactual proportion of cases if exposure would be eliminated; Pr(Y0=1). Returned by default when |
P0.var |
estimated variance of |
log.or |
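a vector of the estimated log odds ratio for every individual. If the fitted model is logit{Pr(Y=1|X,Z)} = α + βX + γZ, then the log odds ratio is β for every individual; if the fitted model is logit{Pr(Y=1|X,Z)} = α + βX + γZ + ψXZ, then it is β + ψZ. |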
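For the interaction model in the log.or entry above, the individual log odds ratio works out to β + ψZ, which is the difference in the linear predictor between X = 1 and X = 0. A minimal base R sketch of that calculation follows. The simulated data and the names dat, b, log_or, lp1 and lp0 are illustrative, and this is not claimed to be how AFglm computes log.or internally.

```r
# Illustrative sketch: individual log odds ratios from a model with X:Z interaction
set.seed(3)
expit <- function(x) 1 / (1 + exp(-x))
n <- 1000
Z <- rnorm(n)
X <- rbinom(n, size = 1, prob = expit(Z))
Y <- rbinom(n, size = 1, prob = expit(Z + X))
dat <- data.frame(Y, X, Z)

fit <- glm(Y ~ X + Z + X * Z, family = binomial, data = dat)
b <- coef(fit)

# beta + psi * Z: one log odds ratio per individual
log_or <- unname(b[["X"]] + b[["X:Z"]] * dat$Z)

# Cross-check: difference in linear predictors between X = 1 and X = 0
lp1 <- predict(fit, newdata = transform(dat, X = 1), type = "link")
lp0 <- predict(fit, newdata = transform(dat, X = 0), type = "link")
all.equal(log_or, unname(lp1 - lp0))
```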
Elisabeth Dahlqwist, Arvid Sjölander
Bruzzi, P., Green, S. B., Byar, D., Brinton, L. A., and Schairer, C. (1985). Estimating the population attributable risk for multiple risk factors using case-control data. American Journal of Epidemiology 122, 904-914.
Greenland, S. and Drescher, K. (1993). Maximum Likelihood Estimation of the Attributable Fraction from logistic Models. Biometrics 49, 865-872.
Sjölander, A. and Vansteelandt, S. (2011). Doubly robust estimation of attributable fractions. Biostatistics 12, 112-121.
glm, used for fitting the logistic regression model. For conditional logistic regression (commonly used for data from a matched case-control sampling design), see AFclogit.
library(AF)

# Simulate a cross-sectional sample
expit <- function(x) 1 / (1 + exp( - x))
n <- 1000
Z <- rnorm(n = n)
X <- rbinom(n = n, size = 1, prob = expit(Z))
Y <- rbinom(n = n, size = 1, prob = expit(Z + X))
# Example 1: non clustered data from a cross-sectional sampling design
data <- data.frame(Y, X, Z)
# Fit a glm object
fit <- glm(formula = Y ~ X + Z + X * Z, family = binomial, data = data)
# Estimate the attributable fraction from the fitted logistic regression
AFglm_est <- AFglm(object = fit, data = data, exposure = "X")
summary(AFglm_est)
# Example 2: clustered data from a cross-sectional sampling design
# Duplicate observations in order to create clustered data
id <- rep(1:n, 2)
data <- data.frame(id = id, Y = c(Y, Y), X = c(X, X), Z = c(Z, Z))
# Fit a glm object
fit <- glm(formula = Y ~ X + Z + X * Z, family = binomial, data = data)
# Estimate the attributable fraction from the fitted logistic regression
AFglm_clust <- AFglm(object = fit, data = data,
                     exposure = "X", clusterid = "id")
summary(AFglm_clust)
# Example 3: non matched case-control
# Simulate a sample from a non matched case-control sampling design
# Make the outcome a rare event by setting the intercept to -6
expit <- function(x) 1 / (1 + exp( - x))
NN <- 1000000
n <- 500
intercept <- -6
Z <- rnorm(n = NN)
X <- rbinom(n = NN, size = 1, prob = expit(Z))
Y <- rbinom(n = NN, size = 1, prob = expit(intercept + X + Z))
population <- data.frame(Z, X, Y)
Case <- which(population$Y == 1)
Control <- which(population$Y == 0)
# Sample cases and controls from the population
case <- sample(Case, n)
control <- sample(Control, n)
data <- population[c(case, control), ]
# Fit a glm object
fit <- glm(formula = Y ~ X + Z + X * Z, family = binomial, data = data)
# Estimate the attributable fraction from the fitted logistic regression
AFglm_est_cc <- AFglm(object = fit, data = data, exposure = "X", case.control = TRUE)
summary(AFglm_est_cc)
Loading required package: survival
Loading required package: drgee
Loading required package: nleqslv
Loading required package: Rcpp
Loading required package: data.table
Loading required package: stdReg
Call:
glm(formula = Y ~ X + Z + X * Z, family = binomial, data = data)
Estimated attributable fraction (AF) and untransformed 95% Wald CI:
        AF  Std.Error  z value     Pr(>|z|) Lower limit Upper limit
 0.1510194 0.02967754 5.088676 3.605721e-07  0.09285246   0.2091863
Exposure : X
Outcome : Y
 Observations Cases
         1000   574
Method for confounder adjustment: Logistic regression
Formula: Y ~ X + Z + X * Z
Call:
glm(formula = Y ~ X + Z + X * Z, family = binomial, data = data)
Estimated attributable fraction (AF) and untransformed 95% Wald CI:
        AF  Robust SE  z value     Pr(>|z|) Lower limit Upper limit
 0.1510194 0.02967754 5.088676 3.605721e-07  0.09285246   0.2091863
Exposure : X
Outcome : Y
 Observations Cases Clusters
         2000  1148     1000
Method for confounder adjustment: Logistic regression
Formula: Y ~ X + Z + X * Z
Call:
glm(formula = Y ~ X + Z + X * Z, family = binomial, data = data)
Estimated attributable fraction (AF) and untransformed 95% Wald CI:
        AF Std.Error  z value     Pr(>|z|) Lower limit Upper limit
 0.4836923 0.1005624 4.809874 1.510258e-06   0.2865937   0.6807909
Exposure : X
Outcome : Y
 Observations Cases
         1000   500
Method for confounder adjustment: Logistic regression
Formula: Y ~ X + Z + X * Z
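The "untransformed 95% Wald CI" reported in the first summary above can be reproduced from the printed estimate and standard error. The sketch below uses only base R and the two numbers copied from that output; the names af, se, z, p and ci are illustrative, and small differences in the last digits are expected because the inputs are rounded.

```r
# Values copied from the first summary output above
af <- 0.1510194
se <- 0.02967754

# Wald statistic, two-sided p-value, and untransformed 95% interval
z <- af / se
p <- 2 * pnorm(-abs(z))
ci <- af + c(-1, 1) * qnorm(0.975) * se

z
p
ci
```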