View source: R/impact_fraction.R
impact_fraction | R Documentation |
General calculations of impact fractions
impact_fraction(
  model,
  data,
  new_data,
  calculation_method = "B",
  prev = NULL,
  ci = FALSE,
  boot_rep = 50,
  t_vector = NULL,
  ci_level = 0.95,
  ci_type = c("norm"),
  weight_vec = NULL,
  verbose = TRUE
)
model |
A fitted model object of class clogit, glm or coxph. Non-linear effects should be specified via ns(x, df=y), where ns is the natural spline function from the splines package. |
data |
A dataframe containing variables used for fitting the model |
new_data |
A dataframe (of the same variables and size as data) representing an alternative distribution of risk factors |
calculation_method |
Character, either 'B' (Bruzzi) or 'D' (Direct method). For case-control data, the method described in Bruzzi et al. (1985) is recommended. Bruzzi's method estimates the population attributable fraction (PAF) from relative risks and the prevalence of exposure to the risk factor. The Direct method estimates PAF by summing, at the individual level, the estimated probabilities of disease in the absence of exposure |
prev |
Numeric. Estimated prevalence of disease. This only needs to be specified if the data come from a case-control study and the direct method is used |
ci |
Logical. If TRUE, a bootstrap confidence interval is computed along with point estimate (default FALSE) |
boot_rep |
Integer. Default 50. Number of bootstrap replications (only used if ci=TRUE) |
t_vector |
Numeric. A vector of times at which to calculate PAF (only needed if model is a coxph object) |
ci_level |
Numeric. Default 0.95. A number between 0 and 1 specifying the confidence level |
ci_type |
Character. Default "norm". A vector specifying the types of confidence interval desired. The available methods are "norm", "basic", "perc" and "bca" |
weight_vec |
An optional vector of inverse sampling weights for survey data. The variance will not be calculated correctly if sampling isn't independent. If prev is specified, this vector is ignored and the weights are instead calibrated so that the weighted sample prevalence of disease equals prev. |
verbose |
A logical indicator for whether extended output is produced when ci=TRUE, default TRUE |
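The two calculation methods described under calculation_method can be illustrated with a short base R sketch. This is not the package's internal code. It assumes simulated data, a plain glm logistic model, and the common individual-level forms of the two estimators, with the odds ratio standing in for the relative risk in the Bruzzi-style calculation. All object names (if_direct, if_bruzzi and so on) are hypothetical.

```r
# Illustrative sketch only: simulated data and a plain glm,
# not the internals of impact_fraction().
set.seed(1)
n <- 2000
data <- data.frame(
  age      = rnorm(n, mean = 60, sd = 10),
  exercise = rbinom(n, size = 1, prob = 0.5)  # 1 = physically inactive
)
# Simulate disease status from a logistic model
lp <- -3 + 0.03 * data$age + 0.7 * data$exercise
data$case <- rbinom(n, size = 1, prob = plogis(lp))

model <- glm(case ~ age + exercise, family = binomial, data = data)

# Alternative risk factor distribution: 20% of inactive people become active
new_data <- data
inactive <- which(data$exercise == 1)
switched <- sample(inactive, size = floor(0.2 * length(inactive)))
new_data$exercise[switched] <- 0

# Direct-style estimate: compare average predicted disease probabilities
# under the observed and the alternative risk factor distributions
p_obs <- predict(model, newdata = data, type = "response")
p_new <- predict(model, newdata = new_data, type = "response")
if_direct <- (mean(p_obs) - mean(p_new)) / mean(p_obs)

# Bruzzi-style estimate: average, over cases only, of the relative risk
# (approximated here by the odds ratio) of the new versus observed profile
lp_obs <- predict(model, newdata = data, type = "link")
lp_new <- predict(model, newdata = new_data, type = "link")
is_case <- data$case == 1
if_bruzzi <- 1 - mean(exp(lp_new[is_case] - lp_obs[is_case]))

c(direct = if_direct, bruzzi = if_bruzzi)
```

The two numbers need not agree in this sketch, because the simulated disease is common and the odds ratio then overstates the relative risk.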
If ci=FALSE, a numeric estimated impact fraction or, for survival data, a vector of estimated impact fractions corresponding to event times in the data. If ci=TRUE, the estimated impact fractions and other information are bundled into an object of class IF_summary.
Bruzzi, P., Green, S.B., Byar, D.P., Brinton, L.A. and Schairer, C., 1985. Estimating the population attributable risk for multiple risk factors using case-control data. American Journal of Epidemiology, 122(5), pp. 904-914.
library(splines)
library(survival)
# Build an alternative risk factor distribution in which 20% of the
# physically inactive patients (exercise == 1) become active
new_data <- stroke_reduced
inactive_patients <- which(stroke_reduced$exercise == 1)
N_inactive <- length(inactive_patients)
newly_active_patients <- sample(inactive_patients, floor(0.2 * N_inactive))
new_data$exercise[newly_active_patients] <- 0
# Conditional logistic regression with natural splines for non-linear effects
model_exercise <- clogit(formula = case ~ age + education + exercise +
    ns(diet, df = 3) + smoking + alcohol + stress + ns(lipids, df = 3) +
    ns(waist_hip_ratio, df = 3) + high_blood_pressure + strata(strata),
  data = stroke_reduced)
# Impact fraction for the shift from stroke_reduced to new_data (Bruzzi method)
impact_fraction(model = model_exercise, data = stroke_reduced,
  new_data = new_data, calculation_method = "B")
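The ci, boot_rep, ci_level and ci_type arguments describe a bootstrap confidence interval. The sketch below shows the general idea with the boot package on simulated data. It is an illustration under stated assumptions, not the code that impact_fraction runs: the data, the glm model, the helper if_stat and the column exposure_new are all hypothetical.

```r
library(boot)

# Illustrative sketch only: simulated data, hypothetical names.
set.seed(2)
n <- 1000
dat <- data.frame(exposure = rbinom(n, size = 1, prob = 0.4))
dat$case <- rbinom(n, size = 1, prob = plogis(-2 + 0.8 * dat$exposure))

# Alternative exposure distribution: 20% of the exposed become unexposed
dat$exposure_new <- dat$exposure
exposed <- which(dat$exposure == 1)
dat$exposure_new[sample(exposed, floor(0.2 * length(exposed)))] <- 0

# Statistic for boot(): refit the model on the resampled rows and
# return a direct-style impact fraction for that resample
if_stat <- function(d, idx) {
  d <- d[idx, ]
  fit <- glm(case ~ exposure, family = binomial, data = d)
  d_new <- d
  d_new$exposure <- d$exposure_new
  p_obs <- predict(fit, newdata = d, type = "response")
  p_new <- predict(fit, newdata = d_new, type = "response")
  (mean(p_obs) - mean(p_new)) / mean(p_obs)
}

# R plays the role of boot_rep; conf and type mirror ci_level and ci_type
boot_out <- boot(data = dat, statistic = if_stat, R = 200)
ci_out <- boot.ci(boot_out, conf = 0.95, type = c("norm", "perc"))

boot_out$t0      # point estimate on the full sample
ci_out$normal    # columns: level, lower limit, upper limit
ci_out$percent   # columns: level, two order indices, lower limit, upper limit
```

A small number of replications keeps the run time short but gives a noisy interval, so a larger value is usually preferable for reported results.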