pif: Potential Impact Fraction
In pifpaf: Potential Impact Fraction and Population Attributable Fraction for Cross-Sectional Data

Description Usage Arguments Details Value Note Author(s) References See Also Examples

Function for estimating the Potential Impact Fraction pif from a cross-sectional sample of the exposure X with known Relative Risk function rr with parameter theta, where the Potential Impact Fraction is given by:

PIF = (mean(rr(X; theta)) - mean(rr(cft(X);theta)))/mean(rr(X; theta)) .

pif(X, thetahat, rr, cft = NA, method = "empirical",
  weights = rep(1/nrow(as.matrix(X)), nrow(as.matrix(X))), Xvar = var(X),
  deriv.method.args = list(), deriv.method = "Richardson", adjust = 1,
  n = 512, ktype = "gaussian", bw = "SJ", check_exposure = TRUE,
  check_integrals = TRUE, check_rr = TRUE, is_paf = FALSE)

`X`	Random sample (`data.frame`) which includes exposure and covariates or sample `mean` if `"approximate"` method is selected.
`thetahat`	Asymptotically consistent or Fisher consistent estimator (`vector`) of `theta` for the Relative Risk function.
`rr`	`function` for Relative Risk which uses parameter `theta`. The order of the parameters should be `rr(X, theta)`. Optional
`cft`	Function `cft(X)` for counterfactual. Leave empty for the Population Attributable Fraction `paf` where counterfactual is that of a theoretical minimum risk exposure `X0` such that `rr(X0,theta) = 1`.
`method`	Either `"empirical"` (default), `"kernel"` or `"approximate"`.
`weights`	Normalized survey `weights` for the sample `X`.
`Xvar`	Variance of exposure levels (for `"approximate"` method).
`deriv.method.args`	`method.args` for `hessian` (for `"approximate"` method).
`deriv.method`	`method` for `hessian`. Don't change this unless you know what you are doing (for `"approximate"` method).
`adjust`	Adjust bandwith parameter (for `"kernel"` method) from `density`.
`n`	Number of equally spaced points at which the density (for `"kernel"` method) is to be estimated (see `density`).
`ktype`	`kernel` type: `"gaussian"`, `"epanechnikov"`, `"rectangular"`, `"triangular"`, `"biweight"`, `"cosine"`, `"optcosine"` (for `"kernel"` method). Additional information on kernels in `density`.
`bw`	Smoothing bandwith parameter (for `"kernel"` method) from `density`. Default `"SJ"`.
`check_exposure`	Check exposure `X` is positive and numeric.
`check_integrals`	Check that counterfactual of theoretical minimum risk exposure and relative risk's expected values are well defined for this scenario.
`check_rr`	Check that Relative Risk function `rr` equals `1` when evaluated at `0`.
`is_paf`	Boolean forcing evaluation of `paf`. This forces the `pif` function ignore the inputed counterfactual and set the relative risk to the theoretical minimum risk value of `1`.

The "empirical" method estimates the pif by

PIF = 1 - weighted.mean(rr(cft(X), theta), weights)/ weighted.mean(rr(cft(X), theta), weights).

The "kernel" method approximates the density of the exposure X and estimates its expected value from that approximation:

PIF = 1 - integrate(rr(cft(X), theta)*f(x))/integrate(rr(X, theta)*f(x)).

The "approximate" method conducts a Laplace approximation of the pif. Additional information on the methods is dicussed in the package's vignette: browseVignettes("pifpaf").

In practice "approximate" method should be the last choice. Simulations have shown that "empirical"'s convergence is faster than "kernel" for most functions. In addition, the scope of "kernel" is limited as it does not work with multivariate exposure data X.

pif Estimate of Potential Impact Fraction.

For more information on kernels see density.

Do not use the $ operator when using "approximate" method.

Rodrigo Zepeda-Tello rzepeda17@gmail.com

Dalia Camacho-Garc<c3><ad>a-Forment<c3><ad> daliaf172@gmail.com

Vander Hoorn, S., Ezzati, M., Rodgers, A., Lopez, A. D., & Murray, C. J. (2004). Estimating attributable burden of disease from exposure and hazard data. Comparative quantification of health risks: global and regional burden of disease attributable to selected major risk factors. Geneva: World Health Organization, 2129-40.

See pif.confidence for confidence interval estimation, and paf for Population Attributable Fraction estimation.

Sensitivity analysis plots can be done with pif.plot, pif.sensitivity, and pif.heatmap.

#Example 1: Exponential Relative Risk
#--------------------------------------------
set.seed(18427)
X        <- data.frame(Exposure = rnorm(100,3,1))
thetahat <- 0.12
rr       <- function(X, theta){exp(theta*X)}

#Without specifying counterfactual pif matches paf
pif(X, thetahat, rr)
paf(X, thetahat, rr)

#Same example with kernel method
pif(X, thetahat, rr, method = "kernel")

#Same example with approximate method
Xmean <- data.frame(Exposure = mean(X[,"Exposure"]))
Xvar  <- var(X[,"Exposure"])
pif(Xmean, thetahat, rr, method = "approximate", Xvar = Xvar)

#Same example considering counterfactual of halving exposure
cft   <- function(X){ 0.5*X }
pif(X, thetahat, rr, cft, method = "empirical")

#Example 2: Linear Relative Risk
#--------------------------------------------
set.seed(18427)
X        <- data.frame(Exposure = rbeta(100,3,1))
thetahat <- 0.12
rr       <- function(X, theta){theta*X + 1}
cft      <-  function(X){ 0.5*X }
weights             <- runif(100)
normalized_weights  <- weights/sum(weights)
pif(X, thetahat, rr, cft, weights = normalized_weights)

#Same example with more complex counterfactual that reduces 
#only the values > 0.75 are halved
cft       <- function(X){

   #Indentify the ones with "a lot" of exposure:
   where_excess_exposure    <- which(X[,"Exposure"] > 0.75)             
   
   #Halve their exposure
   X[where_excess_exposure, "Exposure"] <- 
            X[where_excess_exposure, "Exposure"]/2  
   return(X)
}
pif(X, thetahat, rr, cft, weights = normalized_weights)


#Example 3: Multivariate Linear Relative Risk
#--------------------------------------------
set.seed(18427)
X1       <- rnorm(100,4,1)
X2       <- rnorm(100,2,0.4)
X        <- data.frame(Exposure = X1, Covariate = X2)
thetahat <- c(0.12, 0.03)

#When creating relative risks and counterfactuals avoid using $ operator
#as it doesn't work under approximate method
rr_not    <- function(X, theta){
               exp(theta[1]*X$Exposure + theta[2]*X$Covariate)
             }
rr_better <- function(X, theta){
               exp(theta[1]*X[,"Exposure"] + theta[2]*X[,"Covariate"])
             }
                                     
#Creating a counterfactual. 
cft  <- function(X){
   Y               <- X
   Y[,"Exposure"]  <- 0.5*X[,"Exposure"]
   Y[,"Covariate"] <- 1.1*X[,"Covariate"] + 1
   return(Y)
}
pif(X, thetahat, rr_better, cft) 

#Same multivariate example for approximate method calculating 
#mean and variance
Xmean <- data.frame(Exposure = mean(X$Exposure), 
                   Covariate = mean(X$Covariate))
Xvar  <- var(X)
pif(Xmean, thetahat, rr_better, method = "approximate", Xvar = Xvar)

## Not run: 
#The one with $ operators doesn't work:
pif(Xmean, thetahat, rr_not, method = "approximate", Xvar = Xvar)

## End(Not run)
## Not run: 
#Warning: Multivariate cases cannot be evaluated with kernel method
pif(X, thetahat, rr_better, method = "kernel") 

## End(Not run)

#Example 4: Categorical Relative Risk & Exposure
#--------------------------------------------
set.seed(18427)
mysample  <- sample(c("Normal","Overweight","Obese"), 100, 
                   replace = TRUE, prob = c(0.4, 0.1, 0.5))
X        <- data.frame(Exposure = mysample)

thetahat <- c(1, 1.2, 1.5)

#Categorical relative risk function
rr <- function(X, theta){

   #Create return vector with default risk of 1
   r_risk <- rep(1, nrow(X))
   
   #Assign categorical relative risk
   r_risk[which(X[,"Exposure"] == "Normal")]      <- thetahat[1]
   r_risk[which(X[,"Exposure"] == "Overweight")]  <- thetahat[2]
   r_risk[which(X[,"Exposure"] == "Obese")]       <- thetahat[3]
   
   return(r_risk)
}

pif(X, thetahat, rr, check_rr = FALSE)

#Counterfactual of reducing all obesity to normality
cft <- function(X){
   X[which(X[,"Exposure"] == "Obese"),] <- "Normal"
   return(X)
}

pif(X, thetahat, rr, cft, check_rr = FALSE)

#Example 5: Categorical Relative Risk & continuous exposure
#----------------------------------------------------------
set.seed(18427)
BMI      <- data.frame(Exposure = rlnorm(100, 3.1, sdlog = 0.1))

#Theoretical minimum risk exposure is at 20kg/m^2 in borderline "Normal" category
BMI_adjusted <- BMI - 20

thetahat <- c(Malnourished = 2.2, Normal = 1, Overweight = 1.8, 
              Obese = 2.5)
              
rr       <- function(X, theta){
     
     #Create return vector with default risk of 1
     r_risk <- rep(1, nrow(X))
   
     #Assign categorical relative risk
     r_risk[which(X[,"Exposure"] < 0)]             <- theta[1] #Malnourished
     r_risk[intersect(which(X[,"Exposure"] >= 0), 
                      which(X[,"Exposure"] < 5))]  <- theta[2] #Normal
     r_risk[intersect(which(X[,"Exposure"] >= 5), 
                      which(X[,"Exposure"] < 10))] <- theta[3] #Overweight
     r_risk[which(X[,"Exposure"] >= 10)]           <- theta[4] #Obese
   
   return(r_risk)
}

#Counterfactual of everyone in normal range
cft <- function(bmi){
     bmi           <- data.frame(rep(2.5, nrow(bmi)), ncol = 1)
     colnames(bmi) <- c("Exposure")
     return(bmi)
}

pif(BMI_adjusted, thetahat, rr, cft, 
    check_exposure = FALSE, method = "empirical")


#Example 6: Bivariate exposure and rr ("classical PAF")
#------------------------------------------------------------------
set.seed(18427)
mysample  <- sample(c("Exposed","Unexposed"), 1000, 
                replace = TRUE, prob = c(0.1, 0.9))
X         <- data.frame(Exposure = mysample)
theta     <- c("Exposed" = 2.5, "Unexposed" = 1.2)   
rr        <- function(X, theta){
   
   #Create relative risk function
   r_risk <- rep(1, nrow(X))
   
   #Assign values of relative risk
   r_risk[which(X[,"Exposure"] == "Unexposed")] <- theta["Unexposed"]
   r_risk[which(X[,"Exposure"] == "Exposed")]   <- theta["Exposed"]
   
   return(r_risk)
}      

#Counterfactual of reducing the exposure in half of the individuals
cft <- function(X){

   #Find out which ones are exposed
   Xexp  <- which(X[,"Exposure"] == "Exposed")
   
   #Use only half of the exposed randomly
   reduc <- sample(Xexp, length(Xexp)/2)
   
   #Unexpose those individuals
   X[reduc, "Exposure"] <- "Unexposed"
   
   return(X)
}

pif(X, theta, rr, cft)

#Example 7: Continuous exposure, several covariates
#------------------------------------------------------------------
X <- data.frame(Exposure = rbeta(100, 2, 3),
                Age      = runif(100, 20, 100),
                Sex      = sample(c("M","F"), 100, replace = TRUE),
                BMI      = rlnorm(100, 3.2, 0.2))
thetahat <- c(-0.1, 0.05, 0.2, -0.4, 0.3, 0.1)

rr <- function(X, theta){
     #Create risk vector
     Risk    <- rep(1, nrow(X))
     
     #Identify subpopulations
     males   <- which(X[,"Sex"] == "M")
     females <- which(X[,"Sex"] == "F")
     
     #Calculate population specific rr
     Risk[males] <- theta[1]*X[males,"Exposure"] + 
                                      theta[2]*X[males,"Age"]^2 + 
                                      theta[3]*X[males,"BMI"]/2 
                                     
     Risk[females] <- theta[4]*X[females,"Exposure"] + 
                                      theta[5]*X[females,"Age"]^2 + 
                                      theta[6]*X[females,"BMI"]/2 
                                     
    return(Risk)
}

#Counterfactual of reducing BMI
cft <- function(X){
    excess_bmi           <- which(X[,"BMI"] > 25)
    X[excess_bmi,"BMI"]  <- 25
    return(X)
}

pif(X, thetahat, rr, cft)