knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)

Hacking intervals

"Could a different reasonable researcher analyzing the same data come to a different conclusion?" This is a question that gets to the heart of whether or not a scientific result can be trusted, yet it's a question that traditional statistical inference has little to say about.

We introduce the hacking interval, which is the range of a numerical scientific result that could be obtained over a set of reasonable dataset and hyperparameter manipulations. Hacking intervals come in two varieties:

This package computes tethered and prescriptively-constrained hacking intervals for linear models. We also compute an interval that considers both types of hacking at once; in other words, it is the range result that could be achieved by a model that fits the data almost as well as any model obtainable via the prescriptive constraints. At most one of the manipulations in the prescriptive constraints is permitted at a time.

Installation

devtools::install_github("beauCoker/hacking")

Quick demo

Start with a dataset. For this demo we'll generate a toy dataset data.

set.seed(0)

N = 50 # Number of observations
data <- data.frame(
  y = rnorm(N), # Response variable (continuous)
  w = rbinom(N, 1, .5), # Treatment variable (binary)
  X = matrix(rnorm(N*3), nrow=N), # Covariates included in base model
  Z = matrix(rnorm(N*3), nrow=N) # Covariates excluded from base model
)

Next, fit a linear model with lm. We'll call this the "base" model.

mdl <- lm(y ~ w + X.1*X.2, data=data)
(beta_0 <- mdl$coefficients['w'])

So, the ordinary least squares estimate for the coefficient beta_0 on the treatment variable w in the base model is about r trunc(beta_0*100)/100. A standard question in statistics is to ask, "what could happen if I estimated beta_0 using a different dataset drawn from the same distribution?" This is conceptually what a standard confidence interval tells you. It can be computed with R's built-in confint function:

(ci <- confint(mdl)['w',])

Now we get to the hacking interval part. What if instead you ask, "what if the scientist that reported this estimate threw out some important observations, or messed with the data in some other way? What's the range of estimates that could have been reported?" This is conceptually what a hacking interval tells you. For linear models, it can be computed with the hackint_lm function in our package. The parameter theta tells you what percentage of loss is tolerated for the tethered variety of hacking.

library(hacking)
output <- hackint_lm(mdl, data, theta=0.1, treatment = 'w')

In the output above, LB and UB stand for lower bound and upper bound. It says that a tethered hacking interval around the base model is (r trunc(output$tethered*100)/100), a prescriptively constrained hacking interval around the base model is (r trunc(output$constrained*100)/100), and a hacking interval that considers both types hacking is (r trunc(output$tethered_and_constrained*100)/100). Notice either of the tethered intervals are wider than the standard confidence interval, (r trunc(confint(mdl)['w',]*100)/100), but note that hacking intervals and standard confidence invervals measure different forms of uncertainty. Either could be larger, and hacking intervals needn't even be centered on the point estimate.

hackint_lm works by enumerating all of the manipulations within the prescriptive constraints and, for each manipulation, computing the ordinary least squares coefficient estimate as well as a tethered hacking interval around this estimate (i.e., where the model under the manipulation is essentially treated as a new base model). This complete list is available as a dataframe, with Estimate denoting the coefficient estimate and (LB,UB) denoting the tethered hacking interval. The prescriptively-constrained hacking interval is the range of Estimate and the type that considers prescriptive constraints and tethering is given by the mininum of LB and the maximum of UB. This list is useful for diagosing which manipulations are most impactful. The output is sorted by the largest absolute difference largest_diff of any value (LB, Estimate, or UB) from beta_0:

output$hacks_all

Other functionality

Focusing on most influential observations

The optional argument frac_remove_obs (default value 1) specifies the fraction of observations that are considered for removal in evaluating prescriptively-constrained hacking intervals. If frac_remove_obs is less than 1, then only observations with the highest Cook's distance are considered for removal. This will speed up computation for small datasets but does not provide any theoretical guarantees of accuracy.



beauCoker/hacking documentation built on Nov. 4, 2019, 7:08 a.m.