ECICourse::upload_images_to_canvas()
knitr::opts_chunk$set(
  echo = TRUE,
  message = TRUE,
  collapse = TRUE
)

Read along with this tutorial, and run the R code in RStudio as you go (copy and paste, or type directly into R).

Start by loading packages:

library(ECICourse)
library(tidyverse)

Data

The data for this tutorial describe a fictional experiment conducted by a carbonated beverage company. The company wanted to understand how offering retailers a \$.50 discount on six-packs of soda affected retail prices. To clarify: when retailers were given a \$.50 discount on the wholesale price, some, but not all, would drop the retail price in response to this incentive.

The company suspected that the size of the discount was heterogeneous across retailers. Specifically, the company thought that retailers in more affluent areas might be less likely to lower the price. Thus, the soda company collected data on average incomes in the area served by each retailer (income), and used a block randomization scheme to assign the 3,782 retailers to treatment (D == 1) and control (D == 0) conditions. In the treatment condition, retailers received a $.50 discount on the wholesale price. In the control condition, no discount was offered.

The company then obtained average retail prices (avg_price) for its soda for all retailers in the experiment (these data were purchased from a third party). The data describe average prices, because over the course of a week, different customers may end up paying a different price for the same item (e.g., due to use or non-use of coupons and/or discount cards). The third party data were then linked back to the experimental data. The average retail price (avg_price) at each retailer is the outcome variable for the experiment.

To obtain the data, run the following code:

soda_prices <- get_ECI_data('module-5-tutorial')

Average treatment effect

First, we will note that the experiment revealed that average prices were lower among retailers receiving the discount.

library(coin)
independence_test(avg_price ~ D, data = soda_prices,
                  distribution = approximate())
soda_prices |>
  group_by(D) |>
  summarise(mean(avg_price))

The difference in average retail prices is rather substantial, with retail prices about about \$r abs(round(coef(lm(avg_price ~ D, data = soda_prices))['D'], 2)) lower at stores given a \$.50 discount.

The soda company thinks the treatment effects might vary among retailers, so we will explore that possibility next.

Heterogeneous treatment effects

Heterogeneity in treatment effects implies that the variance of the true (unobserved) treatment effect is greater than zero, and thus the variance of potential outcomes under treatment and control are unequal: Var(Y(1)) ≠ Var(Y(0)). We can visualize the variances of prices under treatment and control to see if there is a visible difference:

ggplot(soda_prices,
       aes(x = avg_price,
            y = factor(D, 
                       levels = c(0, 1), 
                       labels = c('Control', 'Treatment')))) +
  geom_jitter() +
  labs(y = NULL, x = 'Average price') +
  theme_bw()

This plot shows two things. First, it shows that average prices are somewhat lower in the treatment condition (as we just confirmed). Second, it shows that prices are more spread out in the treatment condition. The unequal variance across conditions suggests that treatment effect might be heterogeneous.

We can also calculate the variances directly:

soda_prices |>
  group_by(D) |>
  summarise(var(avg_price))

The variance of prices is higher in the treatment group, but it is not clear whether this is due to randomization and sampling error. To find out, we can run a permutation test:

t1 <- independence_test(avg_price ~ D, data = soda_prices, 
                        teststat = "quadratic", distribution = approximate())
t1

The argument teststat = "quadratic" in the call above means we want to test for a difference in variances of avg_price. An equivalent result can be obtained by calculating the squared differences in avg_price from mean(avg_price) ourselves, and then performing a permutation test for a difference in the average value of these squared residuals:

soda2 <- 
  soda_prices |>
  group_by(D) |>
  mutate(e2 = (avg_price - mean(avg_price))^2)
t2 <- independence_test(e2 ~ D, data = soda2, 
                        distribution = approximate())
t2

The p-values obtained from the two tests are the same:

pvalue(t1)
pvalue(t2)

Based on this result (that variances are not the same in the treatment and control conditions), it seems unlikely that the treatment effect---the difference in retail price attributable to the wholesale discount---is constant across retailers.

Conditional average treatment effects

The soda company suspected that the size of the discount might be related to the average income in the area served by the retailer. Because the company collected data on incomes, it is possible to calculate conditional average treatment effects (CATEs) for each income level (low, middle, high-middle, and high):

soda_prices |>
  group_by(income, D) |>
  summarise(avg_price = mean(avg_price)) |>
  pivot_wider(names_from = 'D', values_from = 'avg_price',
              names_prefix = 'D_') |>
  mutate(CATE = D_1 - D_0)

From this, it appears that the change in retail price was greatest among retailers who sell to customers with the lowest incomes, and smallest among retailers who sell to customers with the highest incomes.

An aside: We can also estimate these CATEs using linear regression:

fit <- lm(avg_price ~ income * D, data = soda_prices)
summary(fit)

Notice that in the regression output above, the coefficients named for the middle, high-middle, and high income do not match the CATEs we estimated above, and that there is no coefficient named after the low income group. The reason for this has to do with multicolinearity between the intercept and the discrete levels of income, and the way we specified the regression. In general, you need to be very careful when extracting treatment effects from a regression. We can recover the 4 CATEs using regression, but we need to combine some of the coefficient estimates:

CATE_low <- coef(fit)['D']
CATE_middle <- coef(fit)['D'] + coef(fit)['incomemiddle:D']
CATE_highmiddle <- coef(fit)['D'] + coef(fit)['incomehigh-middle:D']
CATE_high <- coef(fit)['D'] + coef(fit)['incomehigh:D']

tibble(income = c('low', 'middle', 'high-middle', 'high'),
       CATE = c(CATE_low, CATE_middle, CATE_highmiddle, CATE_high))

It is possible to rewrite the regression formula so that it does not include an intercept, and thus the coefficients map directly to each of the 4 CATEs we are interested in:

fit2 <- lm(avg_price ~ 0 + income + D:income, data = soda_prices)
summary(fit2)

It is my experience that people find it difficult to rewrite their regressions in this way. Moreover, this strategy may not always work (e.g. if you express heterogeneity in terms of multiple categorical variables). So in general, you should learn to extract treatment effects from regression output using the predict function and specifying the levels of data you care about:

distinct_soda_prices <-
  soda_prices |> 
  distinct(income, D)
distinct_soda_prices |> 
  mutate(Y_hat = predict(fit, newdata = distinct_soda_prices)) |> 
  pivot_wider(names_from = D, values_from = Y_hat, 
              names_prefix = 'Y_hat_') |> 
  mutate(CATE = Y_hat_1 - Y_hat_0)

Back to the estimates. Are these CATEs also heterogeneous across retailers within each income group? To find out, we can conduct a permutation test within each subset of retailers (stratified by income). To specify stratification by income for the permutation tests, we include the term | income in the formula:

t3 <- 
  independence_test(avg_price ~ D | income, data = soda_prices,
                   distribution = approximate(), teststat = "quadratic")
t3
pvalue(t3)

The test rejects the null of equal variances (for treatment and control) within each income bracket.

We can also compute the variances directly to get a sense of how treatment effect heterogeneity varies across income groups:

soda_prices |>
  group_by(income, D) |>
  summarise(variance = var(avg_price)) |>
  ungroup() |>
  pivot_wider(names_from = 'D', values_from = 'variance',
              names_prefix = 'variance_D_') |>
  mutate(difference_in_variance = variance_D_1 - variance_D_0)

Treatment effect heterogeneity is greater among retailers serving customers with the lowest incomes, and smaller among retailers serving customers with the highest incomes.



jasonmtroos/ECICourse documentation built on Sept. 8, 2023, 9:18 p.m.