estimatr is for (fast) OLS and IV regression with robust standard errors. This document shows how
estimatr integrates with RStudio's
tidyverse suite of packages.
We use the Swiss Fertility and Socioeconomic Indicators data (available in R, description here) to show how
lm_robust works with
purrr. What is shown for
lm_robust here typically applies to all the other
estimatr functions (
The first step to the tidyverse is turning model output into data we can manipulate. The
tidy function converts an
lm_robust object into a data.frame.
library(estimatr) fit <- lm_robust(Fertility ~ Agriculture + Catholic, data = swiss) tidy(fit)
Once a regression fit is a data.frame, you can use any of the
dplyr "verbs" for data manipulation, like
arrange (more on this here).
library(tidyverse) # lm_robust and filter fit %>% tidy %>% filter(term == "Agriculture") # lm_robust and select fit %>% tidy %>% select(term, estimate, std.error) # lm_robust and mutate fit %>% tidy %>% mutate(t_stat = estimate/ std.error, significant = p.value <= 0.05)
ggplot2 offers a number of data visualization tools that are compatible with
fit %>% tidy %>% filter(term != "(Intercept)") %>% ggplot(aes(y = term, x = estimate)) + geom_vline(xintercept = 0, linetype = 2) + geom_point() + geom_errorbarh(aes(xmin = conf.low, xmax = conf.high, height = 0.1)) + theme_bw()
library(ggplot2) ggplot(swiss, aes(x = Agriculture, y = Fertility)) + geom_point() + geom_smooth(method = "lm_robust") + theme_bw()
Note that the functional form can include polynomials. For instance, if the model is $Fertility \sim Agriculture + Agriculture^2 + Agriculture^3$, we can model this in the following way:
library(ggplot2) ggplot(swiss, aes(x = Agriculture, y = Fertility)) + geom_point() + geom_smooth(method = "lm_robust", formula = y ~ poly(x, 3, raw = TRUE)) + theme_bw()
rsample pacakage provides tools for bootstrapping:
library(rsample) boot_out <- bootstraps(data = swiss, 500)$splits %>% map(~ lm_robust(Fertility ~ Catholic + Agriculture, data = analysis(.))) %>% map(tidy) %>% bind_rows(.id = "bootstrap_replicate") kable(head(boot_out))
boot_out is a data.frame that contains estimates from each boostrapped sample. We can then use
dplyr functions to summarize the bootstraps,
tidyr functions to reshape the estimates, and
GGally::ggpairs to visualize them.
boot_out %>% group_by(term) %>% summarise(boot_se = sd(estimate)) # To visualize the sampling distribution library(GGally) boot_out %>% select(bootstrap_replicate, term, estimate) %>% spread(key = term, value = estimate) %>% select(-bootstrap_replicate) %>% ggpairs(lower = list(continuous = wrap("points", alpha = 0.1))) + theme_bw()
purrr provides tools to perform the same operation on every element of a vector. For instance, we may want to estimate a model on different subsets of data. We can use the
map function to do this.
library(purrr) # Running the same model for highly educated and less educated cantons/districts two_subsets <- swiss %>% mutate(HighlyEducated = as.numeric(Education > 8)) %>% split(.$HighlyEducated) %>% map( ~ lm_robust(Fertility ~ Catholic, data = .)) %>% map(tidy) %>% bind_rows(.id = "HighlyEducated") kable(two_subsets, digits =2)
Alternatively, we might want to regress different dependent variables on the same independent variable.
map can be used alongwith
estimatr functions for this purpose as well.
three_outcomes <- c("Fertility", "Education", "Agriculture") %>% map(~ formula(paste0(., " ~ Catholic"))) %>% map(~ lm_robust(., data = swiss)) %>% map_df(tidy) kable(three_outcomes, digits =2)
ggplot2, we can make a coefficient plot:
three_outcomes %>% filter(term == "Catholic") %>% ggplot(aes(x = estimate, y = outcome)) + geom_vline(xintercept = 0, linetype = 2) + geom_point() + geom_errorbarh(aes(xmin = conf.low, xmax = conf.high, height = 0.1)) + ggtitle("Slopes with respect to `Catholic`") + theme_bw()
estimatr functions in the tidyverse is easy once the model outputs have been turned into data.frames. We accomplish this with the
tidy function. After that, so many summary and visualization possibilities open up. Happy tidying!
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.