library("marfx") library("dplyr") library("ggplot2") library("purrr")
Example from Kastner (2007) replicated in Berry et. al (2012).
$$ \begin{aligned}[t] \text{Trade} &= \beta_0 + \beta_C \text{Conflict} + \beta_B \text{(Trade Barriers)} \ & + \beta_{BC} (\text{(Trade Barriers)} \times \text{Conflict}) \ & + \beta Z + \varepsilon \end{aligned} $$
Alexseev (2006) ``Ballot-Box Vigilantism: Ethnic Population Shifts and Xenophobic Voting in Post-Soviet Russia.'' Political Behavior replicated in Berry et. al (2012).
Alexseev (2006) examines how changes in ethnic composition of Russian regions affects the vote share of the Russian nationalist Zhirinovsky Bloc in the 2003 elections of the Russian State Duma.
The model is, $$ \begin{aligned}[t] \text{(Xenophobic Voing)} &= \beta_0 + \beta_{S} \text{(Slavic Share)} \ &+ \beta_{N} \Delta \text{(non-Slavic Share)} \ &+ \beta_{SN} (\text{(Slavic Share)} \times \Delta \text{(non-Slavic Share)} \ &+ \beta Z + \varepsilon \end{aligned} $$ where $Z$ are control variables.
data("alexseev", package = "marfx") mod1 <- lm(xenovote ~ slavicshare * changenonslav + inc9903 + eduhi02 + unemp02 + apt9200 + vsall03 + brdcont, data = alexseev)
Note that even though the alexseev
dataset includes a variable
slavicshare_changenonslav
which is the interaction between Slavic share and $\Delta$ non-Slavic share, this model specifies the interaction in the formula. This makes it much simpler for analysis since the lm
object will contain information about the relationship between the variables and the terms in the regression, so in the analysis we will only need to specify changes in slavicshare
or changenonslav
and the functions can infer that the interaction will need to change. If instead, the varible slavicshare_changenonslav
was used, the lm
object would not know that the variable was an interaction between slavicshare
and changenonslav
and we would need to be careful to always change both the variable of interest and the interaction when considering any marginal or partial effects.
Berry et. al (2012, p. 14) present two marginal effect plots. The marginal effect of Slavic share on xenophobic voting, and the marginal effect of $\Delta$ non-Slavic share on xenophobic voting.
.data <- cross_d(list(changenonslav = seq(min(alexseev$changenonslav), max(alexseev$changenonslav), length.out = 30), slavicshare = mean(alexseev$slavicshare), inc9903 = mean(alexseev$inc9903), eduhi02 = mean(alexseev$eduhi02), unemp02 = mean(alexseev$unemp02), apt9200 = mean(alexseev$apt9200), vsall03 = mean(alexseev$vsall03), brdcont = FALSE)) .data <- cbind(.data, mfx(mod1, data = .data, "slavicshare")) ggplot() + geom_ribbon(data = .data, mapping = aes(x = changenonslav, ymin = conf.low, ymax = conf.high), alpha = 0.3) + geom_line(data = .data, mapping = aes(x = changenonslav, y = estimate)) + geom_rug(data = alexseev, mapping = aes(x = changenonslav)) + geom_hline(yintercept = 0, color = "gray") + scale_x_continuous(name = expression(paste(Delta, "non-Slavic Share")), breaks = seq(-2, 12, by = 2)) + scale_y_continuous(name = "Marginal Effect of Slavic Share", breaks = seq(-.05, .25, by = .05)) + theme_minimal() + theme(panel.grid = element_blank())
The marginal effects of $\Delta$ non-Slavic Share on Xenophobic voting:
.data <- cross_d(list(slavicshare = seq(min(alexseev$slavicshare), max(alexseev$slavicshare), length.out = 30), changenonslav = mean(alexseev$changenonslav), inc9903 = mean(alexseev$inc9903), eduhi02 = mean(alexseev$eduhi02), unemp02 = mean(alexseev$unemp02), apt9200 = mean(alexseev$apt9200), vsall03 = mean(alexseev$vsall03), brdcont = FALSE)) .data <- cbind(.data, mfx(mod1, data = .data, "changenonslav")) ggplot() + geom_ribbon(data = .data, mapping = aes(x = slavicshare, ymin = estimate - 1.96 * std.error, ymax = estimate + 1.96 * std.error), alpha = 0.3) + geom_line(data = .data, mapping = aes(x = slavicshare, y = estimate)) + geom_rug(data = alexseev, mapping = aes(x = slavicshare)) + geom_hline(yintercept = 0, color = "gray") + scale_x_continuous(name = "Slavic Share", breaks = seq(30, 100, by = 10)) + scale_y_continuous(name = expression(paste("Marginal Effect of", Delta, "non-Slavic Share")), breaks = seq(-1, .5, by = .25)) + theme_minimal() + theme(panel.grid = element_blank())
By using the std.error
, this plot has smoother confidence intervals than the one using conf.low
and conf.high
, which are generated from the quantiles of the simulations.
TODO: Add clustered standard errors.
To summarize the effect of Slavic Share and $\Delta$ non-Slavic Share on xenophobic vote share we can calculate the average marginal effects of these variables,
variables <- c("slavicshare" = "Slavic Share", "changenonslav" = "Change non-Slavic Share") .data <- list() for (i in seq_along(variables)) { .data[[i]] <- amfx(mod1, names(variables)[i], data = alexseev) %>% mutate(variable = names(variables)[i], description = variables[i]) } .data <- bind_rows(.data) ggplot(.data, aes(x = description, y = estimate, ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, colour = "gray") + geom_pointrange() + coord_flip() + scale_x_discrete("") + scale_y_continuous("Average Marginal Effects") + theme_minimal() + theme(panel.grid = element_blank())
TODO What to do if interactions are specified in the data rather than the formula.
A logit model of voting turnout.
library("Zelig") data("turnout", package = "Zelig") mod_turnout <- glm(vote ~ poly(age, 2) + race + educate, data = turnout, family = binomial) summary(mod_turnout)
Calculate the average marginal effect of age,
amfx(mod_turnout, "age", data = turnout)
the average marginal effect of all variables,
data.frame(variable = c("age", "race", "educate")) %>% group_by(variable) %>% do({ amfx(mod_turnout, .$variable, data = turnout) })
Calculate the average marginal effects of age at its mean value, holding the other values at their means and race = "white",
summarize(turnout, age = mean(age), educate = mean(educate)) %>% mutate(race = "white") %>% mfx(mod_turnout, "age", data = .)
Calculate the marginal effects of age at each decade, 20 to 80, holding educate
at its mean,
and setting race
to "white":
data_frame(age = seq(20, 80, by = 10)) %>% mutate(race = "white", educate = mean(turnout$educate)) %>% bind_cols(., mfx(mod_turnout, "age", data = .)) %>% ggplot(aes(x = age, y = estimate, ymin = estimate - 2 * std.error, ymax = estimate + 2 * std.error)) + geom_ribbon(alpha = 0.3) + geom_line()
Calculate the average marginal effects of each variable holding the others at representative values within each group
mfx(mod_turnout, "age", data = summarize(group_by(turnout, race), age = mean(age), educate = mean(educate)))
Average finite difference effect of moving from 20 to 80,
afdfx(mod_turnout, data1 = mutate(turnout, age = 20), data2 = mutate(turnout, age = 80))
afdfx(mod_turnout, data1 = mutate(turnout, race = "white"), data2 = mutate(turnout, race = "other"))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.