Cite as:
This is a set of functions to facilitate running MRP models on CCES data. It is a companion to ccesMRPprep.
To install, run:
remotes::install_github("kuriwaki/ccesMRPrun")
The main functions in this package are:
1. fit_brms() for fitting a multilevel model (or fit_brms_binomial)
2. poststrat_draws() for extracting posterior draws for each area
3. summ_sims() for obtaining summary statistics from these draws
4. scatter_45() (in ccesMRPviz) for clearly visualizing the relationship between the truth and the estimate

Steps 1-3 can be done via mrp_onestep().
See below for a demonstration with an example in the state of Georgia.
library(ccesMRPrun)
library(tidyverse)
library(ccesMRPviz)
Model fitting is done through a simple wrapper around brms::brm, with some custom priors and a binomial model as the default.
The two key parts of the workflow are a formula and a dataset. The formula should be a brms formula with a binary variable as the outcome. The data should be individual-level data and contain every variable mentioned in the formula.
form <- response ~ (1|age) + (1 + female |educ) + clinton_vote + (1|cd)
cc_voters <- filter(cces_GA, vv_turnout_gvm == "Voted")
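Before fitting, it can help to confirm that the data really does contain every variable in the formula and that the outcome is binary. The following is a minimal sketch in base R; the `toy` data frame is hypothetical and only stands in for the individual-level survey data.

```r
# Formula from the example above
form <- response ~ (1 | age) + (1 + female | educ) + clinton_vote + (1 | cd)

# Hypothetical toy data standing in for the individual-level survey data
toy <- data.frame(
  response     = c(1, 0, 1),
  age          = c("18-24", "25-34", "65+"),
  female       = c(0, 1, 1),
  educ         = c("HS", "BA", "HS"),
  clinton_vote = c(0.45, 0.45, 0.60),
  cd           = c("GA-01", "GA-01", "GA-05")
)

# Every variable named anywhere in the formula, including grouping terms
needed <- all.vars(form)

# Variables the formula needs but the data lacks (empty if all is well)
missing_vars <- setdiff(needed, names(toy))

# The outcome should only take the values 0 and 1
is_binary <- all(toy$response %in% c(0, 1))
```

If `missing_vars` is non-empty or `is_binary` is `FALSE`, the data needs fixing before it is passed to the fitting function.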
Now fit the model. fit_brms is essentially the brm function with a thin wrapper around it.
fit <- fit_brms(form, cc_voters, verbose = FALSE, .backend = "cmdstanr")
class(fit)
The cmdstanr backend is more lightweight than rstan and takes advantage of the latest improvements in Stan. However, you will need to install the package from GitHub (rather than CRAN) and run the following commands once:
cmdstanr::check_cmdstan_toolchain()
cmdstanr::install_cmdstan(cores = 2)
To avoid this, you can set .backend = "rstan" if you have rstan installed and pre-loaded.
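One way to choose between the two backends automatically is sketched below. The helper `pick_backend()` is hypothetical and not part of ccesMRPrun. It assumes that `cmdstanr::cmdstan_version()` raises an error when no CmdStan installation can be found, and it does not check whether rstan itself is installed.

```r
# Hypothetical helper, not part of ccesMRPrun.
# Returns "cmdstanr" only if both the R package and a CmdStan
# installation can be found; otherwise falls back to "rstan".
pick_backend <- function() {
  if (!requireNamespace("cmdstanr", quietly = TRUE)) {
    return("rstan")
  }
  # Assumption: cmdstan_version() errors when CmdStan is not installed
  ver <- tryCatch(cmdstanr::cmdstan_version(), error = function(e) NULL)
  if (is.null(ver)) "rstan" else "cmdstanr"
}

backend <- pick_backend()

# The result could then be passed on, for example:
# fit <- fit_brms(form, cc_voters, verbose = FALSE, .backend = backend)
```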
We can take predicted values from each of the MCMC draws and aggregate them up to the area of interest.
Here we generate predictions for the poststratification data. We use the built-in acs_GA data here, but refer to ccesMRPprep to build a poststratification dataset of your own.
drw <- poststrat_draws(fit, poststrat_tgt = acs_GA)
drw
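To make the aggregation step concrete, here is a minimal base R sketch of poststratification on made-up numbers. It is not the implementation inside poststrat_draws(). The cells, counts, and predicted probabilities are all hypothetical, and the sketch assumes that each draw's area estimate is the population-weighted average of its cell-level predictions.

```r
# Hypothetical poststratification cells: two areas, with population counts
cells <- data.frame(
  cd = c("A", "A", "B", "B"),
  n  = c(100, 300, 50, 50)
)

# Hypothetical predicted probabilities: one row per draw, one column per cell
pred <- rbind(
  c(0.2, 0.6, 0.4, 0.8),
  c(0.3, 0.5, 0.5, 0.7)
)

# Each cell's share of its own area's population
w <- cells$n / ave(cells$n, cells$cd, FUN = sum)

# Cell-by-area indicator matrix (1 if the cell belongs to the area)
areas <- unique(cells$cd)
ind <- outer(cells$cd, areas, "==") * 1

# Weighted average of the cell predictions within each area, for each draw
area_draws <- pred %*% (w * ind)
colnames(area_draws) <- areas
```

The result has one row per draw and one column per area, so the uncertainty in the model carries through to the area-level estimates.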
We often care about the posterior mean and 95 percent credible intervals of the draws.
mrp_val <- summ_sims(drw)
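As an illustration of what such a summary involves, the sketch below computes a posterior mean and a 95 percent interval per area from a matrix of simulated draws in base R. The draws are randomly generated stand-ins, and the column names `est`, `lower`, and `upper` are hypothetical; they are not the names that summ_sims() returns.

```r
set.seed(1)

# Hypothetical posterior draws: one row per draw, one column per area
draws <- cbind(
  A = rbeta(1000, 40, 60),
  B = rbeta(1000, 70, 30)
)

# Posterior mean and the 2.5th / 97.5th percentiles for each area
summ <- data.frame(
  cd    = colnames(draws),
  est   = colMeans(draws),
  lower = apply(draws, 2, quantile, probs = 0.025),
  upper = apply(draws, 2, quantile, probs = 0.975),
  row.names = NULL
)
```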
Append the truth and baseline direct estimates from the raw sample:
dir_val <- direct_ests(form, cc_voters, area_var = "cd", weight_var = "weight_post")
mrp_val <- summ_sims(drw) %>%
  left_join(elec_GA, by = "cd") %>%
  left_join(dir_val, by = "cd")
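For intuition about these baselines, the following base R sketch computes an unweighted and a weighted mean of the outcome within each area. The survey data is made up, and this is an assumption about what a direct estimate means in general rather than a description of how direct_ests() is implemented. The output columns reuse the names p_raw and p_wt from the plotting code in this README.

```r
# Hypothetical individual-level survey data with survey weights
svy <- data.frame(
  cd       = c("A", "A", "A", "B", "B"),
  response = c(1, 0, 1, 0, 1),
  weight   = c(1, 2, 1, 3, 1)
)

# Split respondents by area, then average the outcome within each area
by_cd <- split(svy, svy$cd)

direct <- data.frame(
  cd    = names(by_cd),
  # Unweighted sample mean
  p_raw = vapply(by_cd, function(d) mean(d$response), numeric(1)),
  # Mean weighted by the survey weights
  p_wt  = vapply(by_cd, function(d) weighted.mean(d$response, d$weight), numeric(1)),
  row.names = NULL
)
```

Neither estimate borrows information across areas, which is why they serve as a baseline against the MRP estimates.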
scatter_45(), from ccesMRPviz, is a wrapper for visualizing how closely the estimates track the truth.
scatter_45(mrp_val,
           clinton_vote,
           p_mrp_est,
           lblvar = cd,
           ubvar = p_mrp_900,
           lbvar = p_mrp_050,
           xlab = "Clinton Vote",
           ylab = "MRP Estimate ")
Compare this with raw estimates:
library(patchwork)

gg_raw <- scatter_45(mrp_val, clinton_vote, p_raw, cd,
                     xlab = "Clinton Vote",
                     ylab = "Raw Average",
                     xlim = c(0.19, 0.85),
                     ylim = c(0.19, 0.85),
                     expand_axes = FALSE)

gg_ygw <- scatter_45(mrp_val, clinton_vote, p_wt, cd,
                     xlab = "Clinton Vote",
                     ylab = "Simple Weighted Average",
                     xlim = c(0.19, 0.85),
                     ylim = c(0.19, 0.85),
                     expand_axes = FALSE)

gg_raw + gg_ygw
It may be easier to store the estimates from all models in long form and plot them together.
# reshape to long
mrp_long <- mrp_val %>%
  select(cd, p_mrp_est, p_raw, p_wt, clinton_vote) %>%
  pivot_longer(-c(cd, clinton_vote), names_to = "model")

# plot
scatter_45(mrp_long,
           clinton_vote,
           value,
           by_form = ~model,
           by_labels = c(p_mrp_est = "MRP", p_raw = "Raw", p_wt = "YouGov Weighted"),
           xlab = "Clinton Vote",
           ylab = "Estimate")