This vignette demonstrates estimation of public attitudes toward abortion from
responses to a single survey item, using the dynamic multi-level regression and
post-stratification (MRP) model implemented in dgmrp()
.
knitr::opts_chunk$set( echo = TRUE, collapse = TRUE, cache = TRUE, comment = "#>") library(dgo)
shape()
prepares input data for use with the modeling functions dgirt()
and
dgmrp()
. Here we use the included opinion
dataset.
dgirt_in_abortion <- shape(opinion, item_names = "abortion", time_name = "year", geo_name = "state", group_names = "race3", geo_filter = c("CA", "GA", "LA", "MA"), id_vars = "source")
In this call to shape()
we specified:
abortion
);year
), since dgo models are dynamic;state
and race3
),
because dgo models are group-level.Notice that we named only one of these variables defining respondent groups
using the group_names
argument. The geo_name
argument always takes the
variable giving respondents' local geographic area; it will be modeled
differently.
Using the argument geo_filter
, we subset the input data to the given values of
the geo_name
variable. And with the id_vars
argument, we named an identfier
that we'd like to keep in the processed data. (Other unused variables will be
dropped.)
summary()
gives a high-level description of the result.
summary(dgirt_in_abortion)
get_n()
and get_item_n()
give response counts.
get_n(dgirt_in_abortion, by = "state") get_item_n(dgirt_in_abortion, by = "year")
dgmrp()
fits a dynamic multi-level regression and post-stratification (MRP)
model to data processed by shape()
. Here, we'll use it to estimate public
attitudes toward abortion over time, for the groups defined by state
and
race3
. (Specifically, by their Cartesian product.)
Under the hood, dgmrp()
uses RStan for MCMC sampling, and arguments can be
passed to RStan's
stan()
via the ...
argument of dgmrp()
. This is almost always desirable. Here, we
specify the number of sampler iterations, chains, and cores.
dgmrp_out_abortion <- dgmrp(dgirt_in_abortion, iter = 1500, chains = 4, cores = 4, seed = 42)
The model results are held in a dgmrp_fit
object. Methods from RStan like
extract()
are available if needed because dgmrp_fit
is a subclass of
stanfit
. But dgo provides its own methods for typical post-estimation tasks.
For a high-level summary of the result, use summary()
.
summary(dgmrp_out_abortion)
To apply scalar functions to posterior samples, use
summarize()
. The
default output gives summary statistics for the model's theta_bar
parameters,
which represent group means. These are indexed by time (year
) and group, where
groups are again defined by local geographic area (state
) and any other
respondent characteristics (race3
).
head(summarize(dgmrp_out_abortion))
Alternatively,
summarize()
can apply
arbitrary functions to posterior samples for whatever parameter is given by its
pars
argument.
summarize(dgmrp_out_abortion, pars = "xi", funs = "var")
To access posterior samples in tabular form use
as.data.frame()
. By
default, this method returns post-warmup samples for the theta_bar
parameters,
but like other methods takes a pars
argument.
head(as.data.frame(dgmrp_out_abortion))
To poststratify the results use poststratify()
. Here, we use the group
population proportions bundled as annual_state_race_targets
to reweight and
aggregate estimates to strata defined by state-years.
poststratify(dgmrp_out_abortion, annual_state_race_targets, strata_names = c("state", "year"), aggregated_names = "race3")
To plot the results use dgirt_plot()
. This method plots summaries of posterior
samples by time period. By default, it shows a 95% credible interval around
posterior medians for the theta_bar
parameters, for each local geographic
area. Here we omit the CIs.
dgirt_plot(dgmrp_out_abortion, y_min = NULL, y_max = NULL)
dgirt_plot()
can also plot the data.frame
output from poststratify()
,
given arguments that identify the relevant variables. Below, we aggregate over
the demographic grouping variable race3
, resulting in a data.frame
of
estimates by state-year.
ps <- poststratify(dgmrp_out_abortion, annual_state_race_targets, strata_names = c("state", "year"), aggregated_names = "race3") head(ps) dgirt_plot(ps, group_names = NULL, time_name = "year", geo_name = "state")
In the call to dgirt_plot()
, we passed the names of the state
and year
variables. The group_names
argument was then NULL
, because there were no
grouping variables left after we aggregated over race3
.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.