This vignette demonstrates estimation of latent policy liberalism from
individuals' responses to five survey items, using the Bayesian group-level IRT
model implemented in dgirt().
knitr::opts_chunk$set(echo = TRUE, collapse = TRUE, cache = TRUE, comment = "#>")
library(dgo)
shape() prepares input data for use with the modeling functions dgirt() and
dgmrp(). Here we use the included opinion dataset.
dgirt_in_liberalism <- shape(opinion,
  item_names = c("abortion", "affirmative_action", "stemcell_research",
                 "gaymarriage_amendment", "partialbirth_abortion"),
  time_name = "year",
  geo_name = "state",
  group_names = "race3",
  geo_filter = c("CA", "GA", "LA", "MA"))
In this call to shape() we specified:
- the survey item response variables (item_names);
- the time variable (year), since dgo models are dynamic;
- the variables defining respondent groups (state and race3), because dgo
  models are group-level.

Notice that we named only one of these variables defining respondent groups
using the group_names argument. The geo_name argument always takes the
variable giving respondents' local geographic area; it will be modeled
differently.
Using the argument geo_filter, we subset the input data to the given values of
the geo_name variable. shape() also accepts an id_vars argument (not used in
this call) that names identifier variables to keep in the processed data. Other
unused variables are dropped.
Important: for identification, the dgirt() model assumes that the polarity of
item responses is coded consistently. This is already true for the opinion
data. For other data, it typically requires manual recoding.
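Such recoding might look like the following minimal sketch in base R. It assumes a hypothetical data frame with two binary items, one of which is coded in the opposite direction; the names survey, support_policy and oppose_policy are illustrative and are not part of dgo or the opinion data.

```r
# Hypothetical survey responses; not part of dgo or the opinion data.
survey <- data.frame(
  support_policy = c(1, 0, 1, NA),  # 1 = liberal response
  oppose_policy  = c(0, 1, 0, 1)    # 1 = conservative response (reversed)
)

# Flip the reversed item so that 1 always marks the liberal response.
# Missing values in any item would stay missing under this transformation.
survey$oppose_policy <- 1 - survey$oppose_policy

survey
```

After this step both items point in the same direction, which is the condition the model relies on.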
summary() gives a high-level description of the result.
summary(dgirt_in_liberalism)
get_n() and get_item_n() give response counts.
get_n(dgirt_in_liberalism, by = "state")
get_item_n(dgirt_in_liberalism, by = "year")
dgirt() estimates a latent variable based on responses to multiple survey
questions. Here, we'll use it to estimate latent policy liberalism over time,
for the groups defined by state and race3. (Specifically, by their Cartesian
product.)
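To make the Cartesian product concrete, the sketch below enumerates the groups in base R. The state codes are those passed to geo_filter; the race3 labels are an assumption made for illustration and may differ from the actual coding in the opinion data.

```r
# State codes come from the geo_filter argument used earlier.
states <- c("CA", "GA", "LA", "MA")
# Assumed labels for race3, for illustration only.
race3 <- c("white", "black", "other")

# One row per state-by-race3 combination.
groups <- expand.grid(state = states, race3 = race3, stringsAsFactors = FALSE)

nrow(groups)  # 4 states x 3 race categories = 12 groups
head(groups)
```

Under these assumptions the model would estimate a separate latent mean for each of these groups in every time period.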
Under the hood, dgirt() uses RStan for MCMC sampling, and arguments can be
passed to RStan's stan() via the ... argument of dgirt(). This is almost always
desirable. Here, we specify the number of sampler iterations, chains, and
cores, and set a seed for reproducibility.
dgirt_out_liberalism <- dgirt(dgirt_in_liberalism, iter = 3000, chains = 4, cores = 4, seed = 42)
The model results are held in a dgirt_fit object. Methods from RStan like
extract() are available if needed because dgirt_fit is a subclass of
stanfit. But dgo provides its own methods for typical post-estimation tasks.
For a high-level summary of the result, use summary().
summary(dgirt_out_liberalism)
To apply scalar functions to posterior samples, use summarize(). The default
output gives summary statistics for the model's theta_bar parameters,
which represent group means. These are indexed by time (year) and group, where
groups are again defined by local geographic area (state) and any other
respondent characteristics (race3).
head(summarize(dgirt_out_liberalism))
Alternatively, summarize() can apply arbitrary functions to posterior samples
for whatever parameter is given by its pars argument.
summarize(dgirt_out_liberalism, pars = "xi", funs = "var")
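To illustrate what applying a scalar function to posterior samples means, here is a base R sketch on simulated draws. The draws data frame and its columns are hypothetical and do not reflect how dgo stores samples internally.

```r
set.seed(1)

# Simulated posterior draws for a parameter indexed by year (hypothetical).
draws <- data.frame(
  year  = rep(c(2006, 2007, 2008), each = 1000),
  value = rnorm(3000, mean = rep(c(-0.2, 0, 0.3), each = 1000), sd = 0.1)
)

# Apply a scalar function (here var) to the draws for each year,
# reducing 1000 samples per year to a single number.
post_var <- tapply(draws$value, draws$year, var)

post_var
```

Any function that maps a numeric vector to a single value (mean, sd, median and so on) could be substituted for var in the same way.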
To access posterior samples in tabular form use as.data.frame(). By default,
this method returns post-warmup samples for the theta_bar parameters, but like
other methods takes a pars argument.
head(as.data.frame(dgirt_out_liberalism))
To poststratify the results use poststratify(). Here, we use the group
population proportions bundled as annual_state_race_targets to reweight and
aggregate estimates to strata defined by state-years.
poststratify(dgirt_out_liberalism, annual_state_race_targets, strata_names = c("state", "year"), aggregated_names = "race3")
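The idea behind poststratification can be sketched in base R as a population-weighted average of group estimates within each stratum. All numbers and the column names theta and proportion below are made up for illustration; they are neither model output nor the contents of annual_state_race_targets.

```r
# Hypothetical group estimates for one state-year (not model output).
estimates <- data.frame(
  state = "CA", year = 2008,
  race3 = c("white", "black", "other"),
  theta = c(0.10, 0.50, 0.30)
)

# Hypothetical population proportions; they sum to 1 within the stratum.
targets <- data.frame(
  state = "CA", year = 2008,
  race3 = c("white", "black", "other"),
  proportion = c(0.6, 0.1, 0.3)
)

# Join estimates to targets, then sum the weighted estimates per stratum.
merged <- merge(estimates, targets, by = c("state", "year", "race3"))
merged$weighted <- merged$theta * merged$proportion
ps_sketch <- aggregate(weighted ~ state + year, data = merged, FUN = sum)

ps_sketch  # 0.06 + 0.05 + 0.09 = 0.20 for CA in 2008
```

Here race3 plays the role of the aggregated variable, while state and year define the strata that remain in the result.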
To plot the results use dgirt_plot(). This method plots summaries of posterior
samples by time period. By default, it shows a 95% credible interval around
posterior medians for the theta_bar parameters, for each local geographic
area. Here we omit the CIs by setting y_min and y_max to NULL.
dgirt_plot(dgirt_out_liberalism, y_min = NULL, y_max = NULL)
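For reference, a posterior median with a 95% interval for each time period can be computed from raw draws as in the base R sketch below. The simulated plot_draws data frame is hypothetical, and this is not a description of how dgirt_plot() is implemented.

```r
set.seed(2)

# Simulated posterior draws of a group mean in two years (hypothetical).
plot_draws <- data.frame(
  year  = rep(c(2006, 2008), each = 2000),
  value = c(rnorm(2000, 0.1, 0.05), rnorm(2000, 0.3, 0.05))
)

# Median plus 2.5% and 97.5% quantiles per year:
# a 95% interval around the posterior median.
ci <- aggregate(value ~ year, data = plot_draws,
                FUN = function(x) quantile(x, c(0.025, 0.5, 0.975)))

ci
```

The value column of ci is a matrix with one row per year and one column per quantile.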
dgirt_plot() can also plot the data.frame output from poststratify(),
given arguments that identify the relevant variables. Below, we aggregate over
the demographic grouping variable race3, resulting in a data.frame of
estimates by state-year.
ps <- poststratify(dgirt_out_liberalism, annual_state_race_targets,
  strata_names = c("state", "year"), aggregated_names = "race3")
head(ps)
dgirt_plot(ps, group_names = NULL, time_name = "year", geo_name = "state")
In the call to dgirt_plot(), we passed the names of the state and year
variables. The group_names argument was then NULL, because there were no
grouping variables left after we aggregated over race3.