library(knitr) opts_chunk$set( fig.align = "center", fig.width = 4, fig.height = 4, warning = FALSE, crop = TRUE)
library(tidyr) library(purrr) library(dplyr) library(coalitions)
In this vignette we demonstrate the offered functionality for pooling multiple surveys.
We offer convenience functions that allow for easily performing the pooling approach:
get_surveys
: Wrapper that uses scrape_wahlrecht
and collapse_parties
to download the most current survey results from https://www.wahlrecht.de/
and stores the prepared data inside a nested tibble
(see tidyr::nest
)
pool_surveys
: Pool all newest surveys (obtained with get_surveys
), using
a specified time window that defaults to the last 14 days, and assuming a certain
correlation between the number of party-specific votes of any two polling agencies,
which defaults to 0.5. Per polling agency only the newest survey in the time windows
is considered.
The three arguments last_date
, period
and period_extended
define the time window
used in pool_surveys
. Using these arguments one can choose between two types of pooling:
If period_extended
equals NA
: Surveys in the time window from last_date
to
last_date - period
will be considered for each polling agency.
If period_extended
does not equal NA
: Same as 1. Additionally however, surveys in the time
window from last_date - period
to last_date - period_extended
will also be considered
for each polling agency, but only after downweighting them by halving their
true sample size.
The latter option can be especially useful if opinion polls for a specific election
are only published very rarely. As default, pool_surveys
uses a time window starting
from the current date and going 14 days back, not making use of period_extended
.
# Scrape current surveys from the major polling agencies in Germany # surveys <- get_surveys() # As the web connection is sometimes a bit unstable we here use the sample data set of pre-scraped surveys surveys <- coalitions::surveys_sample surveys
# Obtain the pooled sample for today, based on the last 14 days last_date <- surveys %>% tidyr::unnest() %>% pull(date) %>% max() pool <- pool_surveys(surveys, last_date = last_date) pool %>% select(-start, -end)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.