A collection of tools for working with survey data from the British Election Study (BES) for statistical researchers. This package is principally designed for use by researchers in the House of Commons Library but may be useful to anyone using R for routine data analysis.
This package provides functions for generating turnout estimates from BES survey data by demographic characteristics. It is designed to make it easier for researchers who may not frequently use BES survey data to generate consistent and reproducible turnout estimates from a range of BES datasets.
Detailed voting behaviour by demographic characteristic is not officially collected in UK elections. To understand how different demographics voted (or not), researchers are reliant on estimates produced from opinion polls and survey data. The BES is the most comprehensive and reliable survey series.
There are principally two types of survey produced by the BES after an election: a panel dataset (commonly known as the internet panel) and a cross-sectional dataset (commonly known as the face to face panel). The internet panel is produced much sooner after an election than the face to face panel, although the latter is more robust due to a series of voter registration validation checks.
Because survey respondents tend to over-report whether they voted, total turnout estimates from the internet panel are often much higher than the known turnout for an election. To compensate, a series of adjustments must be applied so that estimated total turnout equals the known turnout result. The clbes package aims to provide a consistent and reproducible way to generate turnout estimates for both the internet and face to face datasets.
The turnout_query function can be used with either the BES internet or face to face datasets. It has four compulsory arguments and five optional arguments which, when set in varying combinations, provide unweighted or weighted sample sizes and turnout estimates.
The first compulsory argument, data, takes the name of the BES dataset you have loaded into R (it is recommended that the SPSS versions of the BES files are used in conjunction with the haven package). The second, dgraphic, is the name of the demographic characteristic variable of interest in the BES dataset. The third, vote, takes the name of the variable which captures whether the survey respondent said they voted in the election of interest. The fourth, wt, takes the name of the weighting variable to be used.
The codebooks accompanying BES datasets should be read for descriptions of the variables available and, in particular, which weights should be used. There are different weights for the internet and face to face datasets.
If you are using a BES dataset where voter validation checks have not been carried out (typically the BES internet datasets) then the argument validated should be set to FALSE, and the argument result must take a floating point number between 0 and 1 giving the known turnout for the whole of Great Britain in the election of interest, e.g. 0.688 to indicate a 68.8% turnout rate.
When validated is FALSE and result is given a floating point number, the function filters your loaded BES dataset to keep only survey respondents who are over the age of 18 and say they are registered to vote. Additionally, any "Don't Know" survey responses are removed. The turnout estimates generated from the weighted sample are then adjusted by result so that the total estimated turnout equals result. This goes some way towards accounting for the over-reporting of turnout by survey respondents, although it assumes the rate of over-reporting is equal across survey respondents.
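The adjustment described above can be illustrated with a minimal base R sketch. This is not the package's own code: the data frame survey, its columns age_group, voted and wt, and the value given to result are all invented for illustration, and the uniform scaling factor is simply one way to realise the "equal rate of over-reporting" assumption.

```r
# Toy respondents: every value here is made up purely for illustration.
survey <- data.frame(
  age_group = c("18-34", "18-34", "18-34", "35+", "35+", "35+"),
  voted     = c(TRUE, FALSE, TRUE, TRUE, TRUE, FALSE),
  wt        = c(1.2, 0.8, 1.0, 0.9, 1.1, 1.0)
)

# Hypothetical known turnout for the election (a proportion between 0 and 1).
result <- 0.5

# Weighted self-reported turnout across the whole sample.
overall <- sum(survey$wt * survey$voted) / sum(survey$wt)

# Weighted self-reported turnout within each demographic group.
weighted_voters <- tapply(survey$wt * survey$voted, survey$age_group, sum)
weighted_n      <- tapply(survey$wt, survey$age_group, sum)
raw             <- weighted_voters / weighted_n

# Scale every group by the same factor, so that the weighted total
# matches the known result (assumes over-reporting is uniform).
scale_factor <- result / overall
adjusted     <- raw * scale_factor

print(round(adjusted, 3))
```

Because every group is multiplied by the same factor, the relative differences between groups are preserved while the weighted total is pulled down to the known turnout.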
If your dataset has been validated (usually the face to face datasets) and you are using the appropriate validation weights as specified in the BES codebooks, then validated and result should be left as their defaults.
If you want to show the sample size (unweighted or weighted) behind the generated turnout estimates, then the argument percent must be FALSE (this stops the function from calculating percentages), and sample set to either FALSE (its default) or TRUE.
When sample and percent are both FALSE, the weighted sample size is returned. If sample is TRUE and percent is FALSE, the unweighted sample size is returned. If percent is not set to FALSE, sample sizes will not be shown.
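To make the distinction between the two sample sizes concrete, here is a small base R sketch. It does not call the package; the data frame survey and its columns age_group and wt are hypothetical, with invented values. The unweighted size is a count of respondents per group, and the weighted size is the sum of their weights.

```r
# Toy respondents with survey weights: values are invented for illustration.
survey <- data.frame(
  age_group = c("18-34", "18-34", "18-34", "35+", "35+"),
  wt        = c(1.5, 0.5, 0.5, 1.0, 1.5)
)

# Unweighted sample size: number of respondents in each group.
unweighted_n <- tapply(survey$wt, survey$age_group, length)

# Weighted sample size: sum of the weights in each group.
weighted_n <- tapply(survey$wt, survey$age_group, sum)

print(unweighted_n)
print(weighted_n)
```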
To show sample sizes for unvalidated data (validated is FALSE), result should be left at its default, 0. This stops all sample sizes being multiplied by result.
By default, all turnout estimates and sample sizes are returned as tibbles in the console. However, if write is TRUE then results are written to the working directory as a CSV named "turnout_dgraphic.csv".
The turnout_common function is a simpler version of turnout_query, although it is written to work specifically with the BES 2017 face to face survey only. Its purpose is to give researchers turnout estimates for commonly queried demographics from the latest BES face to face survey. There is no need to supply a BES dataset to the function as this is loaded automatically whenever the function is called.
The argument dgraphic is much the same as in turnout_query, although it must take one of five strings between quotation marks: age, gender, ethnicity, religion or education. Each string relates to a demographic characteristic within the BES face to face survey, and the function automatically recodes and condenses variables where appropriate to create larger sample sizes per characteristic (e.g. ethnicity recodes the y11 variable from its 18 possible values to four).
The sample, percent and write arguments behave in the same way as in turnout_query. The only difference is that when write is TRUE a CSV named "BES_2017_FTF_turnout_dgraphic.csv" is written.
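As a rough usage sketch, calls to turnout_common might look like the following. The argument names and accepted values are taken from the description above, but the exact signature and defaults have not been checked against the package source, so treat this as an assumption. The calls are guarded so the snippet does nothing when clbes is not installed.

```r
# Only attempt the calls if the clbes package is available.
has_clbes <- requireNamespace("clbes", quietly = TRUE)

if (has_clbes) {
  # Turnout estimates by age group (assumes the other arguments have defaults).
  age_turnout <- clbes::turnout_common(dgraphic = "age")

  # Unweighted sample sizes for the same groups: sample = TRUE, percent = FALSE.
  age_n <- clbes::turnout_common(dgraphic = "age", sample = TRUE, percent = FALSE)
}
```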
Install from GitHub using devtools:
install.packages("devtools")
devtools::install_github("yespmedleon/clbes")