parlygroups

parlygroups offers a suite of functions for web-scraping data contained within the House of Commons' "Register of All-Party Parliamentary Groups" (APPGs).

Overview

According to Parliament, an "All-Party Parliamentary Group (APPG) consists of Members of both Houses who join together to pursue a particular topic or interest". The register of APPGs "contains the financial and other information about Groups which the House has decided should be published". The register is published both as a web page and as a PDF document. Links to all registers are available here.

Usage

The package provides one primary function to web-scrape and download the data contained in each APPG web page: download_appg(). The downloaded data is stored in a cache within a hidden environment that is retrieved locally by the secondary functions prefixed appg_*.
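The idea of a cache held in a hidden environment can be illustrated in base R. This is only a sketch of the general pattern, not the package's actual internals; `.cache`, `cache_set()` and `cache_get()` are hypothetical names.

```r
# Illustrative only: a package-style cache held in a private environment.
# `.cache`, `cache_set()` and `cache_get()` are hypothetical names and are
# not taken from parlygroups.
.cache <- new.env(parent = emptyenv())

# Store a value under a key in the private environment.
cache_set <- function(key, value) {
  assign(key, value, envir = .cache)
  invisible(value)
}

# Retrieve a value, failing with a clear message if nothing was stored.
cache_get <- function(key) {
  if (!exists(key, envir = .cache, inherits = FALSE)) {
    stop("Nothing cached under '", key, "'; download the data first.",
         call. = FALSE)
  }
  get(key, envir = .cache, inherits = FALSE)
}

# Stand-in data; the real scraped structure is not documented here.
cache_set("register", list(date = "2019-11-05", groups = c("A", "B")))
cached <- cache_get("register")
```

Because an environment like this lives only in memory, its contents disappear when the R session ends.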

Users should call download_appg() first, supplying the date of the register of interest as a character string in ISO 8601 format ("YYYY-MM-DD") to the register_date argument, before calling any secondary function. Because the downloaded data is cached only temporarily, the cache is lost if your R session is restarted and you will need to re-download.

parlygroups::download_appg(register_date = "2019-11-05")
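Since register_date must be an ISO 8601 string, it may be worth checking a date before passing it on. The helper below is a hypothetical base R sketch, not part of parlygroups, and it makes no claim about how the package itself validates its input.

```r
# Hypothetical helper (not part of parlygroups): TRUE only for a single
# character string that is a real calendar date written as "YYYY-MM-DD".
is_iso_date <- function(x) {
  is.character(x) &&
    length(x) == 1 &&
    # Enforce the exact shape, including zero-padded month and day.
    grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x) &&
    # Reject impossible dates such as 30 February.
    !is.na(as.Date(x, format = "%Y-%m-%d"))
}

ok <- is_iso_date("2019-11-05")
wrong_order <- is_iso_date("05/11/2019")
impossible <- is_iso_date("2019-02-30")
```

A date that passes this check is well formed, but that does not guarantee a register was published on that day.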

In addition to the primary register_date argument of download_appg() there are two further arguments available: pause and save.

pause is a non-negative number giving the length of time, in seconds, to pause between accessing and scraping APPG web pages. It is set to 1 second by default. A longer pause helps minimise the burden on the Parliamentary website. See the ONS web-scraping policy for further best-practice guidance on web-scraping.
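The effect of a pause between requests can be sketched with Sys.sleep(). The function below is a hypothetical illustration of the general pattern rather than the package's code; `visit_politely()` and its `fetch` argument are assumed names, and the default `fetch` is a stub that performs no network access.

```r
# Hypothetical sketch of pausing between page visits.
# `fetch` defaults to a stub so that the example runs offline.
visit_politely <- function(urls, pause = 1,
                           fetch = function(url) nchar(url)) {
  # Mirror the documented constraint: a single non-negative number.
  stopifnot(is.numeric(pause), length(pause) == 1,
            !is.na(pause), pause >= 0)
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    results[[i]] <- fetch(urls[[i]])
    # Wait between requests, but not after the final one.
    if (i < length(urls)) Sys.sleep(pause)
  }
  results
}

# Placeholder URLs, not real register pages.
pages <- c("https://example.org/a", "https://example.org/b")
out <- visit_politely(pages, pause = 0.1)
```

With several hundred pages, even a 1 second pause adds minutes to a download, which is the trade-off against load on the server.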

save is a boolean which, if TRUE, saves the cached data as a series of .csv/.RDS files in your current working directory. It is intended as a quality-of-life argument that lets users avoid re-downloading the same APPG register in each new session. The .csv files created are the base tibbles generated by the five secondary appg_* functions, and the sole .RDS file contains the raw scraped data as a list. By default save is FALSE.

parlygroups::download_appg(register_date = "2019-11-05", pause = 0, save = TRUE)
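Files written this way can be read back in a later session with base R. The sketch below round-trips stand-in data through a temporary directory. The file names and data structures are hypothetical, because the names the package actually writes are not stated here.

```r
# Stand-ins for the scraped data; structure and file names are hypothetical.
raw <- list(register_date = "2019-11-05",
            pages = c("page one", "page two"))
groups <- data.frame(group = c("Group A", "Group B"),
                     category = c("subject", "country"),
                     stringsAsFactors = FALSE)

# Use a temporary directory rather than the working directory.
out_dir <- tempfile("appg_")
dir.create(out_dir)
rds_path <- file.path(out_dir, "raw.RDS")
csv_path <- file.path(out_dir, "groups.csv")

# Write the raw list as .RDS and the table as .csv.
saveRDS(raw, rds_path)
write.csv(groups, csv_path, row.names = FALSE)

# In a later session, read the files back instead of re-downloading.
raw_again <- readRDS(rds_path)
groups_again <- read.csv(csv_path, stringsAsFactors = FALSE)
```

The .RDS round trip preserves the R list exactly, whereas a .csv round trip re-infers column types on reading.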

The secondary functions are all prefixed appg_* and return tibbles based on the five main types of entry in the Register of All-Party Parliamentary Groups: group details, members, finance received, benefits in kind received, and details on the latest Annual General Meeting. By default all arguments within the functions are left as NA (i.e. blank), meaning that the complete set of data will be retrieved. If values are supplied to the function arguments, a filtered set of data will be retrieved. See the package help documentation for each function for a full list of available arguments.
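The convention that NA means "no filter" can be sketched in base R as follows. `filter_or_all()`, the `category` argument and the column names are hypothetical and chosen only for illustration; consult the package help for the real argument names.

```r
# Hypothetical helper showing an NA default that means "return everything".
filter_or_all <- function(data, category = NA) {
  # A single NA is treated as "no filter requested".
  if (length(category) == 1 && is.na(category)) {
    return(data)
  }
  # Otherwise keep only the rows matching the supplied value(s).
  data[data$category %in% category, , drop = FALSE]
}

# Stand-in table; the real column names may differ.
groups <- data.frame(group = c("Group A", "Group B", "Group C"),
                     category = c("subject", "country", "subject"),
                     stringsAsFactors = FALSE)

all_groups <- filter_or_all(groups)
country_groups <- filter_or_all(groups, category = "country")
```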

parlygroups::appg_groups()

Retrieves basic details on the names of APPGs, their purpose and category type (whether they are country, subject, or club focused). Each row is one APPG.

parlygroups::appg_officers()

Retrieves details on the names of MPs and Lords who are officers of APPGs, along with their role and party affiliation. Each row is one APPG officer.

parlygroups::appg_financial()

Retrieves details on the source of financial funding, the value of funding, and the date it was received/registered. Each row is one APPG financial funding record.

parlygroups::appg_benefits()

Retrieves details on the source of benefits in kind received by an APPG, the value range of the benefit (split into lower and higher values), and the date the benefit was received/registered. Each row is one APPG benefit in kind record.

parlygroups::appg_agm()

Retrieves details on the latest Annual General Meeting (AGM) held by the APPG, whether a statement of income and expenditure was issued, the date of the latest AGM, and the reporting year. Each row is one APPG AGM record.

Example

Testing

The package does not yet have unit tests, but the functions appear to work as intended. You should satisfy yourself that the functions behave in the way that you expect if you wish to use this package for research purposes.

Installation

Install from GitHub using devtools.

install.packages("devtools")
devtools::install_github("dempseynoel/parlygroups")


dempseynoel/parlygroups documentation built on Sept. 9, 2020, 8:07 a.m.