Build
Status Build
status codecov.io GitHub
tag

# Note to compile this file to README.mb, run the following:
# rmarkdown::render('README.Rmd',output_format = 'md_document')

knitr::opts_chunk$set( echo = TRUE, warning = FALSE, message = FALSE, error =
FALSE )

eesectors

Functions for producing the Economic Estimates for DCMS Sectors

First Statistical Release

This is a prototype and subject to constant development

This package provides functions used in the creation of a Reproducible Analytical Pipeline (RAP) for the Economic Estimates for DCMS sectors publication.

See the eesectorsmarkdown repository for an example of implementing these functions in the context of a Statistical First Release (SFR).

Installation

The package can then be installed using devtools::install_github('DCMSstats/eesectors'). Some users may not be able to use the devtools::install_github() commands as a result of network security settings. If this is the case, eesectors can be installed by downloading the zip of the repository and installing the package locally using devtools::install_local(<path to zip file>).

Quick start

This package provides functions to recreate Chapter three -- Gross Value Added (GVA) of the Economic estimates of DCMS Sectors.

Extracting data from underlying spreadsheets

The data are provided to DCMS as spreadsheets provided by the Office for National Statistics (ONS). Hence, the first set of functions in the package are designed to extract the data from these spreadsheets, and combine the data into a single dataset, ready to be checked, and converted into tables and figures.

There are four extract_ functions:


Note: that with the exception of extract_DCMS_sectors, the data extracted by these functions is potentially disclosive, and should therefore be handled with care and considered to be OFFICIAL-SENSITIVE. Steps must be taken to prevent the accidental disclosure of these data.

These should include (but not be limited to):


The extract functions will return a data.frame, and can be called as follows (see individual function documentation for more information about each of the arguments).

# path to spreadsheet containing the underlying data 

input <- 'working_file_YYYY.xlsm' 

# eesectors has a built in example spreadsheet
input <- example_working_file("example_working_file.xlsx")

extract_ABS_data(input) 

The various datasets used in the GVA chapter can be combined with the combine_GVA() function, which will return a data.frame of the combined data.

combine_GVA(
  ABS = extract_ABS_data(input),
  GVA = extract_GVA_data(input),
  SIC91 = extract_SIC91_data(input),
  tourism = extract_tourism_data(input)
)

Automated checking

The GVA chapter is built around the year_sector_data class. To create a year_sector_data object, a data.frame must be passed to it which contains all the data required to produce the tables and charts in Chapter three.

An example of how this dataset will need to look is bundled with the package: GVA_by_sector_2016. These data were extracted directly from the 2016 SFR which is in the public domain, and provide a test case for evaluating the data.

library(eesectors) 
GVA_by_sector_2016 

When an object is instantiated into the year_sector_data class, a number of checks are run on the data passed as the first argument. These are explained in more detail in the help ?year_sector_data()

gva <- year_sector_data(GVA_by_sector_2016) 

Any failed checks are raised as warnings, not errors, and so the user is able to continue. However it is also possible to log these warnings as github issues by setting log_issues=TRUE. This is a prototype feature that needs additional work to increase the usefulness of these issues, see below for details on environmental variables that are required for this functionality to work.

Creating tables and charts

Tables and charts for Chapter three can be reproduced simply by running the relevant functions:

year_sector_table(gva) 
figure3.1(gva) 

Note that figures produced remain ggplot2 objects, and can therefore be edited in the following way:

p <- figure3.2(gva)

p 

Titles, and other layers can then be added simply:

library(ggplot2)

p + ggtitle('Figure 3.2: Indexed growth in GVA (2010 =100)\n in DCMS sectors and UK: 2010-2015') 

Note that figures make use of the govstyle package. See the vignette for more information on how to use this package.



DCMSstats/eesectors documentation built on May 3, 2019, 2:43 p.m.