extract_SIC91_data: extract SIC 91 Sales Data from ONS working file spreadsheet

View source: R/extract_SIC91_data.R

Description

The data underlying the DCMS Sectors Economic Estimates are typically provided to DCMS as a spreadsheet by the Office for National Statistics. This function extracts the SIC 91 Sales Data from that spreadsheet and saves it in .Rds format. These data are used in place of the usual GVA values; the reason is explained in the methodology note that accompanies the statistical first release (https://www.gov.uk/government/publications/dcms-sectors-economic-estimates-methodology).

IT IS HIGHLY ADVISABLE TO ENSURE THAT THE DATA CREATED BY THIS FUNCTION ARE NOT STORED IN A FOLDER WHICH IS A GITHUB REPOSITORY, TO GUARD AGAINST ACCIDENTALLY COMMITTING OFFICIAL DATA TO GITHUB. TOOLS TO FURTHER HELP MITIGATE THIS RISK ARE AVAILABLE AT https://github.com/ukgovdatascience/dotfiles.

Usage

extract_SIC91_data(x, sheet_name = "SIC 91 Sales Data", col_names = c("SIC",
  "description", "year", "ABS", "blank", "code"), ...)

Arguments

x

Path to the input spreadsheet file, typically named something like "working_file_dcms_VXX.xlsm".

sheet_name

The name of the worksheet in which the data are stored. Defaults to "SIC 91 Sales Data".

col_names

Character vector used to rename the columns of the imported spreadsheet. Defaults to c("SIC", "description", "year", "ABS", "blank", "code").

...

additional arguments to be passed to readxl::read_excel.

Details

The best way to understand what happens when you run this function is to look at the source code, which is available at https://github.com/ukgovdatascience/eesectors/blob/master/R/. The code is relatively transparent and well documented. In brief, the function does the following:

1. The function calls readxl::read_excel to load the appropriate sheet from the underlying spreadsheet.

2. Columns of interest are subset using x[, c('SIC', 'year', 'ABS')].

3. Empty rows (containing all NAs) are removed.

4. The data are saved out to an R serialisation object OFFICIAL_SIC91.Rds in the specified folder.
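The steps above can be sketched in base R as follows. This is an illustrative outline only, not the package's actual source: the toy data frame `raw` stands in for the result of readxl::read_excel, all values in it are invented, and the output is written to a temporary directory rather than a real output folder.

```r
# Step 1 (stand-in): in the real function this would come from something like
# readxl::read_excel(path = x, sheet = sheet_name, ...). The values below are
# invented purely for illustration.
raw <- data.frame(
  a = c("91.01", NA, "91.02"),
  b = c("example A", NA, "example B"),
  c = c(2015, NA, 2015),
  d = c(100, NA, 200),
  e = c(NA, NA, NA),
  f = c("x", NA, "y"),
  stringsAsFactors = FALSE
)

# Rename the imported columns using the default col_names from the Usage section.
col_names <- c("SIC", "description", "year", "ABS", "blank", "code")
names(raw) <- col_names

# Step 2: keep only the columns of interest.
sic91 <- raw[, c("SIC", "year", "ABS")]

# Step 3: drop rows in which every value is NA.
sic91 <- sic91[rowSums(!is.na(sic91)) > 0, ]

# Step 4: save as an R serialisation object. tempdir() is an assumption here;
# choose a folder that is not inside a git repository for real data.
out_file <- file.path(tempdir(), "OFFICIAL_SIC91.Rds")
saveRDS(sic91, out_file)
```

Writing to tempdir() keeps the sketch free of side effects; with official data the destination should be chosen with the warning in the Description in mind.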

Value

The function returns nothing, but saves the extracted dataset to file.path(output_path, 'OFFICIAL_SIC91.Rds'). This is an R data object, which retains the column types that would be lost if the data were converted to a flat format like CSV.
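The point about column types can be seen with a small round trip. The data frame below is invented for illustration and is not taken from the ONS spreadsheet; the sketch only assumes base R.

```r
# Toy data: SIC codes stored as character so that leading zeros are kept.
df <- data.frame(
  SIC = c("01", "91"),
  year = c(2015L, 2016L),
  ABS = c(1.5, 2.5),
  stringsAsFactors = FALSE
)

rds_file <- tempfile(fileext = ".Rds")
csv_file <- tempfile(fileext = ".csv")

saveRDS(df, rds_file)
write.csv(df, csv_file, row.names = FALSE)

from_rds <- readRDS(rds_file)   # column types come back exactly as saved
from_csv <- read.csv(csv_file)  # types are re-guessed from the text

class(from_rds$SIC)  # "character"
class(from_csv$SIC)  # "integer": "01" has been read back as 1
```

The CSV route can be made to work by passing colClasses to read.csv, but the .Rds route needs no such bookkeeping.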

Examples

## Not run: 
library(eesectors)
extract_SIC91_data(
  x = 'OFFICIAL_working_file_dcms_V13.xlsm',
  sheet_name = 'SIC 91 Sales Data'
)

## End(Not run)

ukgovdatascience/eesectors documentation built on Sept. 11, 2020, 12:19 p.m.