extract_tourism_data: extract tourism data from the ONS working file spreadsheet

View source: R/extract_tourism_data.R

Description

The data which underlie the Economic Sectors for DCMS sectors data are typically provided to DCMS as a spreadsheet from the Office for National Statistics. This function extracts the tourism data from that spreadsheet and saves them in .Rds format. The tourism data are extracted separately because the usual tourism values in the GVA dataset cannot be used.

IT IS HIGHLY ADVISABLE TO ENSURE THAT THE DATA CREATED BY THIS FUNCTION ARE NOT STORED IN A FOLDER WHICH IS A GITHUB REPOSITORY, TO REDUCE THE RISK OF ACCIDENTALLY COMMITTING OFFICIAL DATA TO GITHUB. TOOLS TO FURTHER HELP MITIGATE THIS RISK ARE AVAILABLE AT https://github.com/ukgovdatascience/dotfiles.
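
One way to act on this warning is to check the output folder before saving anything. The following is a minimal sketch in base R and is not part of eesectors: the helper name is_inside_git_repo is hypothetical, and it assumes that a folder belongs to a git repository when it, or any parent folder, contains an entry named .git.

```r
# Hypothetical helper (not part of eesectors): returns TRUE when `path`
# or any of its parent folders contains a `.git` entry.
is_inside_git_repo <- function(path) {
  current <- normalizePath(path, winslash = "/", mustWork = TRUE)
  repeat {
    # `.git` is usually a directory, but can be a file (e.g. in submodules)
    if (file.exists(file.path(current, ".git"))) {
      return(TRUE)
    }
    parent <- dirname(current)
    # dirname() of the filesystem root returns the root itself
    if (identical(parent, current)) {
      return(FALSE)
    }
    current <- parent
  }
}

# Demonstration on a throwaway folder made to look like a repository
repo <- file.path(tempdir(), "demo_repo")
dir.create(file.path(repo, ".git"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(repo, "data"), showWarnings = FALSE)
inside <- is_inside_git_repo(file.path(repo, "data"))
```

A check like this could be run on the intended output folder first, stopping with an error if it returns TRUE.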

Usage

extract_tourism_data(x, sheet_name = "Tourism", col_names = c("year", "GVA",
  "total", "perc", "overlap"), ...)

Arguments

x

Location of the input spreadsheet file. Named something like "working_file_dcms_VXX.xlsm".

sheet_name

The name of the worksheet in which the data are stored. Defaults to "Tourism".

col_names

Character vector used to rename the columns of the imported spreadsheet. Defaults to c("year", "GVA", "total", "perc", "overlap").

...

additional arguments to be passed to readxl::read_excel.

Details

The best way to understand what happens when you run this function is to look at the source code, which is available at https://github.com/ukgovdatascience/eesectors/blob/master/R/. The code is relatively transparent and well documented. A brief explanation of what the function does follows:

1. The function calls readxl::read_excel to load the appropriate page from the underlying spreadsheet.

2. The column names are replaced with the user-supplied vector in col_names. If the layout of the spreadsheet has not changed since the 2016 version, the default vector should continue to work in future years. If the layout has changed, this step is a likely cause of errors.

3. Empty rows (containing all NAs) are removed.

4. The data are saved out to an R serialisation object OFFICIAL_tourism.Rds in the specified folder.
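
The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual source: the names clean_tourism_sheet and sketch_extract_tourism are hypothetical, and output_path is assumed to be the folder the file is written to. The demonstration at the end runs steps 2 and 3 on an in-memory stand-in with made-up values, so no spreadsheet is needed.

```r
# Steps 2 and 3: rename the columns, then drop rows that are entirely NA.
# Hypothetical helper, not part of eesectors.
clean_tourism_sheet <- function(df,
                                col_names = c("year", "GVA", "total",
                                              "perc", "overlap")) {
  # A changed spreadsheet layout would typically fail here
  stopifnot(length(col_names) == ncol(df))
  names(df) <- col_names
  all_na <- rowSums(is.na(df)) == ncol(df)
  df[!all_na, , drop = FALSE]
}

# Steps 1 to 4 combined. `output_path` is an assumed argument name.
sketch_extract_tourism <- function(x, sheet_name = "Tourism",
                                   output_path = ".", ...) {
  raw <- readxl::read_excel(x, sheet = sheet_name, ...)  # step 1
  cleaned <- clean_tourism_sheet(as.data.frame(raw))     # steps 2 and 3
  saveRDS(cleaned, file.path(output_path, "OFFICIAL_tourism.Rds"))  # step 4
  invisible(NULL)
}

# Stand-in for an imported sheet: made-up values, one fully empty row
raw <- data.frame(
  a = c(2010, NA, 2011),
  b = c(10, NA, 12),
  c = c(100, NA, 110),
  d = c(0.10, NA, 0.11),
  e = c(1, NA, 2)
)
cleaned <- clean_tourism_sheet(raw)
```

Keeping the cleaning steps in their own function means they can be checked against a small data frame without access to the official spreadsheet.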

Value

The function returns nothing, but saves the extracted dataset to file.path(output_path, 'OFFICIAL_tourism.Rds'). This is an R data object, which retains the column types that would be lost if the data were converted to a flat format like CSV.
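
The point about column types can be illustrated with a small round trip in base R. This is a sketch on made-up values, not the official data. It assumes, purely for illustration, that year is stored as a factor; the real column types may differ.

```r
# Made-up stand-in for the extracted data; `year` as a factor is an assumption
tourism <- data.frame(
  year = factor(c("2010", "2011", "2012")),
  GVA = c(10.5, 11.2, 12.0)
)

rds_file <- tempfile(fileext = ".Rds")
csv_file <- tempfile(fileext = ".csv")

saveRDS(tourism, rds_file)
write.csv(tourism, csv_file, row.names = FALSE)

from_rds <- readRDS(rds_file)   # column types are restored as saved
from_csv <- read.csv(csv_file)  # column types are guessed again from text

class(from_rds$year)
class(from_csv$year)
```

The saved .Rds file is read back with readRDS(), and the year column is still a factor. After the CSV round trip it is no longer one.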

Examples

## Not run: 
library(eesectors)
extract_tourism_data(
  x = 'OFFICIAL_working_file_dcms_V13.xlsm',
  sheet_name = 'Tourism'
)

## End(Not run)

ukgovdatascience/eesectors documentation built on Sept. 11, 2020, 12:19 p.m.