read_abs: Download, extract, and tidy ABS time series spreadsheets

read_abs    R Documentation

Description

[Stable]

read_abs() downloads ABS time series spreadsheets, extracts the data from them, and tidies it. The result is a single data frame (tibble) containing the tidied data.

Usage

read_abs(
  cat_no = NULL,
  tables = "all",
  series_id = NULL,
  path = Sys.getenv("R_READABS_PATH", unset = tempdir()),
  metadata = TRUE,
  show_progress_bars = TRUE,
  retain_files = TRUE,
  check_local = TRUE,
  release_date = "latest"
)

read_abs_series(series_id, ...)

Arguments

cat_no

ABS catalogue number, as a string, including the extension. For example, "6202.0".

tables

numeric. Time series tables in cat_no to download and extract. Default is "all", which will read all time series in cat_no. Specify tables to download and import specific table(s), e.g. tables = 1 or tables = c(1, 5).

series_id

(optional) character. Supply an ABS unique time series identifier (such as "A2325807L") to get only that series. This is an alternative to specifying cat_no.

path

Local directory in which downloaded ABS time series spreadsheets should be stored. By default, path takes the value set in the environment variable "R_READABS_PATH". If this variable is not set, any files downloaded by read_abs() will be stored in a temporary directory (tempdir()). See Details below for more information.

metadata

logical. If TRUE (the default), a tidy data frame including ABS metadata (series name, table name, etc.) is included in the output. If FALSE, metadata is dropped.

show_progress_bars

TRUE by default. If set to FALSE, progress bars will not be shown when ABS spreadsheets are downloading.

retain_files

When TRUE (the default), the spreadsheets downloaded from the ABS website will be saved in the directory specified with path. If set to FALSE, the files will be stored in a temporary directory.

check_local

If TRUE (the default), local fst files are used if they are present.

release_date

Either "latest" or a string coercible to a date, such as "2022-02-01". If "latest", the latest release of the requested data will be returned. If a date (e.g. "2022-02-01"), read_abs() will attempt to download the data from that month's release. See Details.

...

Arguments to read_abs_series() are passed to read_abs().

Details

read_abs_series() is a wrapper around read_abs(), with series_id as the first argument.
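The wrapper pattern described above can be sketched as follows. This is an illustration only, not the package's source: read_abs_stub() and read_abs_series_sketch() are hypothetical names, and the stub merely reports the arguments it receives so that the example runs without downloading anything.

```r
# Stand-in for read_abs() so the sketch runs offline; it only reports
# which arguments it received and downloads nothing.
read_abs_stub <- function(cat_no = NULL, tables = "all", series_id = NULL, ...) {
  list(cat_no = cat_no, tables = tables, series_id = series_id)
}

# Hypothetical wrapper: series_id comes first, everything else is forwarded
# unchanged through `...`.
read_abs_series_sketch <- function(series_id, ...) {
  read_abs_stub(series_id = series_id, ...)
}

# The IDs can now be supplied positionally, without naming the argument
args_seen <- read_abs_series_sketch(c("A2325806K", "A2325807L"))
```

Putting series_id first means the IDs can be passed without naming the argument, while any other read_abs() arguments can still be supplied by name.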

read_abs() downloads spreadsheet(s) from the ABS containing time series data. These files need to be saved somewhere on your disk. This local directory can be controlled using the path argument to read_abs(). If the path argument is not set, read_abs() will store the files in a directory set in the "R_READABS_PATH" environment variable. If this variable isn't set, files will be saved in a temporary directory.

To check the value of the "R_READABS_PATH" variable, run Sys.getenv("R_READABS_PATH"). You can set the value of this variable for a single session using Sys.setenv(R_READABS_PATH = <path>). If you would like to change this variable for all future R sessions, edit your .Renviron file and add the line R_READABS_PATH = <path>. The easiest way to edit this file is using usethis::edit_r_environ().
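The lookup order for the storage directory can be reproduced with base R alone. The sketch below sets R_READABS_PATH for the current session and resolves the directory with the same Sys.getenv() expression that appears as the default for path in Usage. The directory name readabs_data is an assumption made for illustration.

```r
# Save the current value so it can be restored afterwards
old_path <- Sys.getenv("R_READABS_PATH", unset = NA)

# Hypothetical directory, used here only for illustration
data_dir <- file.path(tempdir(), "readabs_data")
dir.create(data_dir, showWarnings = FALSE)

# Set the variable for this session only
Sys.setenv(R_READABS_PATH = data_dir)

# The same expression used as the default for `path`
resolved <- Sys.getenv("R_READABS_PATH", unset = tempdir())

# With the variable unset, the expression falls back to tempdir()
Sys.unsetenv("R_READABS_PATH")
fallback <- Sys.getenv("R_READABS_PATH", unset = tempdir())

# Restore the original value, if there was one
if (!is.na(old_path)) Sys.setenv(R_READABS_PATH = old_path)
```

Here resolved holds the directory named in the environment variable, while fallback holds the session's temporary directory, which R removes when the session ends.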

The release_date argument allows you to download table(s) other than the latest release. This is useful for examining revisions to time series, or for obtaining the version of series that were available on a given date. Note that you cannot supply more than one date to release_date. Note also that any dates prior to mid-2019 (the exact date varies by series) will fail.
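The constraints on release_date (either "latest" or a single value coercible to a date) can be illustrated with a small check. check_release_date() is a hypothetical helper written for this sketch. It is not part of readabs, and it does not reflect how the package validates its input internally.

```r
# Hypothetical helper, not part of readabs: checks that a release_date
# value is either "latest" or a single string coercible to a Date.
check_release_date <- function(release_date) {
  if (length(release_date) != 1) {
    stop("`release_date` must have length 1")
  }
  if (identical(release_date, "latest")) {
    return("latest")
  }
  # as.Date() raises an error for strings it cannot parse; map that to NA
  parsed <- tryCatch(as.Date(release_date), error = function(e) as.Date(NA))
  if (is.na(parsed)) {
    stop("`release_date` must be \"latest\" or coercible to a date")
  }
  parsed
}

ok_latest <- check_release_date("latest")
ok_date <- check_release_date("2022-02-01")

# Supplying more than one date is rejected; capture the error message
too_many <- tryCatch(
  check_release_date(c("2022-02-01", "2022-03-01")),
  error = function(e) conditionMessage(e)
)
```

A check like this only confirms that the value is well formed. Whether a release exists for the requested month can only be determined by attempting the download.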

Value

A data frame (tibble) containing the tidied data from the ABS time series table(s).

Examples


# Download and tidy all time series spreadsheets
# from the Wage Price Index (6345.0)
## Not run: 
wpi <- read_abs("6345.0")

## End(Not run)

# Download table 1 from the Wage Price Index
## Not run: 
wpi_t1 <- read_abs("6345.0", tables = "1")

## End(Not run)

# Or table 1 as in the Sep 2019 release of the WPI:
## Not run: 
wpi_t1_sep2019 <- read_abs("6345.0", tables = "1", release_date = "2019-09-01")

## End(Not run)

# Or tables 1 and 2a from the WPI
## Not run: 
wpi_t1_t2a <- read_abs("6345.0", tables = c("1", "2a"))

## End(Not run)


# Get two specific time series, based on their time series IDs
## Not run: 
cpi <- read_abs(series_id = c("A2325806K", "A2325807L"))

## End(Not run)

# Get the same two series using the `read_abs_series()` wrapper function
## Not run: 
cpi <- read_abs_series(c("A2325806K", "A2325807L"))

## End(Not run)

MattCowgill/readabs documentation built on Feb. 2, 2024, 12:03 a.m.