knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) # TO OUTPUT TO PDF, change to output: rmarkdown::pdf_document and open file (top left) Save As.
knitr::opts_chunk$set(collapse = T, comment = "#>") options(tibble.print_min = 4L, tibble.print_max = 4L) library(tidyverse) library(blscrapeR) library(reshape2) set.seed(1014) devtools::load_all('C:/Users/dsovich/Dropbox/Programming/Packages/blsr')
The Bureau of Labor Statistics (BLS) provides publicly available data on different aspects of the U.S. economy.
The blsr package makes getting this data fast and easy:
It provides simple functions for remotely retrieving data from the most commonly used BLS databases.
By constraining your options to a few key databases, it cleans the messy data that is extracted from the BLS's API.
It provides complete cross-functionality with the package blscrapeR so that users can customize their usage if need arises.
This document introduces you to blsr's basic set of download tools. Because blsr is a wrapper for the blscrapeR package (which is a wrapper for the BLS API), users should read the documentation for blscrapeR before proceeding.
You will want to acquire a free API key from the BLS. The API key increases your daily query limits and expands your data access. For more information on the API key, please see the documentation for blscrapeR.
The BLS provides data on several different aspects of the U.S. economy. A full list of their databases can be found here.
BLS data items are identified by objects known as series ID strings. In general, a series ID string is a combination of identifiers for:
The database that contains the data item, e.g. CE for the Current Employment Statistics database.
The industry and geographic area for which to summarize the data item.
The type of data item, e.g. the unemployment rate.
Unfortunately, the precise format of series ID strings varies by the underlying database (see here for a list). The blsr package resolves this problem with datbase-specific functions and metadata tools.
The Current Employment Statistics (CES) survey produces monthly estimates of employment, hours, and earnings.
CES estimates are available from blsr at the national and state levels across all industries. Estimates span from January 1939 to present day. Seasonally and non-seasonally adjusted are available.
National estimates for the CES belong to the CE database. To view the series ID string format for this database, examine the series_id_map
data.frame within the loaded ces_national_codes_list
list:
knitr::kable(ces_national_codes_list$series_id_map, caption = "Series ID string for CES national database")
The ces_national_codes_list
list provides possible values for each field that goes into the series ID. For example, the data_types
data.frame documents the available data items:
knitr::kable(ces_national_codes_list$data_types, caption = "Data items available for CES national database")
Note that wage and payroll data can only be run at a seasonally unadjusted level as well.
The ces_download
function is used to download and clean the data for the series IDs. Feed this function your chosen values for each field of the series ID string:
# Download data for total nonfarm payroll ces_df = ces_download(bls_key = Sys.getenv("BLS_KEY"), start_year = 2010, end_year = 2015, adjustment = "S", industries = "00000000", data_types = "01") # Inspect output head(ces_df %>% data.frame())
The download functions are capable of handling multiple parameter choices for the same fields. By default, it performs a Cartesian of the choices:
# Download data for multiple series ces_df = ces_download(bls_key = Sys.getenv("BLS_KEY"), start_year = 2013, end_year = 2015, adjustment = "S", industries = (ces_national_codes_list$indu_codes %>% filter(level == 2, private_sector_flag == 1))$industry_code, data_types = c("01", "03", "11"))
In some cases you may not want to Cartesian the argument values. In this case, just use the base bls_download
function and apply the clean_ces_national
function:
# Series IDs to download series_ids = c("CES0000000001", "CEU0000000001") # Download the data ces_df = bls_download(seriesid = series_ids, start_year = 2010, end_year = 2015, bls_key = Sys.getenv("BLS_KEY")) # Clean the data ces_df = ces_df %>% clean_ces_national()
Note that the BLS API does not allow you to download data on more than 50 series IDs in a single request. The CES download function will prohibit you from doing so.
State estimates for the CES belong to the SM database. The series ID string format is within the ces_state_codes_list
object:
knitr::kable(ces_state_codes_list$series_id_map, caption = "Series ID string for CES state database")
The same ces_download
function can be used for state data by populating the state
argument:
# Download data for total nonfarm payroll ces_df = ces_download(bls_key = Sys.getenv("BLS_KEY"), start_year = 2010, end_year = 2015, adjustment = "U", industries = "05000000", data_types = c("01", "03", "11"), states = "1900000") # Inspect output head(ces_df %>% data.frame())
Other functionality remains the same as national series. Note that certain data series are available at the national level may not available at the state level from the BLS.
The Job Openings and Labor Turnover Survey (JOLTS) produces monthly estimates of job openings, hires, quits, layoffs and discharges, and other separations.
JOLTS estimates are available from blsr at the national and regional level across all industries. The data is available via API from December 2000 to present day.
Use the jolts_codes_list
object to view the series ID string format for the JOLTS database:
knitr::kable(jolts_codes_list$series_id_map, caption = "Series ID string for JOLTS database")
For example, to retrive the monthly number of seasonally adjusted non-farm hires, we need to use the series ID JTS00000000HIL
.
Use the jolts_download
function to download national estimates for JOLTS:
# Download the data jolts_df = jolts_download(bls_key = Sys.getenv("BLS_KEY"), start_year = 2010, end_year = 2015, adjustment = "S", industries = "000000", data_types = "HI", data_levels = "L") # View the data head(jolts_df)
JOLTS produces data for the four geographic regions of the United States. Only the total nonfarm estimates (industry 000000) are available at the regional level.
To download regional data, populate the regions field of the jolts_download
function:
# Download the data jolts_df = jolts_download(bls_key = Sys.getenv("BLS_KEY"), start_year = 2010, end_year = 2015, adjustment = "S", industries = "000000", data_types = "HI", data_levels = "L", regions = c("MW", "NE", "SO", "WE"))
The Local Area Unemployment Statistics (LAUS) program produces monthly estimates of employment, unemployment, and labor force size.
LAUS estimates are available at the state level from blsr. The data is available from January 1976 to present day.
Use the laus_codes_list
object to view the series ID string format for LAUS:
knitr::kable(laus_codes_list$series_id_map, caption = "Series ID string for LAUS database")
For example, we can retrieve the seasonally adjusted unemployment rate for Alabama using the series ID LASST010000000000003
.
Use the laus_download
function to download state estimates for the LAUS:
# Download the data laus_df = laus_download(bls_key = Sys.getenv("BLS_KEY"), start_year = 2010, end_year = 2015, adjustment = "S", states = c("ST0100000000000", "ST0200000000000"), data_types = c("03")) # View the data head(laus_df)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.