load_data: Assemble a data frame of incident and cumulative cases,...

View source: R/load_data.R

load_data    R Documentation

Assemble a data frame of incident and cumulative cases, deaths or hospitalizations due to COVID-19 as they were available as of one or more past dates.

Description

Assemble a data frame of incident and cumulative cases, deaths or hospitalizations due to COVID-19 as they were available as of one or more past dates.

Usage

load_data(
  issues = NULL,
  as_of = NULL,
  location_code = NULL,
  spatial_resolution = "state",
  temporal_resolution = "weekly",
  measure = "deaths",
  geography = c("US", "global"),
  source = NULL,
  drop_last_date = FALSE
)

Arguments

issues

vector of issue dates (i.e. report dates) to use for querying data, either Date objects or strings in the format 'yyyy-mm-dd'. Data for the requested measures that were reported or updated exactly on the specified issue date(s) will be returned. If multiple issue dates are provided, the result includes the data for all such issue dates.

as_of

character vector of "as of" dates to use for querying truth data, in the format 'yyyy-mm-dd'. For each spatial unit and temporal reporting unit, the last available data with an issue date on or before the given as_of date are returned.

location_code

character vector of location codes. Defaults to NULL. For US locations, this should be a list of FIPS codes or 'US'. For ECDC locations, this should be a list of location name abbreviations.

spatial_resolution

character vector specifying spatial unit types to include: one or more of 'county', 'state' and/or 'national'. Defaults to 'state'. Note that 'county' is not available for hospitalization data. When source is "covidcast", this parameter must match location_code, if specified.

temporal_resolution

string specifying temporal resolution to include: one of 'daily' or 'weekly'. Defaults to 'weekly'.

measure

string specifying measure of disease dynamics: one of 'deaths', 'cases', 'hospitalizations', or 'flu hospitalizations'. The first three of these refer to measures of COVID-19 intensity. Defaults to 'deaths'.

geography

character specifying which data to read. Defaults to "US"; the other option is "global". Note that "global" is not available for hospitalization data or for the "covidcast" source.

source

string specifying data source. Currently supported sources are "jhu" or "covidcast" for the "deaths" or "cases" measures, and "healthdata" or "covidcast" for the "hospitalizations" and "flu hospitalizations" measures. Defaults to NULL, which means "healthdata" for hospitalization data and "jhu" for all other measures.

drop_last_date

boolean indicating whether to drop the last day of data for the influenza and COVID hospitalization signals. The last day of data from the HHS data source is unreliable, so it is recommended to set this to TRUE. However, the default is FALSE so that the function maintains fidelity to the authoritative data source. This argument is ignored if the measure is 'deaths' or 'cases'.
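
As a rough illustration of what drop_last_date = TRUE implies, the sketch below trims the final day from a toy daily data frame. The helper drop_last_day is hypothetical and not part of covidData, the counts are made up, and it assumes "the last day" means the latest date present in the returned data.

```r
# Toy daily hospitalization counts for one location; values are made up.
hosp <- data.frame(
  location = c("25", "25", "25"),
  date     = as.Date(c("2022-01-01", "2022-01-02", "2022-01-03")),
  inc      = c(40, 42, 7),
  stringsAsFactors = FALSE
)

# Hypothetical helper: remove every row that falls on the latest date
# present in the data (an assumption about what "last day" means).
drop_last_day <- function(df) {
  df[df$date < max(df$date), , drop = FALSE]
}

trimmed <- drop_last_day(hosp)
```

In this toy series the final count is implausibly low, which is the kind of incomplete reporting the argument is meant to guard against.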

Details

Data for a specified issue are only returned if the data were first available on that date, or were updated on that date. A warning is generated for any issue dates for which no data were available.

A query based on an as_of date returns the data for the most recent issue date that is on or before the specified as_of date. A warning is generated for any as_of dates for which no data were available; this only occurs if the as_of date is prior to any data release for the specified measure.
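
The difference between the two query styles described above can be sketched on a toy revision history using base R. This is not the package's implementation: the data frame layout, the issue_date column, and the helpers select_by_issue and select_as_of are assumptions made for illustration, and all counts are made up.

```r
# Toy revision history: one row per (location, date, issue_date) report.
reports <- data.frame(
  location   = c("25", "25", "25", "25"),
  date       = as.Date(c("2021-01-02", "2021-01-02", "2021-01-09", "2021-01-09")),
  issue_date = as.Date(c("2021-01-03", "2021-01-10", "2021-01-10", "2021-01-17")),
  inc        = c(100, 110, 90, 95),
  stringsAsFactors = FALSE
)

# issues-style query: keep only rows first reported or updated
# exactly on the requested issue date.
select_by_issue <- function(reports, issue) {
  out <- reports[reports$issue_date == as.Date(issue), , drop = FALSE]
  if (nrow(out) == 0) warning("no data available for issue date ", issue)
  out
}

# as_of-style query: for each location and date, keep the row with the
# latest issue date that is on or before the as_of date.
select_as_of <- function(reports, as_of) {
  eligible <- reports[reports$issue_date <= as.Date(as_of), , drop = FALSE]
  if (nrow(eligible) == 0) {
    warning("no data available as of ", as_of)
    return(eligible)
  }
  eligible <- eligible[order(eligible$location, eligible$date, eligible$issue_date), ]
  key <- paste(eligible$location, eligible$date)
  eligible[!duplicated(key, fromLast = TRUE), , drop = FALSE]
}

by_issue <- select_by_issue(reports, "2021-01-10")
by_as_of <- select_as_of(reports, "2021-01-20")
```

Here by_issue holds only the two rows published on 2021-01-10, while by_as_of holds the latest version of each date known by 2021-01-20, including the revision issued on 2021-01-17.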

If the user provides values for both issues and as_of, a warning is generated and the argument for issues is ignored.

If multiple issue dates or as_of dates are provided, the result combines the data for all such dates. If no value is provided for either issues or as_of, results for the most recent available as_of date are returned.
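
The precedence rules above can be summarized in a short sketch. The helper resolve_query and its available_issues argument are hypothetical and not part of covidData; available_issues stands in for whatever issue dates exist for the requested measure.

```r
# Hypothetical helper sketching the documented precedence between
# issues and as_of; it is not the package's own code.
resolve_query <- function(issues = NULL, as_of = NULL, available_issues) {
  if (!is.null(issues) && !is.null(as_of)) {
    warning("both issues and as_of were provided; ignoring issues")
    issues <- NULL
  }
  if (!is.null(issues)) {
    return(list(mode = "issues", dates = as.Date(issues)))
  }
  if (is.null(as_of)) {
    # Neither argument given: fall back to the most recent available date.
    as_of <- max(as.Date(available_issues))
  }
  list(mode = "as_of", dates = as.Date(as_of))
}

available <- c("2021-01-03", "2021-01-10", "2021-01-17")

# No dates supplied: resolves to the latest available date.
default_query <- resolve_query(available_issues = available)

# Both supplied: as_of wins and a warning is raised (suppressed here).
both_query <- suppressWarnings(
  resolve_query(issues = "2021-01-03", as_of = "2021-01-10",
                available_issues = available)
)
```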

Value

data frame with columns location (FIPS code), date, inc, cum
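
A possible call, using only the arguments listed under Usage, might look like the following. The specific as_of date and the location code "25" (the FIPS code for Massachusetts) are illustrative choices, and the call is only attempted when the covidData package is installed.

```r
# Arguments for a weekly, state-level query of deaths in Massachusetts
# as the data stood on 2021-03-01 (illustrative values).
args <- list(
  as_of = "2021-03-01",
  location_code = "25",
  spatial_resolution = "state",
  temporal_resolution = "weekly",
  measure = "deaths",
  geography = "US"
)

# Only run the query when covidData is available.
if (requireNamespace("covidData", quietly = TRUE)) {
  ma_deaths <- do.call(covidData::load_data, args)
  # Expected columns per the Value section: location, date, inc, cum.
  print(head(ma_deaths))
}
```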


reichlab/covidData documentation built on April 5, 2024, 5 p.m.