assemble.data.lags: Assemble human case data and environmental lags
In khelmsmith/flm_NE_WNV: Functional Linear Modeling For Vector-borne Disease

assemble.data.lags

R Documentation

Assemble human case data and environmental lags

Description

Assembles human case data and lagged environmental data. assemble.data.lags() expects its inputs to follow the conventions described under Arguments and Details.

Usage

assemble.data.lags(
  pop,
  cases,
  weather,
  spi,
  spei,
  target.date,
  start.year,
  in.seed = NULL,
  lag.lengths = c(12, 18, 24, 30, 36)
)

Arguments

`pop`	County populations. A data.frame with 5 variables: County, fips (5 characters), year, pop100K, and density.
`cases`	Annual numbers of human cases in each county. A data.frame with 3 variables: County, year, and cases.
`weather`	Monthly temperature and precipitation data for each county. A data.frame with County, fips (5 characters), year, month (integer), tmean, and ppt.
`spi`	Monthly values of the Standardized Precipitation Index for each county. A data.frame with County, fips (5 characters), year, month (integer), and spi.
`spei`	Monthly values of the Standardized Precipitation and Evapotranspiration Index for each county. A data.frame with County, fips (5 characters), year, month (integer), and spei.
`target.date`	The last date to include when calculating lags. A character string in ISO 8601 format (yyyy-mm-dd).
`start.year`	The first year to include in the training data. Should be coercible to integer.
`in.seed`	If not NULL, the starting number for the random number generator, which makes the results repeatable. If NULL, cases is treated as actual data.
`lag.lengths`	The number of months to go backwards when creating lag matrices. Numeric vector.
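
As an illustration of the column conventions above, the following sketch builds toy inputs for two counties in base R. All values are invented placeholders, and check_columns() is a hypothetical helper written for this example; it is not part of the package.

```r
# Toy inputs that follow the documented column conventions.
# All numeric values are invented placeholders for illustration only.
counties <- data.frame(
  County = c("Adams", "Boone"),
  fips   = c("31001", "31011"),  # 5-character strings, not numbers
  stringsAsFactors = FALSE
)

pop <- data.frame(
  County  = counties$County,
  fips    = counties$fips,
  year    = 2018L,
  pop100K = c(0.31, 0.05),
  density = c(55.6, 7.7),
  stringsAsFactors = FALSE
)

cases <- data.frame(
  County = counties$County,
  year   = 2018L,
  cases  = c(2L, 0L),
  stringsAsFactors = FALSE
)

# One row per county per month: merge() with no shared columns
# returns the Cartesian product of its two arguments.
monthly <- merge(counties, data.frame(year = 2018L, month = 1:12))

weather <- monthly
weather$tmean <- 10  # placeholder mean temperature
weather$ppt   <- 50  # placeholder precipitation

spi <- monthly
spi$spi <- 0         # placeholder index value

spei <- monthly
spei$spei <- 0       # placeholder index value

# Hypothetical helper (not part of the package): stop if any
# expected column is missing from a data frame.
check_columns <- function(df, expected) {
  missing <- setdiff(expected, names(df))
  if (length(missing) > 0) {
    stop("missing columns: ", paste(missing, collapse = ", "))
  }
  invisible(TRUE)
}

check_columns(pop, c("County", "fips", "year", "pop100K", "density"))
check_columns(cases, c("County", "year", "cases"))
check_columns(weather, c("County", "fips", "year", "month", "tmean", "ppt"))
check_columns(spi, c("County", "fips", "year", "month", "spi"))
check_columns(spei, c("County", "fips", "year", "month", "spei"))

# County labels should be written identically in every input.
same_counties <- all(vapply(
  list(cases, weather, spi, spei),
  function(d) setequal(d$County, pop$County),
  logical(1)
))
```

Keeping fips as a character column preserves any leading zeros, which a numeric column would silently drop.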

Details

Case data are trimmed to the years from start.year through target.year (taken here to be the year of target.date). Predictor variables are trimmed to the period from January of start.year through the date specified in target.date.

The County variable can be any character string unique to each geographic locale, as long as the format is identical across all input data frames.

Simulated data (in.seed not NULL) are predictions from a model trained on the actual numbers of cases recorded in CDC's ArboNET database. The simulated data exclude Arthur County: no cases have been recorded there to date, and the county had to be excluded for the modeling to work.

Weather data included in the package for Nebraska come from the National Centers for Environmental Information, National Climatic Data Center (ftp://ftp.ncdc.noaa.gov/pub/data/cirs/climdiv/). SPEI and SPI values come from West Wide Drought Tracker netCDF files. See Abatzoglou, J. T., McEvoy, D. J., & Redmond, K. T. (2017). The West Wide Drought Tracker: Drought monitoring at fine spatial scales. Bulletin of the American Meteorological Society, 98(9), 1815–1820. https://doi.org/10.1175/BAMS-D-16-0193.1.
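
The trimming rules and the meaning of a lag length can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: target.year is assumed to be the calendar year of target.date, the toy data frames are invented, and lag_months() is a hypothetical helper. Whether the target month itself counts toward a lag is also an assumption made here.

```r
target.date <- "2018-05-31"
start.year  <- 2015

target      <- as.Date(target.date)
start.year  <- as.integer(start.year)
# Assumption: target.year is the calendar year of target.date.
target.year <- as.integer(format(target, "%Y"))

# Toy annual case data for one county, 2012 through 2020.
cases <- data.frame(County = "Adams", year = 2012:2020, cases = 0L,
                    stringsAsFactors = FALSE)
cases_trimmed <- cases[cases$year >= start.year &
                         cases$year <= target.year, ]

# Toy monthly predictor data for the same county and years.
weather <- expand.grid(month = 1:12, year = 2012:2020)
weather$County <- "Adams"
weather$tmean  <- 10  # placeholder value

# Represent each month by its first day so it can be compared to dates.
weather$date <- as.Date(sprintf("%d-%02d-01", weather$year, weather$month))
first_day <- as.Date(sprintf("%d-01-01", start.year))
weather_trimmed <- weather[weather$date >= first_day &
                             weather$date <= target, ]

# Hypothetical helper: the n months ending with the target month,
# each returned as the first day of that month.
lag_months <- function(target, n) {
  month_start <- as.Date(format(target, "%Y-%m-01"))
  seq(month_start, by = "-1 month", length.out = n)
}

window12 <- lag_months(target, 12)
```

With these toy inputs, cases_trimmed keeps the years 2015 through 2018, weather_trimmed keeps January 2015 through May 2018, and window12 spans June 2017 through May 2018.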