create_train_and_predict_matrices: Create training and test data matrices and a training...


View source: R/matrix.R

Description

Create training and test data matrices and a training response for a given set of aheads. Works for both a single ahead value and a vector of ahead values. For multiple ahead values, the function can return either separate data matrices and responses for each ahead, or a single data matrix and response matrix covering all aheads at once.

Usage

create_train_and_predict_matrices(
  lagged_df,
  ahead,
  training_window_size,
  aheads_separate = TRUE
)

Arguments

lagged_df

Data frame of lagged data. It should have the following columns:

  • geo_value: Strings of geographic locations.

  • time_value: Dates of training data.

  • Covariate columns: Columns with names of the form value-{days}:{signal} or value+0:{signal} whose values are the values of {signal} {days} days before time_value (value+0 being the value on time_value itself).

  • Response columns: Columns with names of the form response+{n}:{response} whose values correspond to {response} {n} incidence period units after time_value.

A data frame in this format can be made using covidcast::aggregate_signals() and modeltools::get_response_columns().
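To illustrate the column-naming convention above, the following base R sketch builds a tiny data frame by hand and sorts its columns into covariates and responses with regular expressions. The signal name signal_1, the values, and the regular expressions are assumptions made for this example; they are not taken from the modeltools source.

```r
# Hand-built data frame following the naming convention described above.
# check.names = FALSE keeps names such as `value-1:signal_1` intact.
lagged_df <- data.frame(
  geo_value = c("az", "wv"),
  time_value = as.Date(c("2021-01-25", "2021-01-25")),
  `value-1:signal_1` = c(-1, 0),
  `value+0:signal_1` = c(1, 2),
  `response+2:signal_1` = c(5, 6),
  check.names = FALSE
)

# Illustrative patterns (assumed, not taken from the package source).
covariate_cols <- grep("^value(-[0-9]+|\\+0):", names(lagged_df), value = TRUE)
response_cols <- grep("^response\\+[0-9]+:", names(lagged_df), value = TRUE)

print(covariate_cols)
print(response_cols)
```

In practice the data frame would come from covidcast::aggregate_signals() and modeltools::get_response_columns() as noted above; the hand-built version only shows the expected shape.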

ahead

Number of incidence period units (e.g., epiweeks or days) ahead to forecast. Can be a single positive integer or a vector of positive integers. Note that for each {a} in ahead, the column response+{a}:{response} should be present in lagged_df.
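The requirement that every ahead has a matching response column can be checked before calling the function. The sketch below is one way to do that; the helper missing_response_columns() and the response name signal_1 are hypothetical and exist only for this example.

```r
# Hypothetical helper: returns the response column names that `ahead`
# requires but that are absent from `col_names`.
missing_response_columns <- function(col_names, ahead, response) {
  required <- paste0("response+", ahead, ":", response)
  setdiff(required, col_names)
}

col_names <- c("geo_value", "time_value", "value+0:signal_1", "response+2:signal_1")

# ahead = 2 is covered, so nothing is missing.
missing_response_columns(col_names, ahead = 2, response = "signal_1")

# ahead = c(1, 2) also needs a response+1 column, which is absent here.
missing_response_columns(col_names, ahead = c(1, 2), response = "signal_1")
```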

training_window_size

Size of the local training window in days to use. For example, if training_window_size = 14, then to make a 1-day-ahead forecast on December 15, we train on data from December 1 to December 14.
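The date arithmetic in this example can be reproduced with a few lines of base R. This sketch covers only the 1-day-ahead case described above; the year is arbitrary, and how the package computes the window internally is not shown here.

```r
forecast_date <- as.Date("2021-12-15")  # year chosen arbitrarily
training_window_size <- 14

# A window of 14 days ending the day before the forecast date.
window_start <- forecast_date - training_window_size
window_end <- forecast_date - 1
training_dates <- seq(window_start, window_end, by = "day")

print(range(training_dates))
print(length(training_dates))
```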

aheads_separate

If length(ahead) > 1, should there be separate data matrices and responses for each ahead? Default is TRUE.

Value

For a single ahead value, named list with entries:

For multiple ahead values and aheads_separate = TRUE, a list having the same length as ahead, with each element being a named list as above. For multiple ahead values and aheads_separate = FALSE, a named list as above, except that train_y is a matrix of responses rather than a vector.

Examples

## Not run: 
library(tibble)

# Toy lagged data: two locations observed on five consecutive days.
create_train_and_predict_matrices(
  tibble(
    geo_value = rep(c("az", "wv"), 5),
    time_value = rep(
      as.Date(c("2021-01-25", "2021-01-26", "2021-01-27", "2021-01-28", "2021-01-29")),
      each = 2),
    `value-2:signal_1` = seq(-3, 6),
    `value-1:signal_1` = seq(-1, 8),
    `value+0:signal_1` = seq(1, 10),
    # The response 2 units ahead is unavailable (NA) for the last two dates.
    `response+2:signal_1` = c(seq(5, 10), rep(NA, 4))
  ),
  ahead = 2,
  training_window_size = 1)

## End(Not run)

dshemetov/modeltools-mirror documentation built on Jan. 7, 2022, 12:23 a.m.