process_MF: Process mixed frequency


Description

Process mixed frequency data for nowcasting applications by identifying the missing observations in the contemporaneous data and replicating this pattern of missing observations in the historical data prior to aggregation. This incorporates all available information into the model while still using uniform frequency models to generate predictions, so the approach can be applied to a wide array of econometric and machine learning applications.
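
The idea of replicating the current pattern of missing observations can be pictured with a small base R sketch. This is an illustration of the concept only, not the package's internal implementation: the monthly values are made up, and aggregation by the mean is an assumption.

```r
# Made-up monthly indicator: Q1-Q3 of 2021 are complete, but only the
# first month of Q4 has been published so far.
months  <- seq(as.Date("2021-01-01"), as.Date("2021-10-01"), by = "month")
x       <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

m       <- as.integer(format(months, "%m"))
quarter <- (m - 1) %/% 3 + 1   # quarter each month belongs to
pos     <- (m - 1) %% 3 + 1    # position of the month within its quarter

# Pattern of available observations in the current (latest) quarter
current  <- max(quarter)
observed <- pos[quarter == current]

# Replicate that pattern in every historical quarter, then aggregate
# (mean aggregation is an assumption made for this sketch)
keep <- pos %in% observed
agg  <- tapply(x[keep], quarter[keep], mean)
agg
```

Each historical quarter is summarised using only its first month, so the historical aggregates are built from the same information set as the current, incomplete quarter.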

Usage

process_MF(
  LHS,
  RHS,
  LHS_lags = 1,
  RHS_lags = 1,
  as_of = NULL,
  frq = c("auto", "week", "month", "quarter", "year"),
  date_name = "ref_date",
  id_name = "series_name",
  value_name = "value",
  pub_date_name = "pub_date",
  return_dt = TRUE
)

Arguments

LHS

Left hand side data in long format. May include multiple LHS variables, but all LHS variables MUST have the same frequency.

RHS

Right hand side data in long format at any frequency.

LHS_lags

Number of lags of LHS variables to include in output.

RHS_lags

Number of lags of RHS variables to include in output (may be 0, indicating contemporaneous values only).

as_of

Backtest the model "as of" this date; requires that 'pub_date' is specified in the data.

frq

Frequency of LHS data, one of 'auto', 'week', 'month', 'quarter', 'year'. With the default 'auto', the function will attempt to identify the frequency automatically.

date_name

Name of date column in data.

id_name

Name of ID column in the data.

value_name

Name of value column in the data.

pub_date_name

Name of publication date column in the data.

return_dt

Logical; should the function return a 'data.table'? If FALSE, the function will return matrix data.
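
The role of 'as_of' and the publication date column can be sketched in base R. This is a conceptual illustration under the assumption that backtesting "as of" a date means using only observations published on or before it; the data below are made up, and the function's internal handling may differ.

```r
# Made-up long-format data using the default column names
dat <- data.frame(
  ref_date    = as.Date(c("2021-01-31", "2021-02-28", "2021-03-31")),
  series_name = "indicator",
  value       = c(1.0, 1.5, 2.0),
  pub_date    = as.Date(c("2021-02-15", "2021-03-15", "2021-04-15"))
)

# Hypothetical backtest date
as_of <- as.Date("2021-03-20")

# Keep only what had been published by the backtest date
vintage <- dat[dat$pub_date <= as_of, ]
vintage
```

The March observation is dropped because it was not published until April, which is why publication dates are needed for this kind of backtest.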

Details

Right hand side data will always include observations contemporaneous with LHS data. Use 'RHS_lags' to add lags of RHS data to the output, and 'LHS_lags' to add lags of LHS data to the output. By default the function will return data in long format designed to be used with the 'dateutils' function 'process()'. Specifying 'return_dt = FALSE' will return LHS variables in the matrix 'Y', RHS variables in the matrix 'X', and corresponding dates (by index) in the date vector 'dates'.

Value

data.table in long format (unless 'return_dt = FALSE'). Variables ending in '0' are contemporaneous, those ending in '1' are at one lag, '2' at two lags, etc.

Examples

LHS <- fred[series_name == "gdp constant prices"]
RHS <- fred[series_name != "gdp constant prices"]
dt <- process_MF(LHS, RHS)
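
A variation on the example above, requesting additional lags and matrix output. This is a sketch that assumes the 'dateutils' package and its 'fred' sample data are installed, and that with 'return_dt = FALSE' the result is a list whose elements are named 'Y', 'X' and 'dates' as described in Details; the exact return structure should be checked against the installed version.

```r
library(data.table)  # attached for the fred[...] subsetting syntax
library(dateutils)

LHS <- fred[series_name == "gdp constant prices"]
RHS <- fred[series_name != "gdp constant prices"]

# Two lags of each side, returned as matrices rather than a long data.table
out <- process_MF(LHS, RHS, LHS_lags = 2, RHS_lags = 2, return_dt = FALSE)

# Assumption: `out` is a list holding the objects named in Details
dim(out$Y)       # LHS variables
dim(out$X)       # RHS variables (contemporaneous values and lags)
head(out$dates)  # dates corresponding, by index, to the rows of Y and X
```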

dateutils documentation built on Nov. 10, 2021, 5:09 p.m.