forecast_multivariate: Forecast with multivariate models
In tylerJPike/OOS: Out-of-Sample Time Series Forecasting


A function to estimate multivariate forecasts out-of-sample. Available methods include vector autoregression, linear regression, lasso regression, ridge regression, elastic net, random forest, tree-based gradient boosting machine, and single-layer neural network. See the package website for the most up-to-date list of available models.

forecast_multivariate(
  Data,
  forecast.dates,
  target,
  horizon,
  method,
  rolling.window = NA,
  freq,
  lag.variables = NULL,
  lag.n = NULL,
  outlier.clean = FALSE,
  outlier.variables = NULL,
  outlier.bounds = c(0.05, 0.95),
  outlier.trim = FALSE,
  outlier.cross_section = FALSE,
  impute.missing = FALSE,
  impute.method = "kalman",
  impute.variables = NULL,
  impute.verbose = FALSE,
  reduce.data = FALSE,
  reduce.variables = NULL,
  reduce.ncomp = NULL,
  reduce.standardize = TRUE,
  parallel.dates = NULL,
  return.models = FALSE,
  return.data = FALSE
)

`Data`	data.frame: data frame of target variable, exogenous variables, and observed date (named 'date'); may alternatively be a `ts`, `xts`, or `zoo` object to forecast
`forecast.dates`	date: dates on which forecasts are created
`target`	string: column name in Data of variable to forecast
`horizon`	int: number of periods into the future to forecast
`method`	string: vector of methods to use, e.g. c('ols','var')
`rolling.window`	int: size of rolling window, NA if expanding window is used
`freq`	string: time series frequency; day, week, month, quarter, year
`lag.variables`	string: vector of variables to lag at each time step; if lag.n is not NULL, the default is all non-date variables
`lag.n`	int: number of lags to create
`outlier.clean`	boolean: if TRUE then clean outliers
`outlier.variables`	string: vector of variables to purge of outliers, default is all but the 'date' column
`outlier.bounds`	double: vector of winsorizing minimum and maximum bounds, c(min percentile, max percentile)
`outlier.trim`	boolean: if TRUE then replace outliers with NA instead of the winsorizing bound
`outlier.cross_section`	boolean: if TRUE then remove outliers based on cross-section (row-wise) instead of historical data (column-wise)
`impute.missing`	boolean: if TRUE then impute missing values
`impute.method`	string: select which method to use from the imputeTS package; 'interpolation', 'kalman', 'locf', 'ma', 'mean', 'random', 'remove', 'replace', 'seadec', 'seasplit'
`impute.variables`	string: vector of variables to impute missing values, default is all numeric columns
`impute.verbose`	boolean: show start-up status of impute.missing.routine
`reduce.data`	boolean: if TRUE then reduce dimension
`reduce.variables`	string: vector of variables to include in the dimension reduction, default is all numeric columns
`reduce.ncomp`	int: number of factors to create
`reduce.standardize`	boolean: normalize variables (mean zero, variance one) before estimating factors
`parallel.dates`	int: the number of cores available for parallel estimation
`return.models`	boolean: if TRUE then return the list of models estimated at each forecast.date
`return.data`	boolean: if TRUE then return the list of information.set for each forecast.date
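
The interaction of `outlier.bounds` and `outlier.trim` can be illustrated with a small base R sketch. This is not the package's implementation; `winsorize_sketch` is a hypothetical helper written only for this illustration, and it assumes the bounds are percentiles computed column-wise from the observed history.

```r
# Hypothetical helper, not part of OOS: clamp or blank values outside percentile bounds
winsorize_sketch <- function(x, bounds = c(0.05, 0.95), trim = FALSE) {
  # percentile cutoffs computed from the non-missing observations
  cutoffs <- quantile(x, probs = bounds, na.rm = TRUE, names = FALSE)
  low  <- !is.na(x) & x < cutoffs[1]
  high <- !is.na(x) & x > cutoffs[2]
  if (trim) {
    x[low | high] <- NA      # analogous to outlier.trim = TRUE
  } else {
    x[low]  <- cutoffs[1]    # pull low outliers up to the lower bound
    x[high] <- cutoffs[2]    # pull high outliers down to the upper bound
  }
  x
}

set.seed(1)
x <- c(rnorm(98), 25, -25)   # two artificial outliers
x.wins <- winsorize_sketch(x)
x.trim <- winsorize_sketch(x, trim = TRUE)

range(x.wins)                # extremes now sit at the percentile cutoffs
sum(is.na(x.trim))           # number of observations replaced by NA
```

Winsorizing keeps the sample length intact, while trimming creates missing values that would then need to be imputed or dropped before estimation.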

data.frame with a row for each forecast by model and forecasted date

 # load the package
 library(OOS)

 # simple time series
 A = c(1:100) + rnorm(100)
 B = c(1:100) + rnorm(100)
 C = c(1:100) + rnorm(100)
 date = seq.Date(from = as.Date('2000-01-01'), by = 'month', length.out = 100)
 Data = data.frame(date = date, A, B, C)

 # run forecast_multivariate
 forecast.multi =
     forecast_multivariate(
       Data = Data,
       target = 'A',
       forecast.dates = tail(Data$date,5),
       method = c('ols','var'),
       horizon = 1,
       # information set
       rolling.window = NA,
       freq = 'month',
       # data prep
       lag.n = 4,
       outlier.clean = TRUE,
       impute.missing = TRUE)
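
The example sets `rolling.window = NA`, which requests an expanding window. The difference from a rolling window can be sketched in base R as follows. This is only an illustration of the concept, not the package's internals; `information_set_sketch` is a hypothetical helper, and it assumes that observations dated on or before the forecast date are available for estimation.

```r
# Hypothetical helper, not part of OOS: rows usable when forecasting at forecast.date
information_set_sketch <- function(Data, forecast.date, rolling.window = NA) {
  # assumption: only observations dated on or before the forecast date are used
  info <- Data[Data$date <= forecast.date, , drop = FALSE]
  # a rolling window keeps the most recent rows; NA keeps everything (expanding)
  if (!is.na(rolling.window)) {
    info <- tail(info, rolling.window)
  }
  info
}

date <- seq.Date(from = as.Date('2000-01-01'), by = 'month', length.out = 100)
Data <- data.frame(date = date, A = c(1:100) + rnorm(100))
forecast.dates <- tail(Data$date, 5)

# expanding window: the sample grows by one row per forecast date
n.expanding <- sapply(seq_along(forecast.dates), function(i)
  nrow(information_set_sketch(Data, forecast.dates[i])))

# rolling window of 60 periods: the sample size stays fixed
n.rolling <- sapply(seq_along(forecast.dates), function(i)
  nrow(information_set_sketch(Data, forecast.dates[i], rolling.window = 60)))
```

An expanding window uses all available history and so gains precision over time, while a rolling window discards older observations, which can help when the relationship between variables changes.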