knitr::opts_chunk$set(echo = TRUE) library(NNS) library(data.table) data.table::setDTthreads(2L) options(mc.cores = 1) Sys.setenv("OMP_THREAD_LIMIT" = 2)
library(NNS) library(data.table) require(knitr) require(rgl) require(meboot)
The underlying assumptions of traditional autoregressive models are well known. The resulting complexity with these models leads to observations such as,
``We have found that choosing the wrong model or parameters can often yield poor results, and it is unlikely that even experienced analysts can choose the correct model and parameters efficiently given this array of choices.''
NNS
simplifies the forecasting process. Below are some examples demonstrating NNS.ARMA
and its assumption free, minimal parameter forecasting method.
NNS.ARMA
has the ability to fit a linear regression to the relevant component series, yielding very fast results. For our running example we will use the AirPassengers
dataset loaded in base R.
We will forecast 44 periods h = 44
of AirPassengers
using the first 100 observations training.set = 100
, returning estimates of the final 44 observations. We will then test this against our validation set of tail(AirPassengers,44)
.
Since this is monthly data, we will try a seasonal.factor = 12
.
Below is the linear fit and associated root mean squared error (RMSE) using method = "lin"
.
nns_lin = NNS.ARMA(AirPassengers, h = 44, training.set = 100, method = "lin", plot = TRUE, seasonal.factor = 12, seasonal.plot = FALSE) sqrt(mean((nns_lin - tail(AirPassengers, 44)) ^ 2))
Now we can try using a nonlinear regression on the relevant component series using method = "nonlin"
.
nns_nonlin = NNS.ARMA(AirPassengers, h = 44, training.set = 100, method = "nonlin", plot = FALSE, seasonal.factor = 12, seasonal.plot = FALSE) sqrt(mean((nns_nonlin - tail(AirPassengers, 44)) ^ 2))
[1] 18.77889
We can test a series of seasonal.factors
and select the best one to fit. The largest period to consider would be 0.5 * length(variable)
, since we need more than 2 points for a regression! Remember, we are testing the first 100 observations of AirPassengers
, not the full 144 observations.
seas = t(sapply(1 : 25, function(i) c(i, sqrt( mean( (NNS.ARMA(AirPassengers, h = 44, training.set = 100, method = "lin", seasonal.factor = i, plot=FALSE) - tail(AirPassengers, 44)) ^ 2) ) ) ) ) colnames(seas) = c("Period", "RMSE") seas
Now we know seasonal.factor = 12
is our best fit, we can see if there's any benefit from using a nonlinear regression. Alternatively, we can define our best fit as the corresponding seas$Period
entry of the minimum value in our seas$RMSE
column.
a = seas[which.min(seas[ , 2]), 1]
Below you will notice the use of seasonal.factor = a
generates the same output.
nns = NNS.ARMA(AirPassengers, h = 44, training.set = 100, method = "nonlin", seasonal.factor = a, plot = TRUE, seasonal.plot = FALSE) sqrt(mean((nns - tail(AirPassengers, 44)) ^ 2))
Note: You may experience instances with monthly data that report seasonal.factor
close to multiples of 3, 4, 6 or 12. For instance, if the reported seasonal.factor = {37, 47, 71, 73}
use (seasonal.factor = c(36, 48, 72))
by setting the modulo
parameter in NNS.seas(..., modulo = 12)
. The same suggestion holds for daily data and multiples of 7, or any other time series with logically inferred cyclical patterns. The nearest periods to that modulo
will be in the expanded output.
NNS.seas(AirPassengers, modulo = 12, plot = FALSE)
seasonal.factor
NNS also offers a wrapper function NNS.ARMA.optim()
to test a given vector of seasonal.factor
and returns the optimized objective function (in this case RMSE written as obj.fn = expression( sqrt(mean((predicted - actual)^2)) )
) and the corresponding periods, as well as the NNS.ARMA
regression method used. Alternatively, using external package objective functions work as well such as obj.fn = expression(Metrics::rmse(actual, predicted))
.
NNS.ARMA.optim()
will also test whether to regress the underlying data first, shrink
the estimates to their subset mean values, include a bias.shift
based on its internal validation errors, and compare different weights
of both linear and nonlinear estimates.
Given our monthly dataset, we will try multiple years by setting seasonal.factor = seq(12, 60, 6)
every 6 months based on our NNS.seas() insights above.
nns.optimal = NNS.ARMA.optim(AirPassengers, training.set = 100, seasonal.factor = seq(12, 60, 6), obj.fn = expression( sqrt(mean((predicted - actual)^2)) ), objective = "min", pred.int = .95, plot = TRUE) nns.optimal
[1] "CURRNET METHOD: lin" [1] "COPY LATEST PARAMETERS DIRECTLY FOR NNS.ARMA() IF ERROR:" [1] "NNS.ARMA(... method = 'lin' , seasonal.factor = c( 12 ) ...)" [1] "CURRENT lin OBJECTIVE FUNCTION = 35.3996540135277" [1] "BEST method = 'lin', seasonal.factor = c( 12 )" [1] "BEST lin OBJECTIVE FUNCTION = 35.3996540135277" [1] "CURRNET METHOD: nonlin" [1] "COPY LATEST PARAMETERS DIRECTLY FOR NNS.ARMA() IF ERROR:" [1] "NNS.ARMA(... method = 'nonlin' , seasonal.factor = c( 12 ) ...)" [1] "CURRENT nonlin OBJECTIVE FUNCTION = 18.7788946994645" [1] "BEST method = 'nonlin' PATH MEMBER = c( 12 )" [1] "BEST nonlin OBJECTIVE FUNCTION = 18.7788946994645" [1] "CURRNET METHOD: both" [1] "COPY LATEST PARAMETERS DIRECTLY FOR NNS.ARMA() IF ERROR:" [1] "NNS.ARMA(... method = 'both' , seasonal.factor = c( 12 ) ...)" [1] "CURRENT both OBJECTIVE FUNCTION = 26.9404229568573" [1] "BEST method = 'both' PATH MEMBER = c( 12 )" [1] "BEST both OBJECTIVE FUNCTION = 26.9404229568573" $periods [1] 12 $weights NULL $obj.fn [1] 18.77889 $method [1] "nonlin" $shrink [1] FALSE $nns.regress [1] FALSE $bias.shift [1] 0 $errors [1] -14.47605733 -21.86813452 -19.68680481 -32.74950538 -23.55099666 -17.40139233 -13.70469091 -6.24837561 -2.48949314 2.13237206 16.92258707 23.90974592 4.53920624 [14] -2.80578226 -8.66505386 -35.95512814 6.15047733 -3.05392579 4.47214630 18.85551669 2.32637841 0.08983381 -1.04028830 2.89582233 -24.22859693 -6.13584252 [27] -27.38320978 -53.70280267 -21.86253259 -23.90679863 -23.77369245 -22.36041401 -29.22056474 -26.31507238 12.93035387 -34.59586321 -47.79031322 -34.85559482 -62.85716413 [40] -64.07451099 -35.67072562 -50.91910417 -27.91240080 -22.72623962 $results [1] 338.5400 405.6487 448.7971 437.6360 381.8544 325.4751 288.1899 326.6740 337.1004 319.5942 381.6468 371.7323 373.1716 452.1286 498.6929 485.0008 422.3997 357.9051 [19] 318.5071 360.0295 371.1142 350.8923 419.5388 408.2967 408.8936 499.6842 549.5283 533.5434 464.0692 391.2918 349.2869 394.1562 404.3549 381.6612 456.5031 443.9546 [37] 443.9982 546.2672 599.0085 581.0612 504.6510 423.7409 379.3403 427.2016 $lower.pred.int [1] 287.2344 354.3431 397.4915 386.3304 330.5488 274.1695 236.8843 275.3684 285.7949 268.2886 330.3413 320.4267 321.8661 400.8231 447.3873 433.6952 371.0941 306.5995 [19] 267.2016 308.7239 319.8086 299.5867 368.2332 356.9912 357.5880 448.3786 498.2227 482.2379 412.7636 339.9862 297.9813 342.8507 353.0493 330.3556 405.1975 392.6490 [37] 392.6926 494.9616 547.7030 529.7556 453.3454 372.4353 328.0347 375.8961 $upper.pred.int [1] 368.1156 435.2243 478.3727 467.2115 411.4300 355.0506 317.7655 356.2495 366.6760 349.1697 411.2224 401.3078 402.7472 481.7042 528.2685 514.5763 451.9753 387.4807 [19] 348.0827 389.6050 400.6897 380.4679 449.1144 437.8723 438.4692 529.2597 579.1039 563.1190 493.6448 420.8674 378.8625 423.7318 433.9305 411.2367 486.0786 473.5301 [37] 473.5738 575.8427 628.5841 610.6368 534.2265 453.3165 408.9159 456.7772
{width="600" height="400"}
nns.optimal = list() nns.optimal$periods = 12 nns.optimal$weights = NULL nns.optimal$method = "nonlin" nns.optimal$shrink = FALSE nns.optimal$results = c(354.2580, 421.2452, 462.4395, 453.0669, 395.8280, 338.4172, 301.1178, 338.6083, 347.7440, 330.7530, 393.0655, 383.2619, 390.9250, 468.8563, 511.8161, 501.4936, 436.7415, 370.9154, 331.3098, 371.0849, 380.7716, 361.0259, 430.2580, 418.6685, 427.7316, 516.8815, 561.5732, 550.3086, 478.0325, 403.7194, 361.7944, 403.9807, 413.6136, 390.9586, 467.3674, 453.9804, 464.4469, 564.6356, 611.0813, 598.8694, 519.0765, 436.3233, 392.0875, 436.6022) nns.optimal$bias.shift = 0
We can forecast another 50 periods out-of-sample (h = 50
), by dropping the training.set
parameter while generating the 95% prediction intervals.
NNS.ARMA.optim(AirPassengers, seasonal.factor = seq(12, 60, 6), obj.fn = expression( sqrt(mean((predicted - actual)^2)) ), objective = "min", pred.int = .95, h = 50, plot = TRUE)
{width="600" height="400"}
seasonal.factor = c(1, 2, ...)
We included the ability to use any number of specified seasonal periods simultaneously, weighted by their strength of seasonality. Computationally expensive when used with nonlinear regressions and large numbers of relevant periods.
weights
Instead of weighting by the seasonal.factor
strength of seasonality, we offer the ability to weight each per any defined compatible vector summing to 1.\
Equal weighting would be weights = "equal"
.
pred.int
Provides the values for the specified prediction intervals within [0,1] for each forecasted point and plots the bootstrapped replicates for the forecasted points.
seasonal.factor = FALSE
We also included the ability to use all detected seasonal periods simultaneously, weighted by their strength of seasonality. Computationally expensive when used with nonlinear regressions and large numbers of relevant periods.
best.periods
This parameter restricts the number of detected seasonal periods to use, again, weighted by their strength. To be used in conjunction with seasonal.factor = FALSE
.
modulo
To be used in conjunction with seasonal.factor = FALSE
. This parameter will ensure logical seasonal patterns (i.e., modulo = 7
for daily data) are included along with the results.
mod.only
To be used in conjunction with seasonal.factor = FALSE & modulo != NULL
. This parameter will ensure empirical patterns are kept along with the logical seasonal patterns.
dynamic = TRUE
This setting generates a new seasonal period(s) using the estimated values as continuations of the variable, either with or without a training.set
. Also computationally expensive due to the recalculation of seasonal periods for each estimated value.
plot
, seasonal.plot
These are the plotting arguments, easily enabled or disabled with TRUE
or FALSE
. seasonal.plot = TRUE
will not plot without plot = TRUE
. If a seasonal analysis is all that is desired, NNS.seas
is the function specifically suited for that task.
The extension to a generalized multivariate instance is provided in the following documentation of the NNS.VAR()
function:
If the user is so motivated, detailed arguments and proofs are provided within the following:
Sys.setenv("OMP_THREAD_LIMIT" = "")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.