View source: R/garma_main.R

garma    R Documentation

garma: A package for estimating and forecasting Gegenbauer time series models.

Description

The GARMA package provides the main function "garma" as well as print, summary, predict, forecast and plot/ggplot options.

The garma function is the main function for the garma package. Depending on the parameters, it calculates the parameter estimates for the GARMA process and, where available, the standard errors (se's) for those parameters.

Usage

garma(
  x,
  order = c(0L, 0L, 0L),
  periods = NULL,
  k = 1,
  include.mean = (order[2] == 0L),
  include.drift = FALSE,
  xreg = NULL,
  method = "Whittle",
  d_lim = c(0, 0.5),
  opt_method = c("cobyla", "solnp"),
  control = NULL
)

Arguments

x

(num) This should be a numeric vector representing the process to estimate. A minimum length of 96 is required.

order

(numeric vector) This should be a vector (similar to the stats::arima order parameter) which gives the order of the process to fit. The format should be c(p, d, q), where p, d, and q are all non-negative integers. p represents the order of the autoregressive process to fit, q represents the order of the moving average process to fit, and d is the (integer) differencing to apply prior to any fitting. WARNING: Currently only d == 0 or d == 1 are allowed.
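
As a small illustration of what integer differencing with d = 1 does to a series, the sketch below applies the operator (1 - B) using base R's diff(). It only shows the operation itself and assumes, rather than confirms, that garma's internal differencing is equivalent.

```r
# Integer differencing with d = 1: (1 - B) X_t = X_t - X_{t-1}
x <- c(1, 4, 9, 16, 25)

# Base R diff() with differences = 1 applies (1 - B) once
dx <- diff(x, differences = 1)

# The differenced series is one observation shorter than the original
length(x) - length(dx)
```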

periods

(num) This parameter can be used to specify a fixed period or set of periods for the Gegenbauer periodicity. For instance if you have monthly data, then it might be sensible (after an examination of the periodogram) to set periods = 12. The default value is NULL. Either 'periods' or 'k' parameters must be specified but not both - 'periods' implies fixed period(s) are to be used and 'k' implies that the periods should be estimated.
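
One way to carry out the periodogram examination mentioned above is sketched here with base R's spec.pgram(). The simulated series, its period of 12, and all variable names are assumptions made for illustration; none of this is part of the garma package.

```r
# Simulated series (illustrative only): a period-12 cycle plus Gaussian noise
set.seed(42)
n <- 240
tt <- seq_len(n)
x <- 2 * sin(2 * pi * tt / 12) + rnorm(n)

# Raw periodogram; for a plain numeric vector, frequencies are in cycles per observation
pg <- spec.pgram(x, taper = 0, detrend = FALSE, plot = FALSE)

# The frequency with the largest ordinate suggests a candidate period
peak_freq <- pg$freq[which.max(pg$spec)]
peak_period <- 1 / peak_freq
peak_period
```

A clear peak of this kind is what would motivate passing a fixed value such as periods = 12 rather than asking the routine to estimate the frequency.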

k

(int) This parameter indicates that the algorithm should estimate the 'k' frequencies as a part of the model. An alternative is the 'periods' parameter which can be used to specify exactly which periods should be used by the model.

This parameter can also be interpreted as specifying the number of (multiplicative) Gegenbauer terms to fit in the model.

include.mean

(bool) A boolean value indicating whether a mean should be fit. Note that no mean term is fit if the series is integer differenced.

include.drift

(bool) A boolean value indicating whether a 'drift' term should be fit to the predictions. Set this to TRUE to fit a drift term when the process is integer-differenced; the default shown under Usage is FALSE.

xreg

(numeric matrix) A numerical vector or matrix of external regressors, which must have the same number of rows as x. It should not have any NA values. It should not be a data frame. The default value is NULL.

Note that the algorithm used here is that if any 'xreg' is supplied, then a linear regression model is fit first, and the GARMA model is then based on the residuals from that regression model.
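
The two-step approach described above can be sketched in base R as follows. This is an illustration of the idea under stated assumptions, not the package's internal code: the simulated data and the names x, xreg, reg and res are hypothetical.

```r
# Illustrative data: a series with a linear dependence on one external regressor
set.seed(1)
n <- 120
xreg <- cbind(trend = seq_len(n))   # numeric matrix with the same number of rows as x
x <- 0.05 * xreg[, "trend"] + rnorm(n)

# Step 1: ordinary linear regression of the series on the regressors
reg <- lm(x ~ xreg)

# Step 2: the residuals are the series on which a GARMA model would then be based
res <- residuals(reg)
length(res)
```

The residual series has the same length as x, so the minimum-length requirement applies to it in the same way.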

method

(character) This defines the estimation method for the routine. The valid values are 'CSS', 'Whittle', and 'WLL'. The default ('Whittle') method will generally return very accurate estimates quite quickly, provided the assumption of a Gaussian distribution is even approximately correct, and is probably the method of choice for most users. For the theory behind this, refer to Giraitis et al. (2001).

The 'CSS' method is a conditional 'sum-of-squares' technique and can be quite slow. Reference: Robinson (2006), Chung (1996). Note that the paper of Chung (1996) was partially criticised by Giraitis et al. (2001), but it still contains useful results.

'WLL' is a new technique, originally developed by the author of this package, which appears to work well even if the \epsilon_{t} are highly skewed and/or have heavy tails (skewed and/or leptokurtic). However, the asymptotic theory for the WLL method is not complete, and so standard errors are not available for most parameters. Refer to Hunt et al. (2021).

d_lim

(num) A numeric vector of length 2 giving the lower and upper limits for the d parameter. The default is 'c(0, 0.5)', which restricts the model to be stationary. However, it is sometimes desirable to understand what the unrestricted value might be.

opt_method

(character) This names the optimisation method used to find the parameter estimates. This may be a vector of methods, in which case the methods are applied in turn, each using the results of the previous one as the starting point for the next. The default, as shown under Usage, is c('cobyla', 'solnp'). For some data or some models, however, other methods may work well.

Supported algorithms include:

  • 'cobyla' algorithm in package nloptr

  • 'directL' algorithm in package nloptr

  • 'solnp' from Rsolnp package

  • 'gosolnp' from Rsolnp package.

Note that the algorithms are selected to be those which do not require derivatives, even numerically calculated derivatives. The function being optimised by 'garma()' has a point of discontinuity at the minimum value, which is the point we are trying to find. This means that standard derivative-based algorithms such as BFGS perform very poorly here.

Note further that if you specify a value of 'k' > 1, then inequality constraints are required, and this will further limit the list of supported routines.

control

(list) list of optimisation routine specific values.

Details

The GARMA model is specified as

\displaystyle{\phi(B)\prod_{i=1}^{k}(1-2u_{i}B+B^{2})^{d_{i}}(1-B)^{id} (X_{t}-\mu)= \theta(B) \epsilon _{t}}

where

  • \phi(B) represents the short-memory Autoregressive component of order p,

  • \theta(B) represents the short-memory Moving Average component of order q,

  • (1-2u_{i}B+B^{2})^{d_{i}} represents the long-memory Gegenbauer component (there may in general be k of these),

  • id represents the degree of integer differencing, whereas d_i represents the degree of fractional differencing. Note that id is a value supplied by the user (the second number in the 'order=' parameter, similar to the way the base R 'arima' function works), whereas d_i is estimated by this function.

  • X_{t} represents the observed process,

  • \epsilon_{t} represents the random component of the model. These are assumed to be uncorrelated, identically distributed variates. Generally the routines in this package will work best if these have an approximately Gaussian distribution.

  • B represents the Backshift operator, defined by B X_{t}=X_{t-1}.

When k=0, this is just a short memory model as fit by the stats "arima" function.
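
To make the long-memory Gegenbauer component more concrete, the sketch below computes the leading coefficients of the power-series expansion of (1-2uB+B^{2})^{-d} using the standard three-term recurrence for Gegenbauer polynomials. The function name gegenbauer_coef is hypothetical and is not part of the garma package. The link u = cos(2*pi/period) between u and the period of the cycle is the usual one for Gegenbauer processes and is assumed here rather than taken from this page.

```r
# Coefficients C_0, ..., C_n in (1 - 2uB + B^2)^(-d) = sum_j C_j B^j,
# via the recurrence j*C_j = 2u(j + d - 1)*C_{j-1} - (j + 2d - 2)*C_{j-2}
gegenbauer_coef <- function(n, d, u) {
  stopifnot(n >= 1)
  coef <- numeric(n + 1)
  coef[1] <- 1            # C_0
  coef[2] <- 2 * d * u    # C_1
  if (n >= 2) {
    for (j in 2:n) {
      coef[j + 1] <- (2 * u * (j + d - 1) * coef[j] -
                        (j + 2 * d - 2) * coef[j - 1]) / j
    }
  }
  coef
}

# Example: a cycle of period 12 with fractional differencing d = 0.3 (assumed values)
u12 <- cos(2 * pi / 12)
psi <- gegenbauer_coef(50, d = 0.3, u = u12)
head(psi)
```

With u = 1 the factor reduces to (1-B)^{-2d}, which gives a convenient check of the recurrence against the ordinary fractional differencing weights.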

Value

An S3 object of class "garma_model".

Author(s)

Richard Hunt

References

C Chung. A generalized fractionally integrated autoregressive moving-average process. Journal of Time Series Analysis, 17(2):111-140, 1996. DOI: https://doi.org/10.1111/j.1467-9892.1996.tb00268.x

L Giraitis, J Hidalgo, and P Robinson. Gaussian estimation of parametric spectral density with unknown pole. The Annals of Statistics, 29(4):987–1023, 2001. DOI: https://doi.org/10.1214/AOS/1013699989

R Hunt, S Peiris, N Weber. A General Frequency Domain Estimation Method for Gegenbauer Processes. Journal of Time Series Econometrics, 13(2):119-144, 2021. DOI: https://doi.org/10.1515/jtse-2019-0031

R Hunt, S Peiris, N Weber. Estimation methods for stationary Gegenbauer processes. Statistical Papers 63:1707-1741, 2022. DOI: https://doi.org/10.1007/s00362-022-01290-3

P Robinson. Conditional-sum-of-squares estimation of models for stationary time series with long memory. IMS Lecture Notes Monograph Series, Time Series and Related Topics, 52:130-137, 2006. DOI: https://doi.org/10.1214/074921706000000996

Examples

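library(garma)
data(AirPassengers)
# Seasonally difference the monthly series at lag 12 and drop the ts attributes
ap <- as.numeric(diff(AirPassengers, 12))
# k = 0 means no Gegenbauer terms, i.e. a short memory model
print(garma(ap, order = c(9, 1, 0), k = 0, method = "CSS", include.mean = FALSE))
# Compare with the built-in arima function
print(arima(ap, order = c(9, 1, 0), include.mean = FALSE))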

garma documentation built on April 4, 2025, 2:13 a.m.