gets.mean: General-to-Specific (GETS) Modelling of an AR-X model with...
In AutoSEARCH: General-to-Specific (GETS) Modelling

Description Usage Arguments Details Value Author(s) References See Also Examples

The starting model is referred to as the General Unrestricted Model (GUM). The gets.mean function undertakes multi-path GETS model selection of the mean specification, whereas gets.vol does the same for the log-variance specification.

gets.mean(y, mc = NULL, ar = NULL, ewma = NULL, mx = NULL,
  arch = NULL, asym = NULL, log.ewma = NULL, vx = NULL,
  keep = NULL, p = 2, varcov.mat = c("ordinary", "white"),
  t.pval = 0.05, do.pet = TRUE, wald.pval = 0.05,
  ar.LjungB = c(2, 0.025), arch.LjungB = c(2, 0.025),
  tau = 2, info.method = c("sc", "aic", "hq"),
  info.resids = c("mean", "standardised"),
  include.empty = FALSE, zero.adj = 0.1, vc.adj = TRUE,
  tol = 1e-07, LAPACK = FALSE, max.regs = 1000,
  verbose = TRUE, smpl = NULL, alarm = FALSE)
gets.vol(e, arch=NULL, asym=NULL, log.ewma=NULL, vx=NULL,
  p=2, keep=c(1), t.pval=0.05, wald.pval=0.05, do.pet=TRUE,
  ar.LjungB=c(1, 0.025), arch.LjungB=c(1, 0.025), tau=2,
  info.method=c("sc", "aic", "hq"),
  info.resids=c("standardised", "log-sigma"), include.empty=FALSE,
  zero.adj=0.1, vc.adj=TRUE, tol=1e-07, LAPACK=FALSE, max.regs=1000,
  verbose=TRUE, alarm=FALSE, smpl=NULL)

`y`	numeric vector, time-series or zoo object. Note that missing values in the beginning or at the end of the series is allowed, as they are removed with the na.trim command from the zoo package
`e`	numeric vector, time-series or zoo object. Note that missing values in the beginning or at the end of the series is allowed, as they are removed with the na.trim command from the zoo package
`mc`	logical, TRUE or FALSE (default). TRUE includes intercept in the mean specification, FALSE does not
`ar`	integer vector, say, c(2,4) or 1:4. The AR-lags to include in the specification
`ewma`	either NULL (default) or a list with arguments sent to the eqwma function. In the latter case a lagged moving average of y is included as a regressor
`mx`	numeric matrix, time-series or zoo object of conditioning covariates. Note that missing values in the beginning or at the end of the series is allowed, as they are removed with the na.trim command from the zoo package
`arch`	integer vector, say, c(1,3) or 2:5. The log-ARCH terms to include in the log-volatility specification
`asym`	integer vector, say, c(1) or 1:3. The asymmetry or leverage terms to include in the log-volatility specification
`log.ewma`	NULL (default) or a list. If NULL then log(EWMA) is not included as volatility proxy. If a list, then log(EWMA) is included as a volatility proxy.
`vx`	numeric matrix, time-series or zoo object of conditioning covariates. Note that missing values in the beginning or at the end of the series is allowed, as they are removed with the na.trim command from the zoo package
`keep`	NULL (default) or an integer vector. If keep = NULL, then no regressors are excluded from removal. Otherwise, the regressors associated with the numbers in keep are excluded from the removal space. For example, keep=c(1) excludes the constant from removal. The regressor numbering is contained in the reg.no column of the gum.mean data frame (see below)
`p`	numeric value greater than zero. The power of the log-volatility specification.
`varcov.mat`	character vector, "ordinary" or "white". If "ordinary" then the ordinary variance-covariance matrix is used for inference. Otherwise the White (1980) heteroscedasticity robust matrix is used
`t.pval`	numeric value between 0 and 1. The significance level used for the two-sided regressor significance tests
`do.pet`	logical, TRUE (default) or FALSE. If TRUE then a Parsimonious Encompassing Test (PET) against the GUM is undertaken at each regressor removal for the joint significance of all the deleted regressors along the current path
`wald.pval`	numeric value between 0 and 1. The significance level used for the PETs
`ar.LjungB`	NULL or a two-element vector where the first element contains the order of a Ljung and Box (1979) test for serial correlation in the standardised residuals, and where the second element contains the significance level. If NULL, then the standardised residuals are not checked for serial correlation after each removal. The default is c(2, 0.025)
`arch.LjungB`	NULL or a two-element vector where the first element contains the order of a Ljung and Box (1979) test for ARCH (serial correlation in the squared standardised residuals), and where the second element contains the significance level. If NULL, then the standardised residuals are not checked for ARCH after each removal. The default is c(2, 0.025)
`tau`	NULL or a numeric value greater than 1. If NULL, then the shape parameter in a Generalised Error Distribution (GED) of the standardised residuals is estimated for the log-likelihood used in the calculation of the information criterion. If tau is equal to a numeric value, a GED(tau) is used. Default: tau=2 (i.e. the standard normal density)
`info.method`	character string, "sc" (default), "aic" or "hq", which determines the information criterion used to select among terminal models. The abbreviations are short for the Schwarz or Bayesian information criterion (sc), the Akaike information criterion (aic) and the Hannan-Quinn (hq) information criterion
`info.resids`	character string, "mean" (default) or "standardised" which sets the residuals to be used in the computation of the information criterion
`include.empty`	logical, TRUE or FALSE (default). If TRUE then an empty model is included among the terminal models, if it passes the diagnostic tests, even if it is not equal to one of the terminals
`zero.adj`	numeric value between 0 and 1. The quantile adjustment for zero values. The default 0.1 means that the zero residuals are replaced by means of the 10 percent quantile of the absolute residuals before taking the logarithm
`vc.adj`	logical, TRUE (default) or FALSE. If true then the log-volatility constant is adjusted by means of the estimate of E[log(z^2)]. This adjustment is needed for the standardised residuals to have unit variance. If FALSE then the log-volatility constant is not adjusted
`tol`	numeric value (default = 1e-07). The tolerance for detecting linear dependencies in the columns of the regressors (see qr() function). Only used if LAPACK is FALSE
`LAPACK`	logical, TRUE or FALSE (default). If true use LAPACK otherwise use LINPACK (see qr() function)
`max.regs`	integer value, sets the maximum number of regressions along a deletion path. Default: max.regs=1000
`verbose`	logical, TRUE (default) or FALSE. FALSE returns less output and is therefore faster
`smpl`	Either NULL (default; the whole sample is used for estimation) or a two-element vector of dates with the start and end dates of the sample to be used in estimation. For example, smpl=c("2001-01-01", "2009-12-31")
`alarm`	Logical, either TRUE or FALSE (default). If TRUE, then a sound or beep is emitted when the specification search terminates in order to alert the user

See Sucarrat and Escribano (2012)

A list with a subset of the following:

`volatility.fit`	zoo-object with the fitted values of the volatility (sigma^p) of the final log-volatility specification
`resids.ustar`	zoo-object with the residuals of the AR-representation of the final log-volatility specification
`resids`	zoo-object with the residuals of the final mean specification
`resids.std`	zoo-object with the standardised residuals
`Elogzp`	estimate of E[log(z^p)]
`call`	the function call
`gum.mean`	a data frame with the estimation results of the GUM
`gum.volatility`	a data frame with the estimation results of the log-volatility GUM
`gum.diagnostics`	data frame with selected diagnostics of the GUM
`keep`	if any, the regressors that are excluded from deletion
`insigs.in.gum`	a numeric integer vector with the insignificant regressors of the GUM
`paths`	a list containing the simplification paths, that is, the sequences of deleted regressors
`terminals`	the distinct terminal models
`terminals.results`	the value and type of the information criterion (info) used in selecting among terminal specifications, and the number of observations (T) and parameters (k) used in the calculation of the information criterion
`specific.mean`	data frame with the estimation results of the final mean specification
`specific.volatility`	data frame with the estimation results of the final log-volatility specification
`specific.diagnostics`	data frame with selected diagnostics of the standardised residuals

Genaro Sucarrat (http://www.sucarrat.net/)

Genaro Sucarrat and Alvaro Escribano (2012): 'Automated Financial Model Selection: General-to-Specific Modelling of the Mean and Volatility Specifications', Oxford Bulletin of Economics and Statistics 74, Issue no. 5 (October), pp. 716-735

G. Ljung and G. Box (1979): 'On a Measure of Lack of Fit in Time Series Models'. Biometrika 66, pp. 265-270

AutoSEARCH package: sm
gets package: arx, getsm, getsv, isat

#Generate AR(1) model and four independent normal regressors:
set.seed(123)
y <- arima.sim(list(ar=0.4), 200)
xregs <- matrix(rnorm(4*200), 200, 4)

#General-to-Specific model selection of the mean:
mymodel <- gets.mean(y, mc=TRUE, ar=1:5, mx=xregs)

#General-to-Specific model selection of the mean
#with the intercept excluded from removal:
mymodel <- gets.mean(y, mc=TRUE, ar=1:5, mx=xregs, keep=1)

#General-to-Specific model selection of the mean
#with no intercept and with a log-ARCH(4) specification
#in the log-volatility using the standardised residuals
#when computing the log-likelihood for the information
#criterion:
mymodel <- gets.mean(y, mc=FALSE, ar=1:5, mx=xregs, arch=1:4,
  info.resids="standardised")

#General-to-Specific model selection of the mean with
#non-default serial-correlation diagnostics settings:
mymodel <- gets.mean(y, mc=TRUE, ar=1:5, mx=xregs,
  ar.LjungB=c(6, 0.05))

#General-to-Specific model selection of the mean with
#very liberal (i.e. 20 percent) significance levels (20 percent):
mymodel <- gets.mean(y, mc=TRUE, ar=1:5, mx=xregs, t.pval=0.2,
  wald.pval=0.2)
  
#Generate iid normal residuals and a matrix of independent
#normals:
set.seed(123)
e <- rnorm(200)
xregs <- matrix(rnorm(4*200), 200, 4)

#General-to-Specific model selection of log-volatility:
mymodel <- gets.vol(e, arch=1:5, vx=log(xregs^2))

#General-to-Specific model selection of log-volatility
#with the log-ARCH(1) term excluded from removal:
mymodel <- gets.vol(e, arch=1:5, vx=log(xregs^2), keep=2)

#General-to-Specific model selection of log-volatility
#with all the log-ARCH terms excluded from removal:
mymodel <- gets.vol(e, arch=1:5, vx=log(xregs^2), asym=1:2,
  log.ewma=list(length=5), keep=2:6)

#If e is a daily (weekends excluded) financial return series,
#then the following specification includes a lagged volatility
#proxy both for the week (5-day average of squared return) and
#for the month (20-day average of squared returns), in addition
#to five log-ARCH terms:
mymodel <- gets.vol(e, arch=1:5, log.ewma=list(length=c(5,20)) )

#General-to-Specific model selection with very liberal
#(20 percent) significance levels:
mymodel <- gets.vol(e, arch=1:5, vx=log(xregs^2), t.pval=0.2,
  wald.pval=0.2)