causality_pred: Out-of-sample Tests of Granger Causality

causality_pred    R Documentation

Description

Test for Granger causality using out-of-sample prediction errors from an autoregressive (AR) model, in which some of the near-contemporaneous lags of the cause variable can be removed:

Y_t = \sum_{i=1}^{p1}\alpha_iY_{t-i} + \sum_{i=lag.restrict+1}^{p2}\beta_iX_{t-i} + e_t,

where Y_t is the dependent variable, X_t is the cause variable, p1 and p2 are the AR orders (if p.free = FALSE, p1 = p2), and lag.restrict is the number of first lags of X_t that are excluded from the model (see the argument lag.restrict).
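The model above can be sketched in base R as a pair of lagged regressions: a restricted one with only the lags of Y, and a full one that adds the lags of X from lag.restrict + 1 to p2. This is only an illustration on simulated data, not the package's internal code; the object names (Ey, Ex, Ylags, Xlags, fit_restricted, fit_full) are hypothetical, and the no-intercept fit simply mirrors the equation as written.

```r
set.seed(1)
n <- 200
x <- as.numeric(arima.sim(list(ar = 0.5), n = n))
y <- numeric(n)
# Simulated data in which X affects Y at lag 2 only
for (i in 3:n) y[i] <- 0.4 * y[i - 1] + 0.3 * x[i - 2] + rnorm(1)

p1 <- 2            # AR order for the dependent variable
p2 <- 3            # AR order for the cause variable
lag.restrict <- 1  # drop the first lag of X

m  <- max(p1, p2)
Ey <- embed(y, m + 1)  # column 1 is Y_t, column j + 1 is Y_{t-j}
Ex <- embed(x, m + 1)  # column 1 is X_t, column j + 1 is X_{t-j}

Y     <- Ey[, 1]
Ylags <- Ey[, 1 + 1:p1, drop = FALSE]
Xlags <- Ex[, 1 + (lag.restrict + 1):p2, drop = FALSE]

fit_restricted <- lm(Y ~ Ylags - 1)          # no X terms (null model)
fit_full       <- lm(Y ~ Ylags + Xlags - 1)  # adds X lags 2..p2
coef(fit_full)
```

The full model here has p1 + (p2 - lag.restrict) = 4 coefficients.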

Usage

causality_pred(
  y,
  cause = NULL,
  p = NULL,
  p.free = FALSE,
  lag.restrict = 0L,
  lag.max = NULL,
  k = 2,
  B = 500L,
  test = 0.3,
  cl = 1L
)

Arguments

y

matrix, data frame, or ts object with two columns (a dependent and an explanatory time-series variable). Missing values are not allowed.

cause

name of the cause variable. If not specified, the first variable in y is treated as the dependent variable and the second is treated as the cause.

p

a vector of one or two positive integers specifying the order p of autoregressive dependence. An input of length one is recycled; p[1] is then used for the dependent variable and p[2] for the cause variable. The user must specify p or lag.max. If lag.max is specified, the argument p is ignored.

p.free

logical value indicating whether the autoregressive orders for the dependent and cause variables should be selected independently. The default p.free = FALSE means the same autoregressive order is selected for both variables. Note that if p.free = TRUE and lag.max is specified, then lag.max[1] * (lag.max[2] - lag.restrict) models are compared, which might be slow depending on the maximal lags and sample size.
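As a quick check of the model count quoted above, the candidate orders can be enumerated as a grid. This sketch assumes that p1 ranges over 1, ..., lag.max[1] and p2 over lag.restrict + 1, ..., lag.max[2], which is consistent with the stated product but is not taken from the package source; the name grid is hypothetical.

```r
lag.max      <- c(5, 15)
lag.restrict <- 3

# All (p1, p2) combinations under the assumed ranges
grid <- expand.grid(p1 = 1:lag.max[1],
                    p2 = (lag.restrict + 1):lag.max[2])
nrow(grid)  # 60 = 5 * (15 - 3)
```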

lag.restrict

integer for the number of short-term lags in the cause variable to remove from consideration (default is zero, meaning no lags are removed). This setting does not affect the lags of the dependent variable, which are always present.

lag.max

a vector of one or two positive integers for the highest lag orders to explore. An input of length one is recycled; lag.max[1] is then used for the dependent variable and lag.max[2] for the cause variable. The order is then selected using the Akaike information criterion (AIC; default); see the argument k to change the criterion. lag.max of length 2 automatically sets p.free = TRUE.

k

numeric scalar specifying the weight of the equivalent degrees of freedom part in the AIC formula. Default k = 2 corresponds to the traditional AIC. Use k = log(n) to use the Bayesian information criterion instead (see extractAIC).
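The effect of k can be seen directly with extractAIC on any fitted linear model. The sketch below uses a simple AR(1) regression on the BJsales series purely for illustration; the names fit, aic, and bic are hypothetical, and n is taken as the number of observations used in the fit.

```r
y   <- as.numeric(BJsales)
fit <- lm(y[-1] ~ y[-length(y)])  # AR(1) regression with an intercept
n   <- length(y) - 1              # observations used in the fit

aic <- extractAIC(fit, k = 2)       # c(edf, AIC)
bic <- extractAIC(fit, k = log(n))  # c(edf, BIC)

# The two criteria differ only in the penalty: edf * (log(n) - 2)
bic[2] - aic[2]
```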

B

number of bootstrap replications. Default is 500.

test

a numeric value specifying the size of the testing set. If test < 1, the value is treated as a proportion of the sample size to be used as the testing set. Otherwise, test is treated as the number of the most recent values to be used as the testing set. Default is 0.3, which means that 30% of the sample is used for calculating out-of-sample errors. The testing set is always at the end of the time series.
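The rule for test can be illustrated with a small helper. This is a sketch under assumptions: split_test is a hypothetical name, and rounding the proportion to the nearest integer is a guess at how a fractional size would be handled.

```r
split_test <- function(n, test = 0.3) {
  # Proportion if test < 1, otherwise a count of the most recent values
  n_test <- if (test < 1) round(test * n) else as.integer(test)
  list(train = seq_len(n - n_test),
       test  = (n - n_test + 1):n)  # testing set is at the end of the series
}

s <- split_test(150, test = 0.3)
range(s$test)                          # 106 150
length(split_test(150, test = 20)$test)  # 20
```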

cl

parameter specifying the computer cluster used for bootstrapping, passed to the package parallel (default cl = 1 means that no cluster is used). Possible values are:

  • cluster object (list) produced by makeCluster. In this case, a new cluster is neither started nor stopped;

  • NULL. In this case, the function will detect available cores (see detectCores) and, if there are multiple cores (>1), a cluster will be started with makeCluster. If started, the cluster will be stopped after the computations are finished;

  • positive integer defining the number of cores to start a cluster. If cl = 1 (default), no attempt to create a cluster will be made. If cl > 1, a cluster will be started (using makeCluster) and stopped afterward (using stopCluster).

Details

The tests include the MSE-t approach (McCracken 2007) and the MSE-correlation test as in Chapter 9.3 of Granger and Newbold (2016). The bootstrap is used to empirically derive distributions of the statistics.
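For orientation, the two statistics can be sketched from vectors of out-of-sample forecast errors of the restricted model (e_r) and the full model (e_f). The forms below are common textbook versions, assumed here for illustration; the exact scaling and variance estimator used by the package may differ, and oos_stats is a hypothetical helper.

```r
oos_stats <- function(e_r, e_f) {
  d <- e_r^2 - e_f^2  # loss differential: positive if the full model is better
  P <- length(d)      # number of out-of-sample forecasts
  c(MSEt   = sqrt(P) * mean(d) / sd(d),    # t-type statistic on the MSE difference
    MSEcor = cor(e_r - e_f, e_r + e_f))    # correlation of difference and sum
}

# Toy errors in which the restricted model has larger squared errors
e_f <- c(1, -1, 1, -1)
e_r <- c(2, -2, 1, -3)
oos_stats(e_r, e_f)
```

Under both forms, large positive values point toward the full model forecasting better, that is, toward Granger causality.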

In the implemented bootstrapping, residuals of the restricted model under the null hypothesis of no Granger causality are bootstrapped to generate new data under the null hypothesis. Then, the full and restricted models are re-estimated on the bootstrapped data to obtain new (bootstrapped) forecast errors.
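A minimal sketch of one such bootstrap draw, assuming an AR(p) restricted model without intercept, centered residuals, and the observed first p values used as starting values. These details and the names fit0, res, and boot_y are illustrative assumptions rather than the package's implementation.

```r
set.seed(123)
n <- 120
y <- as.numeric(arima.sim(list(ar = 0.6), n = n))
p <- 1

# Restricted model: Y regressed on its own lags only (null of no causality)
E     <- embed(y, p + 1)
Y     <- E[, 1]
Ylags <- E[, -1, drop = FALSE]
fit0  <- lm(Y ~ Ylags - 1)
a     <- coef(fit0)
res   <- residuals(fit0) - mean(residuals(fit0))  # centered residuals

boot_y <- function() {
  e  <- sample(res, n, replace = TRUE)  # resample residuals with replacement
  yb <- y                               # first p values serve as starting values
  for (i in (p + 1):n) {
    yb[i] <- sum(a * yb[i - 1:p]) + e[i]  # recursion under the null model
  }
  yb
}

yb <- boot_y()  # one bootstrapped series generated under the null
```

In a full procedure, the restricted and full models would then be refitted to each bootstrapped series (with the cause variable) to obtain bootstrapped forecast errors and statistics.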

In the current implementation, the bootstrapped p-value is calculated using Equation 4.10 in Davison and Hinkley (1997): p.value = (1 + n) / (B + 1), where n is the number of bootstrapped statistics smaller than or equal to the observed statistic.
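The p-value formula translates directly into a few lines of R. The helper name boot_pvalue and the toy values are hypothetical; the counting rule follows the text above literally.

```r
boot_pvalue <- function(stat_obs, stat_boot) {
  n <- sum(stat_boot <= stat_obs)     # bootstrapped statistics <= observed
  (1 + n) / (length(stat_boot) + 1)   # (1 + n) / (B + 1)
}

# Toy example with B = 9 bootstrapped statistics, three of them <= -1.2
boot_pvalue(-1.2, c(-2, -1.5, 0.1, 0.4, 1.3, -0.2, 0.8, 2.1, -1.3))  # 0.4
```

Adding one to both the numerator and the denominator keeps the p-value strictly positive, so it can never be reported as exactly zero.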

This function tests for Granger causality in one direction at a time, either from X to Y or from Y to X. To test both directions, run the function twice with different values of the argument cause. To use the symmetric vector autoregression (VAR), use the function causality_predVAR.

Value

A list containing the following elements:

stat

a table with the observed values of the test statistics and p-values.

cause

the cause variable.

p

the AR orders used for the dependent variable (p[1]) and for the cause variable (p[2]).

Author(s)

Vyacheslav Lyubchich

References

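Davison AC, Hinkley DV (1997). Bootstrap Methods and Their Application. Cambridge University Press, Cambridge.

Granger CWJ, Newbold P (2016). Forecasting Economic Time Series, 2nd edition. Academic Press.

McCracken MW (2007). "Asymptotics for out of sample tests of Granger causality." Journal of Econometrics, 140(2), 719-752.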

See Also

causality_predVAR

Examples

## Not run: 
# Example 1: Canada time series (ts object)
Canada <- vars::Canada
causality_pred(Canada[,1:2], cause = "e", lag.max = 5, p.free = TRUE)
causality_pred(Canada[,1:2], cause = "e", lag.restrict = 3, lag.max = 15, p.free = TRUE)

# Example 2 (run in parallel, initiate the cluster automatically)
# Box & Jenkins time series
# of sales and a leading indicator, see ?BJsales

D <- cbind(BJsales.lead, BJsales)
causality_pred(D, cause = "BJsales.lead", lag.max = 5, B = 1000, cl = NULL)

# Example 3 (run in parallel, initiate the cluster manually)

# Initiate a local cluster
cores <- parallel::detectCores()
cl <- parallel::makeCluster(cores)
parallel::clusterSetRNGStream(cl, 123) # to make parallel computations reproducible

causality_pred(D, cause = "BJsales.lead", lag.max = 5, B = 1000, cl = cl)
causality_pred(D, cause = "BJsales.lead", lag.restrict = 3, p = 5, B = 1000, cl = cl)
parallel::stopCluster(cl)

## End(Not run)


funtimes documentation built on March 31, 2023, 7:35 p.m.