ccf_boot: Cross-Correlation of Autocorrelated Time Series

View source: R/ccf_boot.R

ccf_bootR Documentation

Cross-Correlation of Autocorrelated Time Series

Description

Account for possible autocorrelation of time series when assessing the statistical significance of their cross-correlation. A sieve bootstrap approach is used to generate multiple copies of the time series with the same autoregressive dependence, under the null hypothesis of the two time series under investigation being uncorrelated. The significance of cross-correlation coefficients is assessed based on the distribution of their bootstrapped counterparts. Both Pearson and Spearman types of coefficients are obtained, but a plot is provided for only one type, with significant correlations shown using filled circles (see Examples).

Usage

ccf_boot(
  x,
  y,
  lag.max = NULL,
  plot = c("Pearson", "Spearman", "none"),
  level = 0.95,
  B = 1000,
  smooth = FALSE,
  cl = 1L,
  ...
)

Arguments

x, y

univariate numeric time-series objects or numeric vectors for which to compute cross-correlation. Different time attributes in ts objects are acknowledged, see Example 2 below.

lag.max

maximum lag at which to calculate the cross-correlation. Will be automatically limited as in ccf.

plot

choose whether to plot results for Pearson correlation (default, or use plot = "Pearson"), Spearman correlation (use plot = "Spearman"), or suppress plotting (use plot = "none"). Both Pearson's and Spearman's results are given in the output, regardless of the plot setting.

level

confidence level, from 0 to 1. Default is 0.95, that is, 95% confidence.

B

number of bootstrap simulations to obtain empirical critical values. Default is 1000.

smooth

logical value indicating whether the bootstrap confidence bands should be smoothed across lags. Default is FALSE meaning no smoothing.

cl

parameter to specify computer cluster for bootstrapping passed to the package parallel (default cl = 1, means no cluster is used). Possible values are:

  • cluster object (list) produced by makeCluster. In this case, a new cluster is not started nor stopped;

  • NULL. In this case, the function will detect available cores (see detectCores) and, if there are multiple cores (>1), a cluster will be started with makeCluster. If started, the cluster will be stopped after the computations are finished;

  • positive integer defining the number of cores to start a cluster. If cl = 1 (default), no attempt to create a cluster will be made. If cl > 1, a cluster will be started (using makeCluster) and stopped afterward (using stopCluster).

...

other parameters passed to the function ARest to control how autoregressive dependencies are estimated. The same set of parameters is used separately on x and y.

Details

Note that the smoothing of confidence bands is implemented purely for the look. This smoothing is different from the smoothing methods that can be applied to adjust bootstrap performance \insertCiteDeAngelis_Young_1992funtimes. For correlations close to the significance bounds, the setting of smooth might affect the decision on the statistical significance. In this case, it is recommended to keep smooth = FALSE and set a higher B.

Value

A data frame with the following columns:

Lag

lags for which the following values were obtained.

r_P

observed Pearson correlations.

lower_P, upper_P

lower and upper confidence bounds (for the confidence level set by level) for Pearson correlations.

r_S

observed Spearman correlations.

lower_S, upper_S

lower and upper confidence bounds (for the confidence level set by level) for Spearman correlations.

Author(s)

Vyacheslav Lyubchich

See Also

ARest, ar, ccf, HVK

Examples

## Not run: 
# Fix seed for reproducible simulations:
set.seed(1)

# Example 1
# Simulate independent normal time series of same lengths
x <- rnorm(100)
y <- rnorm(100)
# Default CCF with parametric confidence band
ccf(x, y)
# CCF with bootstrap
tmp <- ccf_boot(x, y)
# One can extract results for both Pearson and Spearman correlations
tmp$rP
tmp$rS

# Example 2
# Simulated ts objects of different lengths and starts (incomplete overlap)
x <- arima.sim(list(order = c(1, 0, 0), ar = 0.5), n = 30)
x <- ts(x, start = 2001)
y <- arima.sim(list(order = c(2, 0, 0), ar = c(0.5, 0.2)), n = 40)
y <- ts(y, start = 2020)
# Show how x and y are aligned
ts.plot(x, y, col = 1:2, lty = 1:2)
# The usual CCF
ccf(x, y)
# CCF with bootstrap confidence intervals
ccf_boot(x, y, plot = "Spearman")
# Notice that only +-7 lags can be calculated in both cases because of the small
# overlap of the time series. If we save these time series as plain vectors, the time
# information would be lost, and the time series will be misaligned.
ccf(as.numeric(x), as.numeric(y))

# Example 3
# Box & Jenkins time series of sales and a leading indicator, see ?BJsales
plot.ts(cbind(BJsales.lead, BJsales))
# Each of the BJ time series looks as having a stochastic linear trend, so apply differences
plot.ts(cbind(diff(BJsales.lead), diff(BJsales)))
# Get cross-correlation of the differenced series
ccf_boot(diff(BJsales.lead), diff(BJsales), plot = "Spearman")
# The leading indicator "stands out" with significant correlations at negative lags,
# showing it can be used to predict the sales 2-3 time steps ahead (that is,
# diff(BJsales.lead) at times t-2 and t-3 is strongly correlated with diff(BJsales) at
# current time t).

## End(Not run)


vlyubchich/funtimes documentation built on May 6, 2023, 3:21 a.m.