lrv: Long Run Variance

lrv R Documentation

Long Run Variance

Description

Estimates the long run variance (for a univariate time series) or the long run covariance matrix (for a multivariate time series) of the supplied data.

Usage

lrv(x, method = c("kernel", "subsampling", "bootstrap", "none"), control = list())

Arguments

x

vector or matrix with each column representing a time series (numeric).

method

method of estimation. Options are "kernel", "subsampling", "bootstrap" and "none".

control

a list of control parameters. See 'Details'.

Details

The long run variance equals the limit of n times the variance of the arithmetic mean of a short range dependent time series, where n is the length of the time series. It is used to standardize tests concerning the mean of dependent data.
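
The definition above can be illustrated with a small sketch. It assumes a stationary AR(1) process with coefficient phi and unit innovation variance, which is only an illustrative choice and not part of robcp; the helper name n_var_mean is hypothetical. For this process the autocovariances are known in closed form, so n times the variance of the mean can be computed exactly and compared with its limit 1 / (1 - phi)^2.

```r
# n * Var(mean) for a stationary AR(1) process x_t = phi * x_{t-1} + e_t
# with Var(e_t) = 1 (illustrative assumption, not part of robcp).
n_var_mean <- function(n, phi) {
  gamma0 <- 1 / (1 - phi^2)        # ordinary variance (autocovariance at lag 0)
  h <- seq_len(n - 1)
  gamma_h <- gamma0 * phi^h        # autocovariance at lag h
  gamma0 + 2 * sum((1 - h / n) * gamma_h)
}

phi <- 0.5
lrv_limit <- 1 / (1 - phi)^2       # long run variance of this AR(1) process

# n * Var(mean) approaches lrv_limit as n grows ...
sapply(c(10, 100, 10000), n_var_mean, phi = phi)
# ... while the ordinary variance stays well below it.
1 / (1 - phi^2)
```

With positive dependence the ordinary variance understates the variability of the mean, which is why test statistics on dependent data are standardized with the long run variance instead.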

If method = "none", no long run variance estimation is performed and the value 1 is returned (i.e. the test statistic is not altered).

The control argument is a list that can supply any of the following components:

kFun

Kernel function (character string). More in 'Notes'.

b_n

Bandwidth (numeric > 0 and smaller than sample size).

gamma0

Replace a negative long run variance estimate by the estimated variance (autocovariance at lag 0)? Boolean.

l

Block length (numeric > 0 and smaller than sample size).

overlapping

Overlapping subsampling estimation? Boolean.

distr

Transform observations by their empirical distribution function? Boolean. Default is FALSE.

B

Bootstrap repetitions (integer).

seed

RNG seed (numeric).

version

What property does the CUSUM test test for? Character string, details below.

loc

Estimated location corresponding to version. Numeric value, details below.

scale

Estimated scale corresponding to version. Numeric value, details below.

Kernel-based estimation

Kernel-based long run variance estimation is available for several testing scenarios (set by control$version) and for both one- and multi-dimensional data. It uses the bandwidth b_n = control$b_n and the kernel function k(x) = control$kFun. For tests on certain properties, a corresponding location estimate control$loc (m_n) and scale estimate control$scale (v_n) are also needed; their default values are listed with each scenario. Supported testing scenarios are:

  • "mean"

    • 1-dim. data:

      \hat{σ}^2 = \frac{1}{n} ∑_{i = 1}^n (x_i - \bar{x})^2 + \frac{2}{n} ∑_{h = 1}^{b_n} ∑_{i = 1}^{n - h} (x_i - \bar{x}) (x_{i + h} - \bar{x}) k(h / b_n).

      If control$distr = TRUE, the long run variance is estimated on the empirical distribution of x. The resulting value is then multiplied by √{π} / 2.

      Default values: b_n = 0.9 n^{1/3}, kFun = "bartlett".

    • multivariate time series: The k,l-element of Σ is estimated by

      \hat{Σ}^{(k,l)} = \frac{1}{n} ∑_{i,j = 1}^{n}(x_i^{(k)} - \bar{x}^{(k)}) (x_j^{(l)} - \bar{x}^{(l)}) k((i-j) / b_n),

      k, l = 1, ..., m.

      Default values: b_n = \log_{1.8 + m / 40}(n / 50), kFun = "bartlett".

  • "empVar" for tests on changes in the empirical variance.

    \hat{σ}^2 = ∑_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} ∑_{i = 1}^{n - |h|} ((x_i - m_n)^2 - v_n)((x_{i+|h|} - m_n)^2 - v_n).

    Default values: m_n = mean(x), v_n = var(x).

  • "MD" for tests on a change in the median deviation.

    \hat{σ}^2 = ∑_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} ∑_{i = 1}^{n - |h|} (|x_i - m_n| - v_n)(|x_{i+|h|} - m_n| - v_n).

    Default values: m_n = median(x), v_n = \frac{1}{n-1} ∑_{i = 1}^n |x_i - m_n|.

  • "GMD" for tests on changes in Gini's mean difference.

    \hat{σ}^2 = 4 ∑_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} ∑_{i = 1}^{n - |h|} \hat{φ}_n(x_i)\hat{φ}_n(x_{i+|h|})

    with \hat{φ}_n(x) = n^{-1} ∑_{i = 1}^n |x - x_i| - v_n.

    Default value: v_n = \frac{2}{n(n-1)} ∑_{1 ≤ i < j ≤ n} |x_i - x_j|.

  • "Qalpha" for tests on changes in Qalpha.

    \hat{σ}^2 = \frac{4}{\hat{u}(v_n)} ∑_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} ∑_{i = 1}^{n - |h|} \hat{φ}_n(x_i)\hat{φ}_n(x_{i+|h|}),

    where \hat{φ}_n(x) = n^{-1} ∑_{i = 1}^n 1_{\{|x - x_i| ≤ v_n\}} - m_n and

    \hat{u}(t) = \frac{2}{n(n-1)h_n} ∑_{1 ≤ i < j ≤ n} K\left(\frac{|x_i - x_j| - t}{h_n}\right)

    is the kernel density estimate of the density u corresponding to the distribution function U(t) = P(|X-Y| ≤ t), with h_n = IQR(x)n^{-\frac{1}{3}} and K the quatratic kernel function.

    Default values: m_n = α = 0.5, v_n = Qalpha(x, m_n)[n-1].

  • "tau" for tests on changes in Kendall's tau.

    Only available for bivariate data: assume that the given data x has the format (x_i, y_i)_{i = 1, ..., n}.

    \hat{σ}^2 = ∑_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} ∑_{i = 1}^{n - |h|} \hat{φ}_n((x_i, y_i))\hat{φ}_n((x_{i+|h|}, y_{i+|h|})),

    where \hat{φ}_n(x, y) = 4 F_n(x, y) - 2 F_{X,n}(x) - 2 F_{Y,n}(y) + 1 - v_n and F_n, F_{X,n} and F_{Y,n} are the empirical distribution functions of ((X_i, Y_i))_{i = 1, ..., n}, (X_i)_{i = 1, ..., n} and (Y_i)_{i = 1, ..., n}.

    Default value: v_n = \hat{τ}_n = \frac{2}{n(n-1)} ∑_{1 ≤ i < j ≤ n} sign\left((x_j - x_i)(y_j - y_i)\right).

  • "rho" for tests on changes in Spearman's rho.

    Only available for d-variate data with d > 1: assume that the given data x has the format (x_{i,j} | i = 1, ..., n; j = 1, ..., d).

    \hat{σ}^2 = a(d)^2 2^{2d} \left\{ ∑_{h = -(n-1)}^{n-1} K\left( \frac{|h|}{b_n} \right) \left( ∑_{i = 1}^{n-|h|} n^{-1} ∏_{j = 1}^d \hat{φ}_n(x_i, x_j) \hat{φ}_n(x_{i+|h|}, x_j) - M^2 \right) \right\} ,

    where a(d) = (d+1) / (2^d - d - 1), M = n^{-1} ∑_{i = 1}^n ∏_{j = 1}^d \hat{φ}_n(x_i, x_j) and \hat{φ}_n(x, y) = 1 - \hat{U}_n(x, y), \hat{U}_n(x, y) = n^{-1} (rank of x_{i,j} in x_{i,1}, ..., x_{i,n}).

When control$gamma0 = TRUE (default) then negative estimates of the long run variance are replaced by the autocovariance at lag 0 (= ordinary variance of the data). The function will then throw a warning.
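
The "mean" scenario and the gamma0 fallback described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the robcp implementation: the function names are hypothetical, the lag sum is taken over integer lags h up to floor(b_n), and only the Bartlett kernel is shown. The second function writes the multivariate formula in matrix form, (1/n) Xc' K Xc with K[i, j] = k((i - j) / b_n), which for a single column gives the same value as the univariate formula.

```r
bartlett <- function(x) (1 - abs(x)) * (abs(x) < 1)

# Univariate kernel estimator for version "mean" (sketch only).
lrv_kernel_sketch <- function(x, b_n = 0.9 * length(x)^(1/3),
                              kFun = bartlett, gamma0 = TRUE) {
  n <- length(x)
  xc <- x - mean(x)
  gamma_0 <- sum(xc^2) / n              # autocovariance at lag 0
  est <- gamma_0
  H <- min(floor(b_n), n - 1)           # assumption: integer lags h <= b_n
  for (h in seq_len(H)) {
    est <- est + 2 / n * sum(xc[1:(n - h)] * xc[(h + 1):n]) * kFun(h / b_n)
  }
  if (gamma0 && est < 0) {
    warning("negative estimate replaced by the autocovariance at lag 0")
    est <- gamma_0
  }
  est
}

# Matrix form of the multivariate estimator (sketch only).
lrv_kernel_matrix_sketch <- function(X, b_n, kFun = bartlett) {
  X <- as.matrix(X)
  n <- nrow(X)
  Xc <- sweep(X, 2, colMeans(X))            # centre each column
  K <- kFun(outer(1:n, 1:n, "-") / b_n)     # kernel weights k((i - j) / b_n)
  crossprod(Xc, K %*% Xc) / n
}

x <- c(1, 2, 3, 4)
lrv_kernel_sketch(x, b_n = 2)
lrv_kernel_matrix_sketch(x, b_n = 2)        # same value as a 1 x 1 matrix
```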

Subsampling estimation

For method = "subsampling" there is an overlapping and a non-overlapping version (parameter control$overlapping). The parameter control$distr specifies whether the observations x are transformed by their empirical distribution function F_n before estimation, and control$l sets the block length l.

If control$overlapping = TRUE and control$distr = TRUE:

\hat{σ}_n = \frac{√{π}}{√{2l}(n - l + 1)} ∑_{i = 0}^{n-l} \left| ∑_{j = i+1}^{i+l} (F_n(x_j) - 0.5) \right|.

Otherwise, if control$distr = FALSE, the estimator is

\hat{σ}^2 = \frac{1}{l (n - l + 1)} ∑_{i = 0}^{n-l} \left( ∑_{j = i + 1}^{i+l} x_j - \frac{l}{n} ∑_{j = 1}^n x_j \right)^2.

If control$overlapping = FALSE and control$distr = TRUE:

\hat{σ} = \frac{1}{n/l} √{π/2} ∑_{i = 1}^{n/l} \frac{1}{√{l}} \left| ∑_{j = (i-1)l + 1}^{il} F_n(x_j) - \frac{l}{n} ∑_{j = 1}^n F_n(x_j) \right|.

Otherwise, if control$distr = FALSE, the estimator is

\hat{σ}^2 = \frac{1}{n/l} ∑_{i = 1}^{n/l} \frac{1}{l} \left(∑_{j = (i-1)l + 1}^{il} x_j - \frac{l}{n} ∑_{j = 1}^n x_j\right)^2.

Default values: overlapping = TRUE, the block length is chosen adaptively:

l_n = \max\left\{ \left\lceil n^{1/3} \left( \frac{2 ρ}{1 - ρ^2} \right)^{2/3} \right\rceil, 1 \right\}

where ρ is the Spearman autocorrelation at lag 1.
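
The two estimators for control$distr = FALSE can be sketched in base R as follows. This is an illustration of the formulas above rather than the robcp code: the function names are hypothetical, the block length l is passed explicitly instead of being chosen adaptively, and the non-overlapping version assumes that an incomplete last block is simply dropped when n is not a multiple of l.

```r
# Overlapping subsampling: all n - l + 1 windows of length l.
lrv_subs_overlap_sketch <- function(x, l) {
  n <- length(x)
  cs <- c(0, cumsum(x))
  # sum of x_{i+1}, ..., x_{i+l} for i = 0, ..., n - l
  block_sums <- cs[(l + 1):(n + 1)] - cs[1:(n - l + 1)]
  sum((block_sums - l * mean(x))^2) / (l * (n - l + 1))
}

# Non-overlapping subsampling: n / l disjoint blocks of length l.
lrv_subs_disjoint_sketch <- function(x, l) {
  n <- length(x)
  k <- n %/% l                     # assumption: incomplete last block is dropped
  block_sums <- colSums(matrix(x[seq_len(k * l)], nrow = l))
  sum((block_sums - l * mean(x))^2) / (k * l)
}

x <- c(1, 2, 3, 4)
lrv_subs_overlap_sketch(x, l = 2)
lrv_subs_disjoint_sketch(x, l = 2)
```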

Bootstrap estimation

If method = "bootstrap" a dependent wild bootstrap with the parameters B = control$B, l = control$l and k(x) = control$kFun is performed:

\hat{σ}^2 = √{n} Var(\bar{x^*_k} - \bar{x}), k = 1, ..., B

A single bootstrap observation x_{ik}^* is generated by x_i^* = \bar{x} + (x_i - \bar{x}) a_i, where the a_i are independent of the data x and are drawn from a multivariate normal distribution with E(A_i) = 0, Var(A_i) = 1 and Cov(A_i, A_j) = k\left(\frac{i - j}{l}\right), i = 1, ..., n; j \neq i. A seed can optionally be specified via control$seed (cf. set.seed). Only "bartlett", "parzen" and "QS" are supported as kernel functions. The function sqrtm from the package pracma is used.

Default values: B = 1000, kFun = "bartlett", l is the same as for subsampling.
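
How a single bootstrap series is generated can be sketched in base R. The sketch below is not the robcp implementation: the function names are hypothetical, the matrix square root is computed with eigen() instead of pracma's sqrtm, and only the Bartlett kernel is shown. It covers the multiplier step only; repeating it B times and taking the variance of the bootstrap means is left out.

```r
bartlett <- function(x) (1 - abs(x)) * (abs(x) < 1)

# Covariance matrix of the multipliers: Cov(A_i, A_j) = k((i - j) / l).
dwb_cov_sketch <- function(n, l, kFun = bartlett) {
  kFun(outer(1:n, 1:n, "-") / l)
}

# One dependent wild bootstrap series x* = mean(x) + (x - mean(x)) * a.
dwb_sample_sketch <- function(x, l, kFun = bartlett) {
  n <- length(x)
  S <- dwb_cov_sketch(n, l, kFun)
  e <- eigen(S, symmetric = TRUE)
  # symmetric square root of S; tiny negative eigenvalues are clipped to 0
  root <- e$vectors %*% diag(sqrt(pmax(e$values, 0)), n) %*% t(e$vectors)
  a <- as.vector(root %*% rnorm(n))   # multipliers with covariance matrix S
  mean(x) + (x - mean(x)) * a
}

set.seed(1)
x <- c(rnorm(20), rnorm(20, 2))
x_star <- dwb_sample_sketch(x, l = 4)
length(x_star)
```

Neighbouring multipliers are positively correlated, so the bootstrap series keeps part of the local dependence structure of the original observations.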

Value

Long run variance σ^2 (numeric) or, for multivariate data, long run covariance matrix Σ (numeric matrix).

Note

Kernel functions

bartlett:

k(x) = (1 - |x|) * 1_{\{|x| < 1\}}

FT:

k(x) = 1 * 1_{\{|x| ≤ 0.5\}} + (2 - 2 * |x|) * 1_{\{0.5 < |x| < 1\}}

parzen:

k(x) = (1 - 6x^2 + 6|x|^3) * 1_{\{0 ≤ |x| ≤ 0.5\}} + 2(1 - |x|)^3 * 1_{\{0.5 < |x| ≤ 1\}}

QS:

k(x) = \frac{25}{12 π ^2 x^2} \left(\frac{\sin(6π x / 5)}{6π x / 5} - \cos(6 π x / 5)\right)

TH:

k(x) = (1 + \cos(π x)) / 2 * 1_{\{|x| < 1\}}

truncated:

k(x) = 1_{\{|x| < 1\}}

SFT:

k(x) = (1 - 4(|x| - 0.5)^2)^2 * 1_{\{|x| < 1\}}

Epanechnikov:

k(x) = 3 \frac{1 - x^2}{4} * 1_{\{|x| < 1\}}

quatratic:

k(x) = (1 - x^2)^2 * 1_{\{|x| < 1\}}
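
A few of the kernels listed above, written as vectorised base R functions. The function names are hypothetical and chosen only for this sketch (lrv itself selects kernels by the character strings given in control$kFun). The QS formula is undefined at x = 0, so its limit 1 is set explicitly there.

```r
bartlett <- function(x) (1 - abs(x)) * (abs(x) < 1)

parzen <- function(x) {
  ax <- abs(x)
  (1 - 6 * ax^2 + 6 * ax^3) * (ax <= 0.5) +
    2 * (1 - ax)^3 * (ax > 0.5 & ax <= 1)
}

tukey_hanning <- function(x) (1 + cos(pi * x)) / 2 * (abs(x) < 1)

quadratic_spectral <- function(x) {
  z <- 6 * pi * x / 5
  out <- 25 / (12 * pi^2 * x^2) * (sin(z) / z - cos(z))
  out[x == 0] <- 1                 # limit of the expression as x -> 0
  out
}

# Kernel weights at a few points; all four kernels equal 1 at x = 0.
grid <- c(0, 0.25, 0.5, 1)
kernels <- list(bartlett = bartlett, parzen = parzen,
                TH = tukey_hanning, QS = quadratic_spectral)
round(sapply(kernels, function(k) k(grid)), 3)
```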

Author(s)

Sheila Görz

References

Andrews, D.W. "Heteroskedasticity and autocorrelation consistent covariance matrix estimation." Econometrica: Journal of the Econometric Society (1991): 817-858.

Dehling, H., et al. "Change-point detection under dependence based on two-sample U-statistics." Asymptotic laws and methods in stochastics. Springer, New York, NY, (2015). 195-220.

Dehling, H., Fried, R., and Wendler, M. "A robust method for shift detection in time series." Biometrika 107.3 (2020): 647-660.

Parzen, E. "On consistent estimates of the spectrum of a stationary time series." The Annals of Mathematical Statistics (1957): 329-348.

Shao, X. "The dependent wild bootstrap." Journal of the American Statistical Association 105.489 (2010): 218-235.

See Also

CUSUM, HodgesLehmann, wilcox_stat

Examples

library(robcp)

Z <- c(rnorm(20), rnorm(20, 2))

## kernel-based estimation (default method)
lrv(Z)

## non-overlapping subsampling on the empirical distribution function
lrv(Z, method = "subsampling", control = list(overlapping = FALSE, distr = TRUE, l = 5))

## dependent wild bootstrap
lrv(Z, method = "bootstrap", control = list(B = 2000, l = 4, kFun = "parzen"))

robcp documentation built on Sept. 16, 2022, 5:05 p.m.
