lrv | R Documentation |
Estimates the long run variance of the supplied time series or, for multivariate input, its long run covariance matrix.
lrv(x, method = c("kernel", "subsampling", "bootstrap", "none"), control = list())
x |
vector or matrix with each column representing a time series (numeric). |
method |
method of estimation. Options are "kernel", "subsampling", "bootstrap" and "none". |
control |
a list of control parameters. See 'Details'. |
The long run variance equals the limit of n
times the variance of the arithmetic mean of a short range dependent time series, where n
is the length of the time series. It is used to standardize tests concerning the mean of dependent data.
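As a small numerical illustration of this definition (not part of the package), consider a stationary AR(1) process with coefficient phi and unit innovation variance. Both the process and the values of phi and n are assumptions made only for this example. Its autocovariances are phi^|h| / (1 - phi^2), so n times the variance of the mean can be computed exactly and compared with the limit 1 / (1 - phi)^2.

```r
# Exact n * Var(mean) for an AR(1) process with unit innovation variance.
# phi and n are illustrative choices, not package defaults.
n_var_mean <- function(n, phi) {
  h <- 1:(n - 1)
  gamma0 <- 1 / (1 - phi^2)            # autocovariance at lag 0
  gamma_h <- gamma0 * phi^h            # autocovariances at lags 1, ..., n - 1
  gamma0 + 2 * sum((1 - h / n) * gamma_h)
}

phi <- 0.5
lrv_limit <- 1 / (1 - phi)^2           # long run variance of this AR(1) process
sapply(c(10, 100, 10000), n_var_mean, phi = phi)
lrv_limit
```

The values approach the limit from below as n grows.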
If method = "none"
, no long run variance estimation is performed and the value 1 is returned (i.e. the test statistic is left unchanged).
The control
argument is a list that can supply any of the following components:
kFun
Kernel function (character string). More in 'Notes'.
b_n
Bandwidth (numeric > 0 and smaller than sample size).
gamma0
Replace the long run variance estimate by the estimated variance (autocovariance at lag 0) if the estimate is < 0? Boolean.
l
Block length (numeric > 0 and smaller than sample size).
overlapping
Overlapping subsampling estimation? Boolean.
distr
Transform observations by their empirical distribution function? Boolean. Default is FALSE
.
B
Bootstrap repetitions (integer).
seed
RNG seed (numeric).
version
What property does the CUSUM test test for? Character string, details below.
loc
Estimated location corresponding to version
. Numeric value, details below.
scale
Estimated scale corresponding to version
. Numeric value, details below.
Kernel-based estimation
The kernel-based long run variance estimation is available for various testing scenarios (set by control$version
) and both for one- and multi-dimensional data. It uses the bandwidth b_n =
control$b_n
and kernel function k(x) =
control$kFun
. For tests on certain properties, a corresponding location estimate control$loc
(m_n
) and scale estimate control$scale
(v_n
) also need to be supplied. Supported testing scenarios are:
"mean"
1-dim. data:
\hat{\sigma}^2 = \frac{1}{n} \sum_{i = 1}^n (x_i - \bar{x})^2 + \frac{2}{n} \sum_{h = 1}^{b_n} \sum_{i = 1}^{n - h} (x_i - \bar{x}) (x_{i + h} - \bar{x}) k(h / b_n).
If control$distr = TRUE
, then the long run variance is estimated on the empirical distribution of x
. The resulting value is then multiplied by \sqrt{\pi} / 2
.
Default values: b_n
= 0.9 n^{1/3}
, kFun = "bartlett"
.
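The one-dimensional formula above can be sketched in base R as follows. This is an illustrative re-implementation, not the package's code: lrv_kernel_sketch is a hypothetical helper, and truncating the upper summation limit to floor(b_n) for a non-integer bandwidth is an assumption.

```r
# Sketch of the 1-dim. kernel estimator for version "mean" (Bartlett kernel).
# lrv_kernel_sketch is a hypothetical helper, not a function of the package.
bartlett <- function(x) (1 - abs(x)) * (abs(x) < 1)

lrv_kernel_sketch <- function(x, b_n = 0.9 * length(x)^(1 / 3), k = bartlett) {
  n <- length(x)
  xc <- x - mean(x)
  est <- sum(xc^2) / n                       # lag-0 term
  H <- min(floor(b_n), n - 1)                # assumption: sum up to floor(b_n)
  for (h in seq_len(H)) {
    est <- est + 2 / n * sum(xc[1:(n - h)] * xc[(h + 1):n]) * k(h / b_n)
  }
  est
}

set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.5), n = 200))
lrv_kernel_sketch(x)
```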
multivariate time series:
The k,l
-element of \Sigma
is estimated by
\hat{\Sigma}^{(k,l)} = \frac{1}{n} \sum_{i,j = 1}^{n}(x_i^{(k)} - \bar{x}^{(k)}) (x_j^{(l)} - \bar{x}^{(l)}) k((i-j) / b_n),
k, l = 1, ..., m
.
Default values: b_n
= \log_{1.8 + m / 40}(n / 50)
, kFun = "bartlett"
.
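For the multivariate case, the double sum can be written as a matrix product. The following is a minimal sketch under that reading of the formula; lrv_matrix_sketch is a hypothetical name and the simulated data are only for illustration.

```r
# Sketch of the multivariate kernel estimator; lrv_matrix_sketch is hypothetical.
bartlett <- function(x) (1 - abs(x)) * (abs(x) < 1)

lrv_matrix_sketch <- function(x, b_n, k = bartlett) {
  x <- as.matrix(x)
  n <- nrow(x)
  xc <- sweep(x, 2, colMeans(x))             # centre each column
  W <- k(outer(1:n, 1:n, "-") / b_n)         # weights k((i - j) / b_n)
  t(xc) %*% W %*% xc / n
}

set.seed(1)
X <- cbind(rnorm(100), rnorm(100))
b_n <- log(nrow(X) / 50, base = 1.8 + ncol(X) / 40)  # default bandwidth from the text
lrv_matrix_sketch(X, b_n)
```

With a bandwidth of 1 only the diagonal weights are non-zero, so the sketch reduces to the empirical covariance matrix with divisor n.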
"empVar"
for tests on changes in the empirical variance.
\hat{\sigma}^2 = \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} ((x_i - m_n)^2 - v_n)((x_{i+|h|} - m_n)^2 - v_n).
Default values: m_n =
mean(x)
, v_n =
var(x)
.
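A direct transcription of the "empVar" formula into base R might look like the sketch below. The helper name lrv_empvar_sketch and the use of the Bartlett kernel for W are assumptions for illustration. Replacing the transformed series y by |x - m_n| - v_n gives the analogous "MD" estimator.

```r
# Sketch of the "empVar" estimator; lrv_empvar_sketch is a hypothetical helper.
bartlett <- function(x) (1 - abs(x)) * (abs(x) < 1)

lrv_empvar_sketch <- function(x, b_n, m_n = mean(x), v_n = var(x), W = bartlett) {
  n <- length(x)
  y <- (x - m_n)^2 - v_n                     # transformed observations
  est <- 0
  for (h in -(n - 1):(n - 1)) {
    a <- abs(h)
    est <- est + W(a / b_n) * sum(y[1:(n - a)] * y[(1 + a):n]) / n
  }
  est
}

set.seed(1)
lrv_empvar_sketch(rnorm(100), b_n = 4)
```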
"MD"
for tests on a change in the median deviation.
\hat{\sigma}^2 = \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} (|x_i - m_n| - v_n)(|x_{i+|h|} - m_n| - v_n).
Default values: m_n =
median(x)
, v_n = \frac{1}{n-1} \sum_{i = 1}^n |x_i - m_n|
.
"GMD"
for tests on changes in Gini's mean difference.
\hat{\sigma}^2 = 4 \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} \hat{\phi}_n(x_i)\hat{\phi}_n(x_{i+|h|})
with \hat{\phi}_n(x) = n^{-1} \sum_{i = 1}^n |x - x_i| - v_n
.
Default value: v_n =
\frac{2}{n(n-1)} \sum_{1 \leq i < j \leq n} |x_i - x_j|.
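The default v_n and the function \hat{\phi}_n from the "GMD" case can be computed from the matrix of pairwise absolute differences. The function names gmd and phi_hat below are hypothetical and only serve this sketch.

```r
# Gini's mean difference v_n and the centred function phi_hat from the "GMD" formula.
gmd <- function(x) {
  n <- length(x)
  d <- abs(outer(x, x, "-"))                 # all pairwise absolute differences
  sum(d[upper.tri(d)]) * 2 / (n * (n - 1))
}

phi_hat <- function(x, v_n = gmd(x)) {
  rowMeans(abs(outer(x, x, "-"))) - v_n      # n^{-1} sum_i |x_j - x_i| - v_n for each x_j
}

x <- c(1, 2, 4, 7)
gmd(x)
phi_hat(x)
```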
"Qalpha"
for tests on changes in Qalpha
.
\hat{\sigma}^2 = \frac{4}{\hat{u}(v_n)} \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} \hat{\phi}_n(x_i)\hat{\phi}_n(x_{i+|h|}),
where \hat{\phi}_n(x) = n^{-1} \sum_{i = 1}^n 1_{\{|x - x_i| \leq v_n\}} - m_n
and
\hat{u}(t) = \frac{2}{n(n-1)h_n} \sum_{1 \leq i < j \leq n} K\left(\frac{|x_i - x_j| - t}{h_n}\right)
the kernel density estimate of the density u
corresponding to the distribution function U(t) = P(|X-Y| \leq t)
, h_n =
IQR(x)
n^{-\frac{1}{3}}
and K
is the "quatratic" kernel function (see 'Kernel functions').
Default values: m_n = \alpha = 0.5
, v_n =
Qalpha(x, m_n)[n-1]
.
"tau"
for tests on changes in Kendall's tau.
Only available for bivariate data: assume that the given data x
has the format (x_i, y_i)_{i = 1, ..., n}
.
\hat{\sigma}^2 = \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} \hat{\phi}_n((x_i, y_i))\hat{\phi}_n((x_{i+|h|}, y_{i+|h|})),
where \hat{\phi}_n((x, y)) = 4 F_n(x, y) - 2F_{X,n}(x) - 2F_{Y,n}(y) + 1 - v_n
and F_n
, F_{X,n}
and F_{Y,n}
are the empirical distribution functions of ((X_i, Y_i))_{i = 1, ..., n}
, (X_i)_{i = 1, ..., n}
and (Y_i)_{i = 1, ..., n}
.
Default value: v_n = \hat{\tau}_n = \frac{2}{n(n-1)} \sum_{1 \leq i < j \leq n} sign\left((x_j - x_i)(y_j - y_i)\right)
.
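The default v_n for version "tau" is the sample version of Kendall's tau. A minimal base R sketch of that formula, with the hypothetical helper name kendall_tau, is:

```r
# Sample version of Kendall's tau, the default v_n for version "tau".
kendall_tau <- function(x, y) {
  n <- length(x)
  s <- sign(outer(x, x, "-") * outer(y, y, "-"))  # sign((x_i - x_j)(y_i - y_j))
  sum(s[upper.tri(s)]) * 2 / (n * (n - 1))
}

set.seed(1)
x <- rnorm(30)
y <- x + rnorm(30)
kendall_tau(x, y)
cor(x, y, method = "kendall")   # agrees when there are no ties
```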
"rho"
for tests on changes in Spearman's rho.
Only available for d
-variate data with d > 1
: assume that the given data x
has the format (x_{i,j} | i = 1, ..., n; j = 1, ..., d)
.
\hat{\sigma}^2 = a(d)^2 2^{2d} \left\{ \sum_{h = -(n-1)}^{n-1} K\left( \frac{|h|}{b_n} \right) \left( \sum_{i = 1}^{n-|h|} n^{-1} \prod_{j = 1}^d \hat{\phi}_n(x_i, x_j) \hat{\phi}_n(x_{i+|h|}, x_j) - M^2 \right) \right\} ,
where a(d) = (d+1) / (2^d - d - 1)
, M = n^{-1} \sum_{i = 1}^n \prod_{j = 1}^d \hat{\phi}_n(x_i, x_j)
and \hat{\phi}_n(x, y) = 1 - \hat{U}_n(x, y)
, \hat{U}_n(x, y) = n^{-1}
(rank of x_{i,j}
in x_{1,j}, ..., x_{n,j})
.
If control$gamma0 = TRUE
(default), negative estimates of the long run variance are replaced by the autocovariance at lag 0 (i.e. the ordinary variance of the data), and the function issues a warning.
Subsampling estimation
For method = "subsampling"
there are an overlapping and a non-overlapping version (parameter control$overlapping
). It can also be specified whether the observations x are transformed by their empirical distribution function \tilde{F}_n
(parameter control$distr
). Via control$l
the block length l
can be controlled.
If control$overlapping = TRUE
and control$distr = TRUE
:
\hat{\sigma}_n = \frac{\sqrt{\pi}}{\sqrt{2l}(n - l + 1)} \sum_{i = 0}^{n-l} \left| \sum_{j = i+1}^{i+l} (F_n(x_j) - 0.5) \right|.
Otherwise, if control$distr = FALSE
, the estimator is
\hat{\sigma}^2 = \frac{1}{l (n - l + 1)} \sum_{i = 0}^{n-l} \left( \sum_{j = i + 1}^{i+l} x_j - \frac{l}{n} \sum_{j = 1}^n x_j \right)^2.
If control$overlapping = FALSE
and control$distr = TRUE
:
\hat{\sigma} = \frac{1}{n/l} \sqrt{\pi/2} \sum_{i = 1}^{n/l} \frac{1}{\sqrt{l}} \left| \sum_{j = (i-1)l + 1}^{il} F_n(x_j) - \frac{l}{n} \sum_{j = 1}^n F_n(x_j) \right|.
Otherwise, if control$distr = FALSE
, the estimator is
\hat{\sigma}^2 = \frac{1}{n/l} \sum_{i = 1}^{n/l} \frac{1}{l} \left(\sum_{j = (i-1)l + 1}^{il} x_j - \frac{l}{n} \sum_{j = 1}^n x_j\right)^2.
Default values: overlapping = TRUE, the block length is chosen adaptively:
l_n = \max{\left\{ \left\lceil n^{1/3} \left( \frac{2 \rho}{1 - \rho^2} \right)^{(2/3)} \right\rceil, 1 \right\}}
where \rho
is the Spearman autocorrelation at lag 1.
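The block length rule and the two estimators for control$distr = FALSE can be sketched in base R as below. All function names are hypothetical. Two assumptions are made: abs(rho) is used so that the power 2/3 stays real for negative rho, and the non-overlapping version uses only the floor(n / l) complete blocks when l does not divide n.

```r
# Sketch of the adaptive block length and the subsampling estimators (distr = FALSE).
block_length <- function(x) {
  n <- length(x)
  rho <- cor(x[-n], x[-1], method = "spearman")   # Spearman autocorrelation at lag 1
  # assumption: abs() keeps the 2/3 power real for negative rho
  max(ceiling(n^(1 / 3) * (2 * abs(rho) / (1 - rho^2))^(2 / 3)), 1)
}

subsampling_overlapping <- function(x, l) {
  n <- length(x)
  cs <- c(0, cumsum(x))
  block_sums <- cs[(l + 1):(n + 1)] - cs[1:(n - l + 1)]  # sums of x[(i+1):(i+l)], i = 0..n-l
  sum((block_sums - l * mean(x))^2) / (l * (n - l + 1))
}

subsampling_nonoverlapping <- function(x, l) {
  n <- length(x)
  m <- n %/% l                                           # number of complete blocks
  block_sums <- colSums(matrix(x[1:(m * l)], nrow = l))
  sum((block_sums - l * mean(x))^2 / l) / m
}

set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.5), n = 200))
l <- block_length(x)
subsampling_overlapping(x, l)
subsampling_nonoverlapping(x, l)
```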
Bootstrap estimation
If method = "bootstrap"
a dependent wild bootstrap with the parameters B =
control$B
, l =
control$l
and k(x) =
control$kFun
is performed:
\hat{\sigma}^2 = \sqrt{n} Var(\bar{x^*_k} - \bar{x}), k = 1, ..., B
A single x_{ik}^*
is generated by x_i^* = \bar{x} + (x_i - \bar{x}) a_i
where a_i
are independent from the data x
and are generated from a multivariate normal distribution with E(A_i) = 0
, Var(A_i) = 1
and Cov(A_i, A_j) = k\left(\frac{i - j}{l}\right), i = 1, ..., n; j \neq i
. Via control$seed
a seed can optionally be specified (cf. set.seed
). Only "bartlett"
, "parzen"
and "QS"
are supported as kernel functions. Uses the function sqrtm
from package pracma
.
Default values: B
= 1000, kFun = "bartlett"
, l
is the same as for subsampling.
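The generation of a single bootstrap sample x^* can be sketched as follows. This is not the package's implementation: the matrix square root is taken via eigen() rather than pracma's sqrtm, and the names multiplier_cov and dwb_replicate are hypothetical.

```r
# Sketch of one dependent wild bootstrap replicate with a Bartlett kernel.
bartlett <- function(x) (1 - abs(x)) * (abs(x) < 1)

multiplier_cov <- function(n, l, k = bartlett) {
  k(outer(1:n, 1:n, "-") / l)                # Cov(A_i, A_j) = k((i - j) / l), Var(A_i) = k(0) = 1
}

dwb_replicate <- function(x, l, k = bartlett) {
  n <- length(x)
  e <- eigen(multiplier_cov(n, l, k), symmetric = TRUE)
  # symmetric square root V diag(sqrt(lambda)) V^T; pmax() guards against rounding below 0
  root <- e$vectors %*% (sqrt(pmax(e$values, 0)) * t(e$vectors))
  a <- as.numeric(root %*% rnorm(n))         # correlated multipliers a_1, ..., a_n
  mean(x) + (x - mean(x)) * a                # bootstrap sample x*
}

set.seed(1)
Z <- c(rnorm(20), rnorm(20, 2))
x_star <- dwb_replicate(Z, l = 5)
mean(x_star) - mean(Z)
```

Repeating this B times and taking the variance of the centred bootstrap means, scaled as in the formula above, gives the bootstrap estimate.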
long run variance \sigma^2
(numeric) or \Sigma
(numeric matrix)
Kernel functions
bartlett
:
k(x) = (1 - |x|) * 1_{\{|x| < 1\}}
FT
:
k(x) = 1 * 1_{\{|x| \leq 0.5\}} + (2 - 2 * |x|) * 1_{\{0.5 < |x| < 1\}}
parzen
:
k(x) = (1 - 6x^2 + 6|x|^3) * 1_{\{0 \leq |x| \leq 0.5\}} + 2(1 - |x|)^3 * 1_{\{0.5 < |x| \leq 1\}}
QS
:
k(x) = \frac{25}{12 \pi ^2 x^2} \left(\frac{\sin(6\pi x / 5)}{6\pi x / 5} - \cos(6 \pi x / 5)\right)
TH
:
k(x) = (1 + \cos(\pi x)) / 2 * 1_{\{|x| < 1\}}
truncated
:
k(x) = 1_{\{|x| < 1\}}
SFT
:
k(x) = (1 - 4(|x| - 0.5)^2)^2 * 1_{\{|x| < 1\}}
Epanechnikov
:
k(x) = 3 \frac{1 - x^2}{4} * 1_{\{|x| < 1\}}
quatratic
:
k(x) = (1 - x^2)^2 * 1_{\{|x| < 1\}}
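Some of the kernels listed above, written as vectorised R functions. This is only a sketch of the formulas: the k_ prefixed names are hypothetical, and the value of the QS kernel at 0 is set to its limit 1.

```r
# A few of the kernels above as vectorised R functions (illustrative sketch).
k_bartlett <- function(x) (1 - abs(x)) * (abs(x) < 1)
k_parzen <- function(x) {
  a <- abs(x)
  (1 - 6 * a^2 + 6 * a^3) * (a <= 0.5) + 2 * (1 - a)^3 * (a > 0.5 & a <= 1)
}
k_th <- function(x) (1 + cos(pi * x)) / 2 * (abs(x) < 1)
k_quatratic <- function(x) (1 - x^2)^2 * (abs(x) < 1)
k_qs <- function(x) {
  z <- 6 * pi * x / 5
  out <- 25 / (12 * pi^2 * x^2) * (sin(z) / z - cos(z))
  out[x == 0] <- 1                           # limit of the expression as x -> 0
  out
}

# Evaluate each kernel on a small grid
grid <- c(0, 0.25, 0.5, 1)
sapply(list(bartlett = k_bartlett, parzen = k_parzen, TH = k_th,
            quatratic = k_quatratic, QS = k_qs),
       function(k) k(grid))
```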
Sheila Görz
Andrews, D.W. "Heteroskedasticity and autocorrelation consistent covariance matrix estimation." Econometrica: Journal of the Econometric Society (1991): 817-858.
Dehling, H., et al. "Change-point detection under dependence based on two-sample U-statistics." Asymptotic laws and methods in stochastics. Springer, New York, NY, (2015). 195-220.
Dehling, H., Fried, R., and Wendler, M. "A robust method for shift detection in time series." Biometrika 107.3 (2020): 647-660.
Parzen, E. "On consistent estimates of the spectrum of a stationary time series." The Annals of Mathematical Statistics (1957): 329-348.
Shao, X. "The dependent wild bootstrap." Journal of the American Statistical Association 105.489 (2010): 218-235.
CUSUM
, HodgesLehmann
, wilcox_stat
Z <- c(rnorm(20), rnorm(20, 2))
## kernel-based estimation
lrv(Z)
## non-overlapping subsampling
lrv(Z, method = "subsampling", control = list(overlapping = FALSE, distr = TRUE, l = 5))
## dependent wild bootstrap estimation
lrv(Z, method = "bootstrap", control = list(l = 5, kFun = "parzen"))