| lrv | R Documentation |
Estimates the long run variance (univariate time series) or the long run covariance matrix (multivariate time series) of the supplied data.
lrv(x, method = c("kernel", "subsampling", "bootstrap", "none"), control = list())
x |
vector or matrix with each column representing a time series (numeric). |
method |
method of estimation. Options are "kernel", "subsampling", "bootstrap" and "none". |
control |
a list of control parameters. See 'Details'. |
The long run variance equals the limit of n times the variance of the arithmetic mean of a short range dependent time series, where n is the length of the time series. It is used to standardize tests concerning the mean on dependent data.
If method = "none", no long run variance estimation is performed and the value 1 is returned (i.e. the test statistic is left unchanged).
The control argument is a list that can supply any of the following components:
kFun: Kernel function (character string). More in 'Notes'.
b_n: Bandwidth (numeric > 0 and smaller than sample size).
gamma0: Only use estimated variance if estimated long run variance is < 0? Boolean.
l: Block length (numeric > 0 and smaller than sample size).
overlapping: Overlapping subsampling estimation? Boolean.
distr: Transform observations by their empirical distribution function? Boolean. Default is FALSE.
B: Bootstrap repetitions (integer).
seed: RNG seed (numeric).
version: What property does the CUSUM test test for? Character string, details below.
loc: Estimated location corresponding to version. Numeric value, details below.
scale: Estimated scale corresponding to version. Numeric value, details below.
Kernel-based estimation
The kernel-based long run variance estimation is available for various testing scenarios (set by control$version) and both for one- and multi-dimensional data. It uses the bandwidth b_n = control$b_n and kernel function k(x) = control$kFun. For tests on certain properties also a corresponding location control$loc (m_n) and scale control$scale (v_n) estimation needs to be supplied. Supported testing scenarios are:
"mean"
1-dim. data:
\hat{\sigma}^2 = \frac{1}{n} \sum_{i = 1}^n (x_i - \bar{x})^2 + \frac{2}{n} \sum_{h = 1}^{b_n} \sum_{i = 1}^{n - h} (x_i - \bar{x}) (x_{i + h} - \bar{x}) k(h / b_n).
If control$distr = TRUE, the long run variance is estimated on the empirical distribution of x. The resulting value is then multiplied by \sqrt{\pi} / 2.
Default values: b_n = 0.9 n^{1/3}, kFun = "bartlett".
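The univariate formula above can be sketched in base R. This is an illustration only, not the package's implementation: the helper name lrv_mean_sketch is hypothetical, and truncating the lag sum at floor(b_n) is an assumption made here because b_n is generally not an integer.

```r
# Bartlett kernel as defined on this page: k(x) = (1 - |x|) * 1{|x| < 1}
bartlett <- function(x) (1 - abs(x)) * (abs(x) < 1)

# Hypothetical helper (not part of the package): kernel-based long run
# variance for the "mean" scenario on a univariate series.
lrv_mean_sketch <- function(x, b_n = 0.9 * length(x)^(1/3), k = bartlett) {
  n <- length(x)
  xc <- x - mean(x)                 # centred observations
  est <- sum(xc^2) / n              # lag-0 term
  H <- min(floor(b_n), n - 1)       # assumption: lag sum stops at floor(b_n)
  for (h in seq_len(H)) {
    est <- est + 2 / n * sum(xc[1:(n - h)] * xc[(1 + h):n]) * k(h / b_n)
  }
  est
}

set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.5), n = 200))
lrv_mean_sketch(x)
```

With b_n = 1 the Bartlett weight of every lag h >= 1 is zero, so the sketch reduces to the lag-0 term, i.e. the variance of the data with divisor n.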
multivariate time series:
The k,l-element of \Sigma is estimated by
\hat{\Sigma}^{(k,l)} = \frac{1}{n} \sum_{i,j = 1}^{n}(x_i^{(k)} - \bar{x}^{(k)}) (x_j^{(l)} - \bar{x}^{(l)}) k((i-j) / b_n),
k, l = 1, ..., m.
Default values: b_n = \log_{1.8 + m / 40}(n / 50), kFun = "bartlett".
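The double sum for the multivariate case can be written as a single matrix product, which may make the structure easier to see. The following base R sketch assumes a hypothetical helper name lrv_matrix_sketch and is not the package's code; the bandwidth has to be passed explicitly because the default formula is only positive for n > 50.

```r
bartlett <- function(x) (1 - abs(x)) * (abs(x) < 1)

# Hypothetical helper (not part of the package): kernel-based long run
# covariance matrix, one column per time series.
lrv_matrix_sketch <- function(x, b_n, k = bartlett) {
  x <- as.matrix(x)
  n <- nrow(x)
  xc <- sweep(x, 2, colMeans(x))                     # centre each column
  W <- k(outer(seq_len(n), seq_len(n), "-") / b_n)   # weights k((i - j) / b_n)
  t(xc) %*% W %*% xc / n                             # (k, l) entry as in the formula
}

set.seed(1)
X <- cbind(rnorm(200), rnorm(200))
b_default <- log(nrow(X) / 50, base = 1.8 + ncol(X) / 40)  # default stated above
lrv_matrix_sketch(X, b_n = b_default)
```

For a single column the result is a 1 x 1 matrix that agrees with the univariate formula whenever the kernel vanishes beyond lag b_n.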
"empVar" for tests on changes in the empirical variance.
\hat{\sigma}^2 = \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} ((x_i - m_n)^2 - v_n)((x_{i+|h|} - m_n)^2 - v_n).
Default values: m_n = mean(x), v_n = var(x).
"MD" for tests on a change in the median deviation.
\hat{\sigma}^2 = \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} (|x_i - m_n| - v_n)(|x_{i+|h|} - m_n| - v_n).
Default values: m_n = median(x), v_n = \frac{1}{n-1} \sum_{i = 1}^n |x_i - m_n|.
"GMD" for tests on changes in Gini's mean difference.
\hat{\sigma}^2 = 4 \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} \hat{\phi}_n(x_i)\hat{\phi}_n(x_{i+|h|})
with \hat{\phi}_n(x) = n^{-1} \sum_{i = 1}^n |x - x_i| - v_n.
Default value: v_n = \frac{2}{n(n-1)} \sum_{1 \leq i < j \leq n} |x_i - x_j|.
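As a small illustration of the two ingredients of the "GMD" scenario, the default v_n and the function \hat{\phi}_n can be computed in base R as follows. The names gmd_sketch and phi_hat_sketch are hypothetical and only serve this example.

```r
# Hypothetical helper: Gini's mean difference, the default v_n above.
gmd_sketch <- function(x) {
  n <- length(x)
  d <- abs(outer(x, x, "-"))                 # |x_i - x_j| for all pairs
  sum(d[upper.tri(d)]) * 2 / (n * (n - 1))   # average over pairs i < j
}

# Hypothetical helper: phi_hat_n(t) = n^{-1} sum_i |t - x_i| - v_n,
# evaluated at each element of t.
phi_hat_sketch <- function(t, x, v_n = gmd_sketch(x)) {
  sapply(t, function(u) mean(abs(u - x))) - v_n
}

x <- c(1, 2, 4)
gmd_sketch(x)
phi_hat_sketch(x, x)
```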
"Qalpha" for tests on changes in Qalpha.
\hat{\sigma}^2 = \frac{4}{\hat{u}(v_n)} \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} \hat{\phi}_n(x_i)\hat{\phi}_n(x_{i+|h|}),
where \hat{\phi}_n(x) = n^{-1} \sum_{i = 1}^n 1_{\{|x - x_i| \leq v_n\}} - m_n and
\hat{u}(t) = \frac{2}{n(n-1)h_n} \sum_{1 \leq i < j \leq n} K\left(\frac{|x_i - x_j| - t}{h_n}\right)
the kernel density estimate of the density u corresponding to the distribution function U(t) = P(|X-Y| \leq t), h_n = IQR(x)n^{-\frac{1}{3}} and K is the quadratic kernel function (listed as "quatratic" under the kernel functions).
Default values: m_n = \alpha = 0.5, v_n = Qalpha(x, m_n)[n-1].
"tau" for tests on changes in Kendall's tau.
Only available for bivariate data: assume that the given data x has the format (x_i, y_i)_{i = 1, ..., n}.
\hat{\sigma}^2 = \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} \hat{\phi}_n((x_i, y_i))\hat{\phi}_n((x_{i+|h|}, y_{i+|h|})),
where \hat{\phi}_n((x, y)) = 4 F_n(x, y) - 2 F_{X,n}(x) - 2 F_{Y,n}(y) + 1 - v_n and F_n, F_{X,n} and F_{Y,n} are the empirical distribution functions of ((X_i, Y_i))_{i = 1, ..., n}, (X_i)_{i = 1, ..., n} and (Y_i)_{i = 1, ..., n}.
Default value: v_n = \hat{\tau}_n = \frac{2}{n(n-1)} \sum_{1 \leq i < j \leq n} sign\left((x_j - x_i)(y_j - y_i)\right).
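The default v_n is the sample version of Kendall's tau. A direct base R transcription of the formula is sketched below; the helper name tau_sketch is hypothetical. For data without ties it should agree with cor(x, y, method = "kendall").

```r
# Hypothetical helper: sample Kendall's tau as in the default value above.
tau_sketch <- function(x, y) {
  n <- length(x)
  # sign((x_i - x_j) * (y_i - y_j)) for all pairs; the matrix is symmetric
  s <- sign(outer(x, x, "-") * outer(y, y, "-"))
  sum(s[upper.tri(s)]) * 2 / (n * (n - 1))   # sum over pairs i < j
}

x <- c(1, 2, 3, 4)
y <- c(1, 3, 2, 4)
tau_sketch(x, y)
cor(x, y, method = "kendall")
```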
"rho" for tests on changes in Spearman's rho.
Only available for d-variate data with d > 1: assume that the given data x has the format (x_{i,j} | i = 1, ..., n; j = 1, ..., d).
\hat{\sigma}^2 = a(d)^2 2^{2d} \left\{ \sum_{h = -(n-1)}^{n-1} K\left( \frac{|h|}{b_n} \right) \left( \sum_{i = 1}^{n-|h|} n^{-1} \prod_{j = 1}^d \hat{\phi}_n(x_i, x_j) \hat{\phi}_n(x_{i+|h|}, x_j) - M^2 \right) \right\} ,
where a(d) = (d+1) / (2^d - d - 1), M = n^{-1} \sum_{i = 1}^n \prod_{j = 1}^d \hat{\phi}_n(x_i, x_j) and \hat{\phi}_n(x, y) = 1 - \hat{U}_n(x, y), \hat{U}_n(x, y) = n^{-1} (rank of x_{i,j} in x_{i,1}, ..., x_{i,n}).
If control$gamma0 = TRUE (default), negative estimates of the long run variance are replaced by the autocovariance at lag 0 (= ordinary variance of the data). In that case the function issues a warning.
Subsampling estimation
For method = "subsampling" there are an overlapping and a non-overlapping version (parameter control$overlapping). The parameter control$distr specifies whether the observations x are transformed by their empirical distribution function \tilde{F}_n. The block length l is set via control$l.
If control$overlapping = TRUE and control$distr = TRUE:
\hat{\sigma}_n = \frac{\sqrt{\pi}}{\sqrt{2l}(n - l + 1)} \sum_{i = 0}^{n-l} \left| \sum_{j = i+1}^{i+l} (F_n(x_j) - 0.5) \right|.
Otherwise, if control$distr = FALSE, the estimator is
\hat{\sigma}^2 = \frac{1}{l (n - l + 1)} \sum_{i = 0}^{n-l} \left( \sum_{j = i + 1}^{i+l} x_j - \frac{l}{n} \sum_{j = 1}^n x_j \right)^2.
If control$overlapping = FALSE and control$distr = TRUE:
\hat{\sigma} = \frac{1}{n/l} \sqrt{\pi/2} \sum_{i = 1}^{n/l} \frac{1}{\sqrt{l}} \left| \sum_{j = (i-1)l + 1}^{il} F_n(x_j) - \frac{l}{n} \sum_{j = 1}^n F_n(x_j) \right|.
Otherwise, if control$distr = FALSE, the estimator is
\hat{\sigma}^2 = \frac{1}{n/l} \sum_{i = 1}^{n/l} \frac{1}{l} \left(\sum_{j = (i-1)l + 1}^{il} x_j - \frac{l}{n} \sum_{j = 1}^n x_j\right)^2.
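The two estimators for control$distr = FALSE differ only in which blocks are summed. A minimal base R sketch, with the hypothetical helper name subsampling_sketch, is given below. It assumes that in the non-overlapping case an incomplete final block is dropped when l does not divide n; the text above does not say how the package treats that case.

```r
# Hypothetical helper (not part of the package): subsampling estimator of the
# long run variance for distr = FALSE.
subsampling_sketch <- function(x, l, overlapping = TRUE) {
  n <- length(x)
  centre <- l * mean(x)                      # (l / n) * sum(x)
  if (overlapping) {
    starts <- 0:(n - l)                      # blocks start after i = 0, ..., n - l
  } else {
    starts <- (seq_len(n %/% l) - 1) * l     # assumption: remainder is dropped
  }
  block_sums <- sapply(starts, function(i) sum(x[(i + 1):(i + l)]))
  sum((block_sums - centre)^2) / (l * length(starts))
}

x <- c(1, 2, 3, 4)
subsampling_sketch(x, l = 2, overlapping = TRUE)
subsampling_sketch(x, l = 2, overlapping = FALSE)
```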
Default values: overlapping = TRUE, the block length is chosen adaptively:
l_n = \max{\left\{ \left\lceil n^{1/3} \left( \frac{2 \rho}{1 - \rho^2} \right)^{(2/3)} \right\rceil, 1 \right\}}
where \rho is the Spearman autocorrelation at lag 1.
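The adaptive block length can be computed in a few lines of base R. The sketch below uses the hypothetical helper name block_length_sketch and evaluates the power as |z|^(2/3); that choice is an assumption made here so that a negative \rho does not yield NaN, and it may differ from what the package does.

```r
# Hypothetical helper: adaptive block length l_n from n and the lag-1
# Spearman autocorrelation rho.
block_length_sketch <- function(n, rho) {
  z <- 2 * rho / (1 - rho^2)
  # assumption: z^(2/3) is evaluated as |z|^(2/3) to stay real for rho < 0
  max(ceiling(n^(1/3) * abs(z)^(2/3)), 1)
}

set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.5), n = 200))
rho <- cor(x[-length(x)], x[-1], method = "spearman")  # lag-1 Spearman autocorrelation
block_length_sketch(length(x), rho)
```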
Bootstrap estimation
If method = "bootstrap" a dependent wild bootstrap with the parameters B = control$B, l = control$l and k(x) = control$kFun is performed:
\hat{\sigma}^2 = \sqrt{n} Var(\bar{x^*_k} - \bar{x}), k = 1, ..., B
A single x_{ik}^* is generated by x_i^* = \bar{x} + (x_i - \bar{x}) a_i where a_i are independent from the data x and are generated from a multivariate normal distribution with E(A_i) = 0, Var(A_i) = 1 and Cov(A_i, A_j) = k\left(\frac{i - j}{l}\right), i = 1, ..., n; j \neq i. Via control$seed a seed can optionally be specified (cf. set.seed). Only "bartlett", "parzen" and "QS" are supported as kernel functions. Uses the function sqrtm from package pracma.
Default values: B = 1000, kFun = "bartlett", l is the same as for subsampling.
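To make the construction of a single bootstrap sample concrete, the following base R sketch draws the multipliers a_i from a multivariate normal distribution with covariance k((i - j) / l) and builds one replicate x^*. It is not the package's implementation: the helper name dwb_replicate_sketch is hypothetical, and the matrix square root is taken via eigen() here rather than pracma's sqrtm.

```r
bartlett <- function(x) (1 - abs(x)) * (abs(x) < 1)

# Hypothetical helper: one dependent wild bootstrap replicate of x.
dwb_replicate_sketch <- function(x, l, k = bartlett) {
  n <- length(x)
  K <- k(outer(seq_len(n), seq_len(n), "-") / l)   # Cov(A_i, A_j), diagonal is 1
  e <- eigen(K, symmetric = TRUE)
  # symmetric square root of K; tiny negative eigenvalues are clipped to 0
  root <- e$vectors %*% diag(sqrt(pmax(e$values, 0)), n) %*% t(e$vectors)
  a <- as.numeric(root %*% rnorm(n))               # a ~ N(0, K)
  mean(x) + (x - mean(x)) * a
}

set.seed(1)
x <- c(rnorm(20), rnorm(20, 2))
boot_means <- replicate(200, mean(dwb_replicate_sketch(x, l = 5)))
var(boot_means)
```

The variance of the replicate means is the quantity that enters the formula above; the scaling factor is left out of the sketch on purpose.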
long run variance \sigma^2 (numeric) resp. \Sigma (numeric matrix)
Kernel functions
bartlett:
k(x) = (1 - |x|) * 1_{\{|x| < 1\}}
FT:
k(x) = 1 * 1_{\{|x| \leq 0.5\}} + (2 - 2 * |x|) * 1_{\{0.5 < |x| < 1\}}
parzen:
k(x) = (1 - 6x^2 + 6|x|^3) * 1_{\{0 \leq |x| \leq 0.5\}} + 2(1 - |x|)^3 * 1_{\{0.5 < |x| \leq 1\}}
QS:
k(x) = \frac{25}{12 \pi ^2 x^2} \left(\frac{\sin(6\pi x / 5)}{6\pi x / 5} - \cos(6 \pi x / 5)\right)
TH:
k(x) = (1 + \cos(\pi x)) / 2 * 1_{\{|x| < 1\}}
truncated:
k(x) = 1_{\{|x| < 1\}}
SFT:
k(x) = (1 - 4(|x| - 0.5)^2)^2 * 1_{\{|x| < 1\}}
Epanechnikov:
k(x) = 3 \frac{1 - x^2}{4} * 1_{\{|x| < 1\}}
quatratic:
k(x) = (1 - x^2)^2 * 1_{\{|x| < 1\}}
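Several of the kernels listed above translate directly into vectorised base R functions. The list below is an illustrative sketch with a hypothetical object name (kernels); it covers only five of the kernels and is not taken from the package.

```r
# Hypothetical list of kernel functions following the definitions above.
kernels <- list(
  bartlett  = function(x) (1 - abs(x)) * (abs(x) < 1),
  FT        = function(x) 1 * (abs(x) <= 0.5) +
                          (2 - 2 * abs(x)) * (abs(x) > 0.5 & abs(x) < 1),
  parzen    = function(x) (1 - 6 * x^2 + 6 * abs(x)^3) * (abs(x) <= 0.5) +
                          2 * (1 - abs(x))^3 * (abs(x) > 0.5 & abs(x) <= 1),
  TH        = function(x) (1 + cos(pi * x)) / 2 * (abs(x) < 1),
  truncated = function(x) as.numeric(abs(x) < 1)
)

# Evaluate every kernel on a small grid; one column per kernel.
sapply(kernels, function(k) k(c(0, 0.25, 0.5, 0.75, 1)))
```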
Sheila Görz
Andrews, D.W. "Heteroskedasticity and autocorrelation consistent covariance matrix estimation." Econometrica: Journal of the Econometric Society (1991): 817-858.
Dehling, H., et al. "Change-point detection under dependence based on two-sample U-statistics." Asymptotic laws and methods in stochastics. Springer, New York, NY, (2015). 195-220.
Dehling, H., Fried, R., and Wendler, M. "A robust method for shift detection in time series." Biometrika 107.3 (2020): 647-660.
Parzen, E. "On consistent estimates of the spectrum of a stationary time series." The Annals of Mathematical Statistics (1957): 329-348.
Shao, X. "The dependent wild bootstrap." Journal of the American Statistical Association 105.489 (2010): 218-235.
CUSUM, HodgesLehmann, wilcox_stat
Z <- c(rnorm(20), rnorm(20, 2))
## kernel-based estimation
lrv(Z)
## non-overlapping subsampling
lrv(Z, method = "subsampling", control = list(overlapping = FALSE, distr = TRUE, l = 5))
## dependent wild bootstrap estimation
lrv(Z, method = "bootstrap", control = list(l = 5, kFun = "parzen"))