knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
memochange package can be used for two things: Checking for a break in persistence and checking for a change in mean. This vignette presents the functions related to a break in persistence. This includes
pb_sim. Before considering the usage of these functions, a brief literature review elaborates on their connection.
The degree of memory is an important determinant of the characteristics of a time series. For an $I(0)$, or short-memory, process (e.g., AR(1) or ARMA(1,1)), the impact of shocks is short-lived and dies out quickly. On the other hand, for an $I(1)$, or difference-stationary, process such as the random walk, shocks persist infinitely. Thus, any change in a variable will have an impact on all future realizations. For an $I(d)$, or long-memory, process with $0<d<1$, shocks neither die out quickly nor persist infinitely, but have a hyperbolically decaying impact. In this case, the current value of a variable depends on past shocks, but the less so the further these shocks are past.
There are plenty of procedures to determine the memory of a series (see @robinson1995gaussian, @shimotsu2010exact, among others). However, there is also the possibility that series exhibit a structural change in memory, often referred to as a change in persistence. Starting with @kim2000detection various procedures have been proposed to detect these changes and consistently estimate the change point. @busetti2004tests and @leybourne2004tests suggest approaches for testing the null of constant $I(0)$ behaviour of the series against the alternative that a change from either $I(0)$ to $I(1)$ or $I(1)$ to $I(0)$ occurred. However, both approaches show serious distortions if neither the null nor the alternative is true, e.g. the series is constant $I(1)$. In this case the procedures by @leybourne2003tests and @leybourne2007cusum can be applied as they have the same alternative, but assume constant $I(1)$ behaviour under the null. Again, the procedures exhibit distortions when neither the null nor the alternative is true. To remedy this issue, @harvey2006modified suggest an approach that entails the same critical values for constant $I(0)$ and constant $I(1)$ behavior. Consequently, it accommodates both, constant $I(0)$ and constant $I(1)$ behavior under the null.
While this earlier work focussed on the $I(0)/I(1)$ framework, more recent approaches are able to detect changes from $I(d_1)$ to $I(d_2)$ where $d_1$ and $d_2$ are allowed to be non-integers. @sibbertsen2009testing extend the approach of @leybourne2007cusum such that the testing procedure consistently detects changes from $0 \leq d_1<1/2$ to $1/2<d_2<3/2$ and vice versa. Under the null the test assumes constant $I(d)$ behavior with $0 \leq d <3/2$. The approach suggested by @martins2014testing is even able to identify changes from $-1/2<d_1<2$ to $-1/2<d_2<2$ with $d_1 \neq d_2$. Here, under the null the test assumes constant $I(d)$ behavior with $-1/2<d<2$.
Examples for series that potentially exhibit breaks in persistence are macroeconomic and financial time series such as inflation rates, trading volume, interest rates, volatilities and so on. For these series it is therefore strongly recommended to investigate the possibility of a break in persistence before modeling and forecasting the series.
memochange package contains all procedure mentioned above to identify whether a time series exhibits a break in persistence mentioned above. Additionally, several estimators are implemented which consistently estimate the point at which the series exhibits a break in persistence and the order of integration in the two regimes.
We will now show how the usage of the implemented procedures while investigating the price of crude oil.
First, we download the monthly price series from the FRED data base.
To get a first visual impression, we plot the series.
oil=as.data.frame(oil) oil$DATE=zoo::as.Date(oil$DATE) oil_xts=xts::xts(oil[,-1],order.by = oil$DATE) zoo::plot.zoo(oil_xts, xlab="", ylab="Price", main="Crude Oil Price: West Texas Intermediate")
From the plot we observe that the series seems to be more variable in its second part from year 2000 onwards. This is first evidence that a change in persistence has occurred. We can test this hypothesis using the functions
cusum_test (@leybourne2007cusum, @sibbertsen2009testing)
MR_test (@martins2014testing) , and
ratio_test (@busetti2004tests, @leybourne2004tests, @harvey2006modified). In this vignette we use the ratio and MR test since these are the empirically most often applied ones. The functionality of the other tests is similar. They all require a univariate numeric vector
x as an input variable and yield a matrix of test statistic and critical values as an output variable.
library(memochange) x <- as.numeric(oil[,2])
As a starting point the default version of the ratio test is applied.
This yields a matrix that gives test statistic and critical values for the null of constant $I(0)$ against a change from $I(0)$ to $I(1)$ or vice versa. Furthermore, the statistics for a change in an unknown direction are included as well. This accounts for the fact that we perform two tests facing a multiple testing problem. The results suggest that a change from $I(0)$ to $I(1)$ has occurred somewhere in the series since the test statistic exceeds the critical value at the one percent level. In addition, this value is also significant when accounting for the multiple testing problem. Consequently, the default version of the ratio test suggests a break in persistence.
We can modify this default version by choosing the arguments
M (see the help page of the ratio test for details).
The plot does not indicate a linear trend so that it seems unreasonable to change the trend argument. Also, the plot suggests that the break is rather in the middle of the series than at the beginning or the end so that changing
tau seems unnecessary as well.
The type of test statistic calculated can be easily changed using the statistic argument. However, simulation results indicate mean, max, and exp statistics to deliver qualitatively similar results.
Something that is of more importance is the type of test performed. The default version considers the approach by Busetti and Taylor (2004). In case of a constant $I(1)$ process this test often spuriously identifies a break in persistence. Harvey, Leybourne and Taylor (2006) account for this issue by adjusting the test statistic such that its critical values are the same under constant $I(0)$ and constant $I(1)$. We can calculate their test statistic by setting
type="HLT". For this purpose, we need to state the number of polynomials
z used in their test statistic. The default value is 9 as suggested by Harvey, Leybourne and Taylor (2006).
Choosing another value is only sensible for very large data sets (number of obs. > 10000) where the test statistic cannot be calculated due to computational singularity.
In this case decreasing
z can allow the test statistic to be calculated. This invalidates the critical values so that we would have to simulate them by setting
However, as our data set is rather small we can stick with the default of
Again the test results suggests that there is a break from $I(0)$ to $I(1)$. Consequently, it is not a constant $I(1)$ process that led to a spurious rejection of the test by Busetti and Taylor (2004).
Another test for a change in persistence is that by Martins and Rodrigues (2014). This is more general as it is not restricted to the $I(0)/I(1)$ framework, but can identify changes from $I(d_1)$ to $I(d_2)$ with $d_1 \neq d_2$ and $-1/2<d_1,d_2<2$. The default version is applied by
Again, the function returns a matrix consisting of test statistic and critical values. Here, the alternative of the test is an increase respectively a decrease in memory. In line with the results of the ratio test, the approach by Martins and Rodrigues (2014) suggests that the series exhibits an increase in memory, i.e. that the memory of the series increases from $d_1$ to $d_2$ with $d_1<d_2$ at some point in time. Again, this also holds if we consider the critical values that account for the multiple testing problem.
Similar to the ratio test and all other tests against a change in persistence in the
memochange package, the MR test also has the same arguments
M. Furthermore, we can choose again the type of test statistic. This time we can decide whether to use the squared t-statistic or the standard t-statistic.
As for the ratio test, changing the type of statistic has a rather small effect on the empirical performance of the test.
If we believe that the underlying process exhibits additional short run components, we can account for these by setting
While the test statistic changes, the conclusion remains the same.
All tests indicate that the oil price series exhibits an increase in memory over time. To correctly model and forecast the series, the exact location of the break is important.
This can be estimated by the
BP_estim function. It is important for the function that the direction of the change is correctly specified. In our case, an increase in memory has occurred so that we set
This yields a list stating the location of the break (observation 151), semiparametric estimates of the order of integration in the two regimes (0.86 and 1.03) as well as the standard deviations of these estimates (0.13 and 0.15).
Consequently, the function indicates that there is a break in persistence in July, 1998. This means that from the beginning of the sample until June 1998 the series is integrated with an order of 0.85 and from July 1998 on the order of integration increased to 1.03.
As before, the function allows for various types of break point estimators. Instead of the default estimator of Busetti and Taylor (2004), one can also rely on the estimator of Leybourne, Kim, and Taylor (2007) by setting
This estimator relies on estimates of the long-run variance. Therefore, it is also needed that
m is chosen, which determines how many covariances are used when estimating the long-run variance. Leybourne, Kim, and Taylor (2007) suggest
BP_estim(x, direction="01", type="LKT", m=0)
This yields a similar result with the break point lying in the year 1998 and d increasing from approximately 0.8 to approximately 1.
All other arguments of the function (
serial) were already discussed above except for
d_bw. These two arguments determine which estimator and bandwidth are used to estimate the order of integration in the two regimes. Concerning the estimator, the GPH (Geweke and Porter-Hudak (1983)) and the exact local Whittle estimator (Shimotsu and Phillips (2005)) can be selected. Although the exact local Whittle estimator has a lower variance, the GPH estimator is still often considered in empirical applications due to its simplicity. In our example the results of the two estimators are almost identical.
BP_estim(x, direction="01", d_estim="GPH")
d_bw argument determines how many frequencies are used for estimation. Larger values imply a lower variance of the estimates, but also bias the estimator if the underlying process possesses short run dynamics.
Usually a value between 0.5 and 0.8 is considered.
BP_estim(x, direction="01", d_bw=0.75) BP_estim(x, direction="01", d_bw=0.65)
In our setup, it can be seen that increasing
d_bw to 0.75 does not severely change the estimated order of integration in the two regimes. Decreasing
d_bw, however, leads to smaller estimates of $d$.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.