Performs the two-step Engle-Granger cointegration procedure on a pair of time series and creates an object representing the results of the analysis.
egcm(X, Y, na.action, log = FALSE, normalize = FALSE,
debias = TRUE, robust=FALSE, include.const=TRUE,
i1test = egcm.default.i1test(),
urtest = egcm.default.urtest(),
p.value = egcm.default.pvalue())
is.cointegrated(E)
is.ar1(E)

X 
the first time series to be considered in the cointegration test.
A plain or 
Y 
the second time series to be considered in the cointegration test.
A plain or 
E 
an object of class "egcm"
na.action 
a function that indicates what should happen when the data contain NAs.
See 
log 
a boolean value which if 
normalize 
a boolean value which if 
debias 
a boolean value which if 
robust 
a boolean value which if 
include.const 
a boolean which if 
i1test 
a mnemonic indicating the name of the test that should be used for
checking if the input series

urtest 
a mnemonic indicating the name of the test that should be used for
checking if the residual series contains a unit root. If none is
specified, then defaults to the value reported by

p.value 
the p-value to be used in the above tests.
If none is specified, then defaults to the value reported
by 
The two-step Engle-Granger procedure searches for parameters α, β, and ρ that yield the best fit to the following model:
Y[i] = α + β * X[i] + R[i]
R[i] = ρ * R[i-1] + ε[i]
ε[i] ~ N(0, σ^2)
In the first step, α and β are found using a linear regression of Y[i] on X[i]. The residual sequence R[i] is then determined. In the second step, ρ is determined, again using a linear fit.
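The two steps can be sketched in base R on simulated data. This is only an illustration of the model above, not the code used inside egcm; the simulated series, the true parameter values, and all variable names are assumptions made for the example.

```r
set.seed(1)
n <- 500

# Hypothetical inputs: X is a random walk, and Y is built to be
# cointegrated with X (alpha = 2, beta = 1.5, rho = 0.5).
X <- cumsum(rnorm(n))
R_true <- as.numeric(arima.sim(list(ar = 0.5), n = n, sd = 0.5))
Y <- 2 + 1.5 * X + R_true

# Step 1: regress Y on X to estimate alpha and beta, then form the residuals.
step1 <- lm(Y ~ X)
alpha_hat <- unname(coef(step1)[1])
beta_hat  <- unname(coef(step1)[2])
R <- Y - alpha_hat - beta_hat * X

# Step 2: regress R[i] on R[i-1] (no intercept) to estimate rho.
step2 <- lm(R[-1] ~ 0 + R[-n])
rho_hat <- unname(coef(step2)[1])
eps <- residuals(step2)   # estimated innovations

c(alpha = alpha_hat, beta = beta_hat, rho = rho_hat, sigma = sd(eps))
```

The estimate of ρ here is the raw one; no bias correction is applied in this sketch.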
Engle and Granger showed that if X and Y are cointegrated, then this procedure will yield consistent estimates of the parameters. However, there are several ways in which this estimation procedure can fail:
Either X or Y (or both) may already be mean-reverting. In this case, there is no point in forming the difference Y - β X. If one series is mean-reverting and the other is not, then any nontrivial linear combination will not be mean-reverting.
The residual series R[i] may not be mean-reverting. In the language of cointegration theory, it is then said to contain a unit root. In this case, there is no benefit to forming the linear combination Y - β X.
The residual series R[i] may be mean-reverting, but the relation R[i] = ρ R[i-1] + ε[i] may not be the right model. In other words, the residual series may not be adequately described by an autoregressive series of order one. In this case, the parameters α and β will be correct, but the specification for the residuals R[i] will not be. The user may wish to try fitting the residuals using another function, such as arima.
The egcm function checks for each of the above contingencies, using an appropriate statistical test. If one of the above conditions is found, then a warning message is displayed when the model is printed.
The p-value used in the above tests is given by the parameter p.value. This can be changed by setting the value of the parameter, or by changing the default value with egcm.set.default.pvalue. For all of the unit root tests, the p-values of the corresponding test statistics have been recomputed through simulation and a table lookup is used. The Ljung-Box test (see Box.test) is used to assess whether or not the residual series can be adequately fit with an autoregressive series of order one.
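As a rough illustration of the AR(1) adequacy check, the Ljung-Box test can be applied to the innovations of a fitted AR(1) model with Box.test from the stats package. The simulated residual series, the lag of 10, and the fitdf = 1 adjustment are assumptions chosen for this sketch; the settings egcm uses internally are not stated here.

```r
set.seed(2)
n <- 500

# Hypothetical residual series that really does follow an AR(1) process.
R <- as.numeric(arima.sim(list(ar = 0.6), n = n))

# Fit R[i] = rho * R[i-1] + eps[i] and extract the innovations.
fit <- lm(R[-1] ~ 0 + R[-n])
eps <- residuals(fit)

# Ljung-Box test for remaining autocorrelation in the innovations.
# A small p-value would suggest that AR(1) does not describe R adequately.
lb <- Box.test(eps, lag = 10, type = "Ljung-Box", fitdf = 1)
lb$p.value
```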
The estimates of α and β are not only consistent but also unbiased. Unfortunately, the estimate obtained for ρ may be biased. Therefore, a bias correction has been implemented for ρ. A pre-computed table of biases has been determined through simulation, and a table lookup is performed to determine the appropriate bias correction. To turn off this feature, set debias = FALSE.
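The kind of simulation that can reveal such a bias is sketched below for a directly observed AR(1) series. It is not the procedure used to build the package's table, which is based on residuals from the first-step regression; the helper estimate_rho, the sample size, and the true value of ρ are all hypothetical choices for illustration.

```r
set.seed(3)

# Hypothetical helper: least-squares estimate of rho from a
# no-intercept regression of r[i] on r[i-1].
estimate_rho <- function(r) {
  n <- length(r)
  sum(r[-1] * r[-n]) / sum(r[-n]^2)
}

rho_true <- 0.9
n <- 50          # a short sample, where the bias is easiest to see
n_sims <- 2000

# Simulate many AR(1) series and estimate rho on each one.
rho_hats <- replicate(n_sims, {
  r <- as.numeric(arima.sim(list(ar = rho_true), n = n))
  estimate_rho(r)
})

# Average estimation error; a negative value means rho is underestimated.
bias <- mean(rho_hats) - rho_true
bias
```

A table of such biases, indexed by sample size and estimated ρ, could then be used to adjust a raw estimate.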
The helper function is.cointegrated() takes as input an "egcm" object E. It returns TRUE if E appears to represent a valid pair of cointegrated series. In other words, it checks that both X and Y are integrated and that the residual series R is free of unit roots.
The helper function is.ar1() also takes as input an "egcm" object E. It returns TRUE if the residual series R can be adequately fit by an autoregressive model of order one.
From the standpoint of securities trading, cointegration is thought to provide a useful model for pairs trading. If the price series of two securities are cointegrated, then the corresponding residual series R[i] will be mean-reverting. When the magnitude of the residual R[N] is large, a trader might establish a long position in the undervalued security and a short position in the overvalued security. With high probability, the positions will converge in value, and a profit can be collected. Numerous scholarly articles and several books have been written on pairs trading.
Data mining for cointegrated pairs is not recommended, though. As with any statistical test, the cointegration test will generate false positives. Experience shows that at least in the case of the components of the S&P 500, the number of false positives overwhelms the number of truly cointegrated series.
Returns an S3 object of class "egcm". This can then be printed or plotted. There is also a summary method.
The following is a copy of the printed output that was obtained from running the first example below:
VOO[i] = 0.9201 SPY[i] - 0.6845 + R[i],
(0.0005) (0.0845)
R[i] = 0.0004 R[i-1] + eps[i], eps ~ N(0, 0.0779^2)
(0.0633)
R[20131231] = -0.0987 (t = -1.265)

The first line of the output shows the fit that was found. The parameters were determined to be β = 0.9201, α = -0.6845 and ρ = 0.0004. The standard deviation of the sequence ε of innovations was found to be 0.0779. The standard errors of α, β and ρ were found to be 0.0845, 0.0005 and 0.0633 respectively.
The third line of output shows the value of the residual as of the last observation in the series. The sign of the value -0.0987 indicates that VOO was relatively undervalued on this date and that the difference between the two series was 1.265 standard deviations from their historical mean.
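The reported t value can be roughly reproduced from the printed figures. The sketch below assumes that t is the last residual divided by the standard deviation of the residual series, and that this standard deviation is close to that of the innovations because ρ is nearly zero; both points are assumptions, and only magnitudes are used.

```r
# Magnitudes copied from the printed output above.
r_last <- 0.0987   # size of the residual at the last observation
eps_sd <- 0.0779   # standard deviation of the innovations

# With rho close to zero, the residuals and the innovations have nearly
# the same standard deviation, so eps_sd stands in for sd(R) here.
z <- r_last / eps_sd
round(z, 2)
```

The small gap between this ratio and the printed 1.265 is consistent with the package using the standard deviation of the residuals (the r.sd field) and unrounded values.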
The fields of the "egcm"
object are as follows:
S1
the first data series (X)
S2
the second data series (Y)
residuals
the residual series (R)
innovations
the sequence of innovations (ε)
index 
the index vector for the series 
i1test 
the name of the test used for verifying
that 
urtest 
the name of the test used for verifying that the residual series does not contain a unit root 
pvalue 
the p-value that is used for the various tests used by this model
log 
Boolean, which if true indicates that S1 and S2 are logged 
alpha 
the computed value of α 
alpha.se 
standard error of the estimate of α 
beta 
the computed value of β 
beta.se 
standard error of the estimate of β 
rho 
the computed and debiased value of ρ 
rho.raw 
the value of ρ determined prior to debiasing 
rho.se 
standard error of the estimate of ρ 
s1.i1.stat 
test statistic found when checking that S1 is integrated 
s1.i1.p 
p-value associated with s1.i1.stat
s2.i1.stat 
test statistic found when checking that S2 is integrated 
s2.i1.p 
p-value associated with s2.i1.stat
r.stat 
test statistic found when checking whether the residual series contains a unit root 
r.p 
p-value associated with r.stat
eps.ljungbox.stat 
test statistic found when checking whether an AR(1) model adequately fits the residual series 
eps.ljungbox.p 
p-value associated with eps.ljungbox.stat
s1.dsd 
standard deviation of 
s2.dsd 
standard deviation of 
r.sd 
standard deviation of 
eps.sd 
standard deviation of the innovations ε[i] 
The software in this package is for general information purposes only. It is hoped that it will be useful, but it is provided WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. It is not intended to form the basis of any investment decision. USE AT YOUR OWN RISK!
Cointegration is a more general concept than has been presented here. Users who wish to explore more general models for cointegration are referred to the urca package of Bernhard Pfaff.
Matthew Clegg
Chan, E. (2013). Algorithmic trading: winning strategies and their rationale. (Vol. 625). John Wiley & Sons.
Clegg, M. (2014). On the Persistence of Cointegration in Pairs Trading (January 28, 2014). Available at SSRN: http://ssrn.com/abstract=2491201
Ehrman, D.S. (2006). The handbook of pairs trading: strategies using equities, options, and futures. (Vol. 240). John Wiley & Sons.
Engle, R. F. and C. W. Granger. (1987). Cointegration and error correction: representation, estimation, and testing. Econometrica, 251-276.
Pfaff, B. (2008) Analysis of Integrated and Cointegrated Time Series with R. Second Edition. Springer, New York. ISBN 0387279601
Vidyamurthy, G. (2004). Pairs trading: quantitative methods and analysis. (Vol 217). Wiley.com.
yegcm egcm.default.i1test egcm.default.urtest egcm.default.pvalue sim.egcm pgff.test bvr.test ca.jo
library(egcm)
library(TTR)
# SPY and IVV are both ETFs that track the S&P 500.
# One would expect them to be cointegrated, and in 2013 they were.
spy2013 <- getYahooData("SPY", 20130101, 20131231)$Close
ivv2013 <- getYahooData("IVV", 20130101, 20131231)$Close
egcm(spy2013, ivv2013)
# egcm has a plot method, which can be useful
# In this plot, it appears that there is only one price series,
# but that is because the two price series are so close to each
# other that they are indistinguishable.
plot(egcm(spy2013, ivv2013))
# The yegcm method provides a convenient interface to the TTR
# package, which can fetch closing prices from Yahoo. Thus,
# the above can be simplified as follows:
e <- yegcm("SPY", "VOO", 20130101, 20140101)
print(e)
plot(e)
summary(e)
# GLD and IAU both track the price of gold.
# They too tend to be very tightly cointegrated.
gld.iau.2013 <- yegcm("GLD", "IAU", 20130101, 20131231)
gld.iau.2013
plot(gld.iau.2013)
# Coca-Cola and Pepsi are often mentioned as an
# example of a pair of securities for which pairs trading
# may be fruitful. However, at least in 2013, they were not
# cointegrated.
ko.pep.2013 <- yegcm("KO", "PEP", 20130101, 20131231)
ko.pep.2013
plot(ko.pep.2013)
# Ford and GM seemed to be even more tightly linked.
# Yet, the degree of linkage was not high enough to pass the
# cointegration test.
f.gm.2013 <- yegcm("F", "GM", 20130101, 20131231)
f.gm.2013
plot(f.gm.2013)
