Regression Tests
Description
A collection and description of functions
to test linear regression models, including
tests for higher-order serial correlation, for
heteroskedasticity, for autocorrelation
of disturbances, for linearity, and for functional
relations.
The methods are:
"bg"  Breusch-Godfrey test for higher-order serial correlation, 
"bp"  Breusch-Pagan test for heteroskedasticity, 
"dw"  Durbin-Watson test for autocorrelation of disturbances, 
"gq"  Goldfeld-Quandt test for heteroskedasticity, 
"harv"  Harvey-Collier test for linearity, 
"hmc"  Harrison-McCabe test for heteroskedasticity, 
"rain"  Rainbow test for linearity, and 
"reset"  Ramsey's RESET test for functional relation. 
These functions add nothing new; they are wrappers around the underlying
test functions from R's contributed package lmtest. The functions
are available as "Builtin" functions. Nevertheless, the user can
still install and use the original functions from R's lmtest
package.
Usage
lmTest(formula, method = c("bg", "bp", "dw", "gq", "harv", "hmc",
    "rain", "reset"), data = list(), ...)

bgTest(formula, order = 1, type = c("Chisq", "F"), data = list())
bpTest(formula, varformula = NULL, studentize = TRUE, data = list())
dwTest(formula, alternative = c("greater", "two.sided", "less"),
    iterations = 15, exact = NULL, tol = 1e-10, data = list())
gqTest(formula, point = 0.5, order.by = NULL, data = list())
harvTest(formula, order.by = NULL, data = list())
hmcTest(formula, point = 0.5, order.by = NULL, simulate.p = TRUE,
    nsim = 1000, plot = FALSE, data = list())
rainTest(formula, fraction = 0.5, order.by = NULL, center = NULL,
    data = list())
resetTest(formula, power = 2:3, type = c("fitted", "regressor", "princomp"),
    data = list())

Arguments
alternative 
[dwTest]  
center 
[rainTest]  
data 
an optional data frame containing the variables in the model.
By default the variables are taken from the environment from which
the function is called.

exact 
[dwTest]  
formula 
a symbolic description for the linear model to be tested. 
fraction 
[rainTest]  
iterations 
[dwTest]  
method 
the test method which should be applied. 
nsim 
[hmcTest]  
order 
[bgTest]  
order.by 
[gqTest][harvTest]  
plot 
[hmcTest]  
point 
[gqTest][hmcTest]  
power 
[resetTest]  
simulate.p 
[hmcTest]  
studentize 
[bpTest]  
tol 
[dwTest]  
type 
[bgTest]  
varformula 
[bpTest]  
... 
[regTest]  
Details
bg – Breusch Godfrey Test:
Under H_0 the test statistic is asymptotically chi-squared
with degrees of freedom as given in parameter.
If type is set to "F" the function returns
the exact F statistic which, under H_0, follows an F
distribution with degrees of freedom as given in parameter.
The starting values for the lagged residuals in the supplementary
regression are chosen to be 0.
[lmtest:bgtest]
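The supplementary regression described above can be sketched in base R. This is an illustration under stated assumptions, not the package's code: it uses the common LM form n * R^2 for the chi-squared variant, and the variable names (elag, bg_stat, bg_p) are made up for this example.

```r
set.seed(1)
n <- 100
x <- rep(c(-1, 1), n / 2)
y <- 1 + x + rnorm(n)

# Residuals of the main regression
fit <- lm(y ~ x)
e <- unname(residuals(fit))

# Lagged residuals, with 0 as starting values (one column per lag)
order <- 1
elag <- sapply(seq_len(order), function(k) c(rep(0, k), e[seq_len(n - k)]))

# Supplementary regression: residuals on the regressors and lagged residuals
aux <- lm(e ~ x + elag)

# LM statistic (assumed form n * R^2), compared with chi-squared(order)
bg_stat <- n * summary(aux)$r.squared
bg_p <- pchisq(bg_stat, df = order, lower.tail = FALSE)
```

The result can be compared with lmTest(y ~ x, "bg"); small numerical differences are possible if the package computes the statistic differently.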
bp – Breusch Pagan Test:
The Breusch–Pagan test fits a linear regression model to the
residuals of a linear regression model (by default the same
explanatory variables are taken as in the main regression
model) and rejects if too much of the variance
is explained by the additional explanatory variables.
Under H_0 the test statistic of the Breusch-Pagan test
follows a chi-squared distribution with parameter
(the number of regressors without the constant in the model)
degrees of freedom.
[lmtest:bptest]
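As a rough base R sketch of this idea, the studentized variant can be written as n * R^2 from a regression of the squared residuals on the explanatory variables. The form of the statistic and all variable names below are assumptions made for illustration, not taken from the package source.

```r
set.seed(2)
n <- 100
x <- rep(c(-1, 1), n / 2)
# Disturbances whose standard deviation alternates between 1 and 2
y <- 1 + x + rnorm(n, sd = rep(c(1, 2), n / 2))

# Squared residuals of the main regression
fit <- lm(y ~ x)
e2 <- residuals(fit)^2

# Auxiliary regression on the same explanatory variables
aux <- lm(e2 ~ x)

# Statistic (assumed studentized form) and its chi-squared p-value;
# the degrees of freedom are the number of regressors without the constant
bp_stat <- n * summary(aux)$r.squared
bp_df <- length(coef(aux)) - 1
bp_p <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)
```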
dw – Durbin Watson Test:
The Durbin–Watson test has the null hypothesis that the autocorrelation
of the disturbances is 0; it can be tested against the alternative
that it is greater than, not equal to, or less than 0 respectively.
This can be specified by the alternative argument.
The null distribution of the Durbin-Watson test statistic is a linear
combination of chi-squared distributions. The p-value is computed using a
Fortran version of the Applied Statistics Algorithm AS 153 by Farebrother
(1980, 1984). This algorithm is called "pan" or "gradsol". For large sample
sizes the algorithm might fail to compute the p-value; in that case a
warning is printed and an approximate p-value will be given; this p-value
is computed using a normal approximation with mean and variance
of the Durbin-Watson test statistic.
[lmtest:dwtest]
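The test statistic itself is simple to compute by hand. The sketch below evaluates the standard Durbin-Watson statistic, the sum of squared successive residual differences divided by the residual sum of squares, for white-noise and AR(1) disturbances. It does not reproduce the p-value algorithm; the helper name dw_stat is hypothetical.

```r
# Durbin-Watson statistic of a fitted linear model (hypothetical helper)
dw_stat <- function(fit) {
  e <- residuals(fit)
  sum(diff(e)^2) / sum(e^2)
}

set.seed(3)
n <- 100
x <- rep(c(-1, 1), n / 2)

# White-noise disturbances and AR(1) disturbances with rho = 0.9
err_wn <- rnorm(n)
err_ar <- as.numeric(stats::filter(err_wn, 0.9, method = "recursive"))

y_wn <- 1 + x + err_wn
y_ar <- 1 + x + err_ar

d_wn <- dw_stat(lm(y_wn ~ x))  # expected to be near 2 without autocorrelation
d_ar <- dw_stat(lm(y_ar ~ x))  # expected to be well below 2
```

The statistic always lies between 0 and 4; values near 2 are consistent with uncorrelated disturbances, and values near 0 indicate positive autocorrelation.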
gq – Goldfeld Quandt Test:
The Goldfeld–Quandt test compares the variances of two submodels
divided by a specified breakpoint and rejects if the variances differ.
Under H_0 the test statistic of the Goldfeld-Quandt test
follows an F distribution with the degrees of freedom as given in
parameter.
[lmtest:gqtest]
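A minimal base R sketch of the comparison follows, assuming a breakpoint in the middle of the sample (point = 0.5), observations kept in their given order, and a one-sided alternative that the variance increases from the first to the second segment. Variable names are illustrative only.

```r
set.seed(4)
n <- 100
x <- rep(c(-1, 1), n / 2)
# Standard deviation 1 in the first half, 2 in the second half
y <- 1 + x + c(rnorm(n / 2, sd = 1), rnorm(n / 2, sd = 2))

# Split the sample at the breakpoint
m <- floor(0.5 * n)
first <- seq_len(m)
second <- (m + 1):n

# Fit the two submodels separately
fit1 <- lm(y[first] ~ x[first])
fit2 <- lm(y[second] ~ x[second])

# Ratio of the residual variances (deviance() is the residual sum of squares)
gq_stat <- (deviance(fit2) / df.residual(fit2)) /
           (deviance(fit1) / df.residual(fit1))

# One-sided p-value from the F distribution
gq_p <- pf(gq_stat, df.residual(fit2), df.residual(fit1), lower.tail = FALSE)
```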
harv – Harvey Collier Test:
The Harvey-Collier test performs a t-test (with parameter
degrees of freedom) on the recursive residuals. If the true relationship
is not linear but convex or concave the mean of the recursive residuals
should differ from 0 significantly.
[lmtest:harvtest]
hmc – Harrison McCabe Test:
The Harrison–McCabe test statistic is the fraction of the residual
sum of squares that relates to the fraction of the data before the
breakpoint. Under H_0 the test statistic should be close to
the size of this fraction, e.g. in the default case close to 0.5.
The null hypothesis is rejected if the statistic is too small.
[lmtest:hmctest]
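The statistic and a simulated p-value can be sketched in base R as follows. The Monte Carlo scheme shown here (standard normal responses regressed on the same design matrix) is an assumption chosen for illustration and may differ from what the package does; hmc_stat is a hypothetical helper.

```r
# Fraction of the residual sum of squares that falls before the breakpoint
hmc_stat <- function(e, m) sum(e[seq_len(m)]^2) / sum(e^2)

set.seed(5)
n <- 100
x <- rep(c(-1, 1), n / 2)
y <- 1 + x + c(rnorm(n / 2, sd = 1), rnorm(n / 2, sd = 2))

m <- floor(0.5 * n)          # breakpoint at half of the data
fit <- lm(y ~ x)
obs <- hmc_stat(residuals(fit), m)

# Simulate the statistic under H_0: residuals of pure noise on the same design
qrX <- qr(model.matrix(fit))
sim <- replicate(1000, hmc_stat(qr.resid(qrX, rnorm(n)), m))

# Reject for small values of the statistic
p_sim <- mean(sim < obs)
```

Under H_0 the simulated values scatter around 0.5, in line with the description above.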
rain – Rainbow Test:
The basic idea of the Rainbow test is that even if the true
relationship is nonlinear, a good linear fit can be achieved
on a subsample in the "middle" of the data. The null hypothesis
is rejected whenever the overall fit is significantly inferior
to the fit of the subsample. The test statistic under H_0
follows an F distribution with parameter degrees of
freedom.
[lmtest:raintest]
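The comparison of the overall fit with the fit on a central subsample can be sketched in base R. The sketch assumes the subsample is the middle half of the observations in their given order and uses the usual F ratio for comparing the two residual sums of squares; the index construction and variable names are illustrative assumptions, not the package's code.

```r
set.seed(7)
n <- 30
x <- 1:n
y <- x^2 + rnorm(n, 0, 2)

# Central subsample containing a fraction of 0.5 of the observations
fraction <- 0.5
n_sub <- floor(fraction * n)
from <- ceiling(n * (0.5 - fraction / 2))
idx <- from:(from + n_sub - 1)

# Fit on all data and on the subsample
fit_all <- lm(y ~ x)
fit_sub <- lm(y ~ x, subset = idx)
k <- length(coef(fit_all))

# F ratio comparing the two residual sums of squares
rss_all <- deviance(fit_all)
rss_sub <- deviance(fit_sub)
rain_stat <- ((rss_all - rss_sub) / (n - n_sub)) / (rss_sub / (n_sub - k))
rain_df <- c(n - n_sub, n_sub - k)
rain_p <- pf(rain_stat, rain_df[1], rain_df[2], lower.tail = FALSE)
```

Because the simulated relationship is quadratic, the linear fit on the full sample is much worse than on the central subsample and the p-value is small.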
reset – Ramsey's RESET Test:
The RESET test is a popular diagnostic for correctness of
functional form. The basic assumption is that under the alternative,
the model can be written as the regression
y = X * beta + Z * gamma.
Z is generated by taking powers either of the fitted response,
the regressor variables or the first principal component of X.
A standard F test is then applied to determine whether these additional
variables have significant influence. The test statistic under
H_0 follows an F distribution with parameter degrees
of freedom.
[lmtest:reset]
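For the "fitted" variant, the augmented regression and the F test can be sketched with lm() and anova() from base R. This is a sketch under the assumption that Z consists of the second and third powers of the fitted values; the variable names are made up for this example.

```r
set.seed(6)
x <- 1:30
y <- 1 + x + x^2 + rnorm(30)

# Restricted model and its fitted response
fit0 <- lm(y ~ x)
yhat <- fitted(fit0)

# Augmented model: Z holds powers 2 and 3 of the fitted response
fit1 <- lm(y ~ x + I(yhat^2) + I(yhat^3))

# Standard F test for the additional variables
tab <- anova(fit0, fit1)
reset_stat <- tab$F[2]
reset_df <- c(tab$Df[2], tab$Res.Df[2])
reset_p <- tab$`Pr(>F)`[2]
```

Since y depends on x^2 here, the added powers are highly significant and the test rejects the linear specification.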
Value
A list with class "htest" containing the following components:
statistic 
the value of the test statistic. 
parameter 
the degrees of freedom of the null distribution, where applicable (for "bg", the lag order). 
p.value 
the p-value of the test. 
method 
a character string indicating what type of test was performed. 
data.name 
a character string giving the name of the data. 
alternative 
a character string describing the alternative hypothesis. 
Note
The underlying lmtest package comes with a lot of helpful
examples. We highly recommend installing the lmtest package
and studying the examples given therein.
Author(s)
Achim Zeileis and Torsten Hothorn for the lmtest package,
Diethelm Wuertz for the Rmetrics R-port.
References
Breusch T.S. (1979); Testing for Autocorrelation in Dynamic Linear Models, Australian Economic Papers 17, 334–355.
Breusch T.S. and Pagan A.R. (1979); A Simple Test for Heteroscedasticity and Random Coefficient Variation, Econometrica 47, 1287–1294.
Durbin J. and Watson G.S. (1950); Testing for Serial Correlation in Least Squares Regression I, Biometrika 37, 409–428.
Durbin J. and Watson G.S. (1951); Testing for Serial Correlation in Least Squares Regression II, Biometrika 38, 159–178.
Durbin J. and Watson G.S. (1971); Testing for Serial Correlation in Least Squares Regression III, Biometrika 58, 1–19.
Farebrother R.W. (1980); Pan's Procedure for the Tail Probabilities of the Durbin-Watson Statistic, Applied Statistics 29, 224–227.
Farebrother R.W. (1984); The Distribution of a Linear Combination of chi^2 Random Variables, Applied Statistics 33, 366–369.
Godfrey L.G. (1978); Testing Against General Autoregressive and Moving Average Error Models when the Regressors Include Lagged Dependent Variables, Econometrica 46, 1293–1302.
Goldfeld S.M. and Quandt R.E. (1965); Some Tests for Homoskedasticity, Journal of the American Statistical Association 60, 539–547.
Harrison M.J. and McCabe B.P.M. (1979); A Test for Heteroscedasticity based on Ordinary Least Squares Residuals, Journal of the American Statistical Association 74, 494–499.
Harvey A. and Collier P. (1977); Testing for Functional Misspecification in Regression Analysis, Journal of Econometrics 6, 103–119.
Johnston J. (1984); Econometric Methods, Third Edition, McGraw Hill Inc.
Kraemer W. and Sonnberger H. (1986); The Linear Regression Model under Test, Heidelberg: Physica.
Racine J. and Hyndman R. (2002); Using R To Teach Econometrics, Journal of Applied Econometrics 17, 175–189.
Ramsey J.B. (1969); Tests for Specification Error in Classical Linear Least Squares Regression Analysis, Journal of the Royal Statistical Society, Series B 31, 350–371.
Utts J.M. (1982); The Rainbow Test for Lack of Fit in Regression, Communications in Statistics - Theory and Methods 11, 1801–1815.
Examples
## bg and dw
# Generate a Stationary and an AR(1) Series:
x = rep(c(-1, 1), 50)
y1 = 1 + x + rnorm(100)
# Perform Breusch-Godfrey Test for 1st order serial correlation:
lmTest(y1 ~ x, "bg")
# ... or for fourth order serial correlation:
lmTest(y1 ~ x, "bg", order = 4)
# Compare with Durbin-Watson Test Results:
lmTest(y1 ~ x, "dw")
y2 = filter(y1, 0.5, method = "recursive")
lmTest(y2 ~ x, "bg")
## bp 
# Generate a Regressor:
x = rep(c(-1, 1), 50)
# Generate heteroskedastic and homoskedastic Disturbances
err1 = rnorm(100, sd = rep(c(1, 2), 50))
err2 = rnorm(100)
# Generate a Linear Relationship:
y1 = 1 + x + err1
y2 = 1 + x + err2
# Perform Breusch-Pagan Test:
bp = lmTest(y1 ~ x, "bp")
bp
# Calculate Critical Value for 0.05 Level
qchisq(0.95, bp$parameter)
lmTest(y2 ~ x, "bp")
## dw 
# Generate two AR(1) Error Terms
# with parameter rho = 0 (white noise)
# and rho = 0.9 respectively
err1 = rnorm(100)
# Generate Regressor and Dependent Variable
x = rep(c(-1, 1), 50)
y1 = 1 + x + err1
# Perform Durbin-Watson Test:
lmTest(y1 ~ x, "dw")
err2 = filter(err1, 0.9, method = "recursive")
y2 = 1 + x + err2
lmTest(y2 ~ x, "dw")
## gq 
# Generate a Regressor:
x = rep(c(-1, 1), 50)
# Generate Heteroskedastic and Homoskedastic Disturbances:
err1 = c(rnorm(50, sd = 1), rnorm(50, sd = 2))
err2 = rnorm(100)
# Generate a Linear Relationship:
y1 = 1 + x + err1
y2 = 1 + x + err2
# Perform Goldfeld-Quandt Test:
lmTest(y1 ~ x, "gq")
lmTest(y2 ~ x, "gq")
## harv 
# Generate a Regressor and Dependent Variable:
x = 1:50
y1 = 1 + x + rnorm(50)
y2 = y1 + 0.3*x^2
# Perform Harvey-Collier Test:
harv = lmTest(y1 ~ x, "harv")
harv
# Calculate Critical Value for 0.05 Level:
qt(0.95, harv$parameter)
lmTest(y2 ~ x, "harv")
## hmc 
# Generate a Regressor:
x = rep(c(-1, 1), 50)
# Generate Heteroskedastic and Homoskedastic Disturbances:
err1 = c(rnorm(50, sd = 1), rnorm(50, sd = 2))
err2 = rnorm(100)
# Generate a Linear Relationship:
y1 = 1 + x + err1
y2 = 1 + x + err2
# Perform Harrison-McCabe Test:
lmTest(y1 ~ x, "hmc")
lmTest(y2 ~ x, "hmc")
## rain 
# Generate Series:
x = c(1:30)
y = x^2 + rnorm(30, 0, 2)
# Perform Rainbow Test:
rain = lmTest(y ~ x, "rain")
rain
# Compute Critical Value:
qf(0.95, rain$parameter[1], rain$parameter[2])
## reset 
# Generate Series:
x = c(1:30)
y1 = 1 + x + x^2 + rnorm(30)
y2 = 1 + x + rnorm(30)
# Perform RESET Test:
lmTest(y1 ~ x , "reset", power = 2, type = "regressor")
lmTest(y2 ~ x , "reset", power = 2, type = "regressor")
