Goodness-of-Fit tests in linear regression models

Description

This routine tests whether the coefficient vector, β, of a linear regression model equals a given parameter vector, β_0, based on a sample {(Y_i, X_{i1},...,X_{ip}): i=1,...,n}, where:

β = (β_1,...,β_p)

is an unknown vector parameter and

Y_i = X_{i1}*β_1+ ... + X_{ip}*β_p + ε_i.

The random errors, ε_i, are allowed to form a time series, that is, to be serially dependent. The statistic for testing the null hypothesis, H0: β = β_0, is derived from the asymptotic normality of the ordinary least squares estimator of β, which yields a χ^2-test.
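As an illustration of how asymptotic normality of the least squares estimator leads to a χ^2-test, here is a minimal sketch in base R. The helper wald_gof and its arguments are hypothetical and are not part of the package; the statistic actually computed by par.gof may differ in its details. The sketch assumes a quadratic form in (β̂ - β_0), weighted by the inverse of an estimate of the covariance matrix of β̂, and compares it with a χ^2 distribution with p degrees of freedom.

```r
# Hypothetical helper (not part of the package): Wald-type chi-square test
# of H0: beta = beta0 based on the ordinary least squares estimator.
wald_gof <- function(y, X, beta0 = rep(0, ncol(X)), V = NULL) {
  n <- nrow(X)
  p <- ncol(X)
  XtX_inv  <- solve(crossprod(X))
  beta_hat <- XtX_inv %*% crossprod(X, y)        # OLS estimate of beta
  res      <- y - X %*% beta_hat                 # OLS residuals
  if (is.null(V)) {
    # Assumption: independent, homoscedastic errors,
    # so Var(beta_hat) = sigma^2 (X'X)^{-1}
    sigma2   <- sum(res^2) / (n - p)
    cov_beta <- sigma2 * XtX_inv
  } else {
    # Assumption: V is the n x n covariance matrix of the errors,
    # so Var(beta_hat) = (X'X)^{-1} X' V X (X'X)^{-1}
    cov_beta <- XtX_inv %*% t(X) %*% V %*% X %*% XtX_inv
  }
  d <- beta_hat - beta0
  Q <- drop(t(d) %*% solve(cov_beta, d))         # quadratic form
  c(Q.beta = Q, p.value = pchisq(Q, df = p, lower.tail = FALSE))
}

# Simulated illustration with independent errors
set.seed(1)
n <- 200
X <- matrix(rnorm(2 * n), nrow = n)
y <- drop(X %*% c(0.5, -0.2)) + rnorm(n, sd = 0.1)

res_true  <- wald_gof(y, X, beta0 = c(0.5, -0.2))  # H0 holds
res_false <- wald_gof(y, X)                        # H0: beta = 0 does not hold
```

Passing a matrix through the hypothetical argument V plays a role similar to supplying Var.Cov.eps for dependent errors.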

Usage

par.gof(data = data, beta0 = NULL, time.series = FALSE, 
Var.Cov.eps = NULL, p.max = 3, q.max = 3, ic = "BIC", 
num.lb = 10, alpha = 0.05)

Arguments

data

data[, 1] contains the values of the response variable, Y;

data[, 2:(p+1)] contains the values of the explanatory variables, X_1, ..., X_p.

beta0

the parameter vector, β_0, specified in the null hypothesis. If NULL (the default), the zero vector is used.

time.series

it indicates whether the data are independent (FALSE) or form a time series (TRUE). The default is FALSE.

Var.Cov.eps

n x n variance-covariance matrix of the random errors of the regression model. If NULL (the default), the function tries to estimate it: it fits an ARMA model (selected according to an information criterion) to the residuals from the fitted linear regression model and then obtains the variance-covariance matrix of that ARMA model.

p.max

if Var.Cov.eps=NULL, the ARMA model is selected among the ARMA(p,q) models with 0<=p<=p.max and 0<=q<=q.max. The default is 3.

q.max

if Var.Cov.eps=NULL, the ARMA model is selected among the ARMA(p,q) models with 0<=p<=p.max and 0<=q<=q.max. The default is 3.

ic

if Var.Cov.eps=NULL, ic specifies the information criterion used to select the ARMA model. Allowed values are "AIC", "AICC" and "BIC" (the default).

num.lb

if Var.Cov.eps=NULL, the suitability of the selected ARMA model is checked with the Ljung-Box test and the t-test. Up to num.lb lags are used in the Ljung-Box test. The default is 10.

alpha

if Var.Cov.eps=NULL, alpha is the significance level at which the ARMA model is checked. The default is 0.05.

Details

If Var.Cov.eps=NULL and the routine cannot suggest an approximation for Var.Cov.eps, it warns the user that the model may not be appropriate and then shows the results. To construct Var.Cov.eps, the procedure suggested in Domowitz (1982) can be followed.
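The fallback described for Var.Cov.eps=NULL (fit ARMA models to the residuals, pick one by an information criterion, check it, and derive its covariance matrix) can be sketched roughly in base R. The helper arma_cov below is hypothetical and is not the package's internal code. It assumes BIC as the criterion, builds the covariance matrix from the autocorrelations of the selected model, and does not reproduce the t-test check or the warning logic.

```r
# Hypothetical helper (not the package's internal code): select an ARMA(p, q)
# model for a residual series by BIC and build the implied n x n covariance
# matrix of the errors.
arma_cov <- function(res, p.max = 3, q.max = 3, num.lb = 10) {
  n <- length(res)
  best <- NULL
  for (p in 0:p.max) for (q in 0:q.max) {
    fit <- tryCatch(arima(res, order = c(p, 0, q), include.mean = FALSE),
                    error = function(e) NULL)
    if (is.null(fit)) next                      # skip orders that fail to fit
    bic <- AIC(fit, k = log(n))                 # BIC via the AIC generic
    if (is.null(best) || bic < best$bic)
      best <- list(fit = fit, bic = bic, p = p, q = q)
  }
  if (is.null(best)) stop("no ARMA model could be fitted")

  co <- coef(best$fit)
  ar <- co[grep("^ar", names(co))]
  ma <- co[grep("^ma", names(co))]
  sigma2 <- best$fit$sigma2                     # innovation variance

  if (length(ar) == 0 && length(ma) == 0) {
    V <- diag(sigma2, n)                        # white noise: diagonal matrix
  } else {
    # Process variance from the (truncated) MA(infinity) representation
    psi    <- ARMAtoMA(ar = ar, ma = ma, lag.max = 500)
    gamma0 <- sigma2 * (1 + sum(psi^2))
    rho    <- ARMAacf(ar = ar, ma = ma, lag.max = n - 1)
    V      <- gamma0 * toeplitz(as.numeric(rho))
  }

  # Ljung-Box check of the fitted model's residuals
  lb <- Box.test(residuals(best$fit), lag = num.lb, type = "Ljung-Box",
                 fitdf = best$p + best$q)
  list(ar.ma = c(best$p, best$q), pv.Box.test = lb$p.value, V = V)
}

# Illustration on a simulated AR(1) series
set.seed(1234)
eps <- as.numeric(arima.sim(list(ar = 0.7), n = 200))
out <- arma_cov(eps)
```

The element out$V has the shape expected for Var.Cov.eps. Truncating the MA(infinity) representation at 500 terms is an arbitrary choice made for this sketch.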

The implemented procedure specializes the parametric test in the routine plrm.gof to the case where the nonparametric component of the corresponding PLR model is known to be null.

Value

A list with a data frame containing:

Q.beta

value of the test statistic.

p.value

p-value of the corresponding test.

Moreover, if data is a time series and Var.Cov.eps is not specified:

pv.Box.test

p-values of the Ljung-Box test for the model fitted to the residuals.

pv.t.test

p-values of the t-test for the model fitted to the residuals.

ar.ma

ARMA orders for the model fitted to the residuals.

Author(s)

German Aneiros Perez ganeiros@udc.es

Ana Lopez Cheda ana.lopez.cheda@udc.es

References

Domowitz, J. (1982) The linear model with stochastic regressors and heteroscedastic dependent errors. Discussion paper No 543, Center for Mathematical Studies in Economics and Management Science, Northwestern University, Evanston, Illinois.

Judge, G.G., Griffiths, W.E., Carter Hill, R., Lutkepohl, H. and Lee, T-C. (1980) The Theory and Practice of Econometrics. Wiley.

Seber, G.A.F. (1977) Linear Regression Analysis. Wiley.

See Also

Other related functions are np.gof and plrm.gof.

Examples

# EXAMPLE 1: REAL DATA
data(barnacles1)
data <- as.matrix(barnacles1)
data <- diff(data, 12)
data <- cbind(data[,1],1,data[,-1])

## Example 1.1: false null hypothesis
par.gof(data)
## Example 1.2: true null hypothesis
par.gof(data, beta0=c(0,0.15,0.4))



# EXAMPLE 2: SIMULATED DATA
## Example 2a: dependent data

set.seed(1234)
# We generate the data
n <- 100
beta <- c(0.05, 0.01)

x <- matrix(rnorm(2*n,0,1), nrow=n)
# Linear predictor (named lin.pred to avoid masking base::sum)
lin.pred <- x%*%beta
epsilon <- arima.sim(list(order = c(1,0,0), ar=0.7), sd = 0.01, n = n)
y <-  lin.pred + epsilon
data <- cbind(y,x)

## Example 2a.1: true null hypothesis
par.gof(data, beta0=c(0.05, 0.01))

## Example 2a.2: false null hypothesis
par.gof(data) 
