testLMNormal: Apply Goodness of Fit Test to Residuals of a Linear Model
In gofedf: Goodness of Fit Tests Based on Empirical Distribution Functions

testLMNormal

R Documentation

Description

testLMNormal checks the normality assumption for the residuals of a linear model. It can take a response variable and a design matrix, fit the linear model, and apply the goodness-of-fit test to the residuals. Alternatively, it can take an object of class "lm" and apply the test directly. The function returns the goodness-of-fit statistic along with an approximate p-value.
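
To make the idea concrete, the sketch below computes a Cramér-von Mises statistic by hand from the residuals of a fitted linear model, using only base R. It illustrates the general technique (standardize the residuals, apply the normal CDF, measure the distance from the uniform distribution) and is not the package's implementation; in particular, it does not compute a p-value, which has to account for the estimated regression parameters. The helper name `cvm_stat`, the choice of scale estimate, and the simulated data are assumptions made for the example.

```r
set.seed(1)
n <- 40
x <- runif(n)
y <- 1 + 2 * x + rnorm(n)

fit <- lm(y ~ x)

# Standardize residuals with the maximum-likelihood scale estimate
# (divisor n); this choice is an assumption of the sketch.
res <- residuals(fit)
sigma_hat <- sqrt(sum(res^2) / n)

# Probability integral transforms (PITs): approximately uniform on (0, 1)
# when the normal error assumption holds.
pit <- pnorm(res / sigma_hat)

# Hypothetical helper: Cramer-von Mises distance between the sorted
# PITs and the uniform distribution.
cvm_stat <- function(u) {
  n <- length(u)
  u <- sort(u)
  i <- seq_len(n)
  1 / (12 * n) + sum((u - (2 * i - 1) / (2 * n))^2)
}

w2 <- cvm_stat(pit)
w2
```

Larger values of the statistic indicate a larger departure from normality; judging significance still requires the approximate p-value that testLMNormal returns.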

Usage

testLMNormal(
  x,
  y,
  fit = NULL,
  discretize = FALSE,
  ngrid = length(y),
  gridpit = TRUE,
  hessian = FALSE,
  method = "cvm"
)

Arguments

`x`	either a numeric vector or a design matrix. In the design matrix, rows are observations and columns are covariates.
`y`	a numeric vector whose length equals the number of observations (the length or number of rows of `x`).
`fit`	an object of class "lm" returned by the `lm` function in the `stats` package. The default value is NULL. If an object is provided, `x` and `y` are ignored and the class of the object is checked. When passing an object to `fit`, make sure it contains the design matrix and the response variable by setting `x = TRUE` and `y = TRUE` in the call to `lm`. For details, see the help documentation for `lm` or the example below.
`discretize`	logical. If `TRUE`, the covariance function of the `W_{n}(u)` process is evaluated at a set of points (see `ngrid` and `gridpit`), and the integral equation is replaced by a matrix equation. If `FALSE` (the default value), the covariance function is first estimated, and then the integral equation is solved to find the eigenvalues. Our simulations favor solving the integral equation with the estimated covariance function, that is, `discretize = FALSE`. The parameters `ngrid`, `gridpit`, and `hessian` are only relevant when `discretize = TRUE`.
`ngrid`	the number of equally spaced points used to discretize the (0,1) interval when computing the covariance function.
`gridpit`	logical. If `TRUE` (the default value), `ngrid` is ignored and the (0,1) interval is divided at the probability integral transforms (PITs) obtained from the sample. If `FALSE`, the interval is divided into `ngrid` equally spaced points for computing the covariance function.
`hessian`	logical. If `TRUE`, the Fisher information matrix is estimated by the observed Hessian matrix based on the sample. If `FALSE` (the default value), the Fisher information matrix is estimated by the variance of the observed score matrix.
`method`	a character string indicating which goodness-of-fit statistic is to be computed. The default value is 'cvm' for the Cramér-von Mises statistic. Other options are 'ad' for the Anderson-Darling statistic and 'both' to compute both statistics.
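
The requirement on `fit` can be checked before calling the function. By default, `lm` stores neither the design matrix nor the response in the fitted object; setting `x = TRUE` and `y = TRUE` adds them as the `x` and `y` components. The sketch below uses only base R, and the data and object names are made up for illustration.

```r
set.seed(3)
dat <- data.frame(x1 = runif(30), x2 = runif(30))
dat$y <- 1 + dat$x1 - dat$x2 + rnorm(30)

fit_default <- lm(y ~ x1 + x2, data = dat)
fit_full    <- lm(y ~ x1 + x2, data = dat, x = TRUE, y = TRUE)

# Use [["x"]] rather than $x: `$` partially matches names, so on the
# default fit it would pick up the `xlevels` component instead.
is.null(fit_default[["x"]])  # TRUE: design matrix not stored by default
dim(fit_full[["x"]])         # 30 rows, 3 columns (intercept, x1, x2)
length(fit_full[["y"]])      # 30
inherits(fit_full, "lm")     # TRUE
```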

Value

A list with two components:

Statistic: the value of the goodness-of-fit statistic.
p-value: the approximate p-value for the goodness-of-fit test.

If method = 'cvm' or method = 'ad', each component is a single numeric value. If method = 'both', each component is a numeric vector with two elements, one for each statistic.

Examples

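library(gofedf)

# Simulate a linear model with normal errors
set.seed(123)
n <- 50
p <- 5
x <- matrix(runif(n * p), nrow = n, ncol = p)
e <- rnorm(n)
b <- runif(p)
y <- x %*% b + e

# Pass the design matrix and response; the model is fitted internally
testLMNormal(x, y)

# Or pass a fitted lm object directly; x = TRUE and y = TRUE keep the
# design matrix and response in the object
fit_lm <- lm(y ~ x, x = TRUE, y = TRUE)
testLMNormal(fit = fit_lm)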
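
As a further hedged sketch, the call below requests both statistics for a model whose errors are deliberately non-normal. It uses only arguments documented above (`fit` and `method`) and is guarded so that the test runs only when the gofedf package is installed. The simulated data and the object names `fit_skew` and `res_both` are assumptions for illustration, and no particular output is claimed.

```r
set.seed(4)
n <- 60
x1 <- runif(n)
# Centered exponential errors: skewed, so the normality assumption is violated
y1 <- 1 + 2 * x1 + (rexp(n) - 1)

# Keep the design matrix and response in the fitted object
fit_skew <- lm(y1 ~ x1, x = TRUE, y = TRUE)

if (requireNamespace("gofedf", quietly = TRUE)) {
  # method = "both" asks for Cramer-von Mises and Anderson-Darling together
  res_both <- gofedf::testLMNormal(fit = fit_skew, method = "both")
  print(res_both)
}
```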

gofedf documentation built on June 8, 2025, 10:52 a.m.