R/breusch_pagan.R

Defines functions breusch_pagan

Documented in breusch_pagan

#' Breusch-Pagan Test for Heteroskedasticity in a Linear Regression Model
#'
#' This function implements the popular method of
#'    \insertCite{Breusch79;textual}{skedastic} for testing for
#'    heteroskedasticity in a linear regression model, with or without the
#'    studentising modification of \insertCite{Koenker81;textual}{skedastic}.
#'
#' @details The Breusch-Pagan Test entails fitting an auxiliary regression
#'    model in which the response variable is the vector of squared residuals
#'    from the original model and the design matrix \eqn{Z} consists of one or
#'    more exogenous variables that are suspected of being related to the error
#'    variance. In the absence of prior information on a possible choice of
#'    \eqn{Z}, one would typically use the explanatory variables from the
#'    original model. Under the null hypothesis of homoskedasticity, the
#'    distribution of the test statistic is asymptotically chi-squared with
#'    \code{parameter} degrees of freedom. The test is right-tailed.
#'
#' @param mainlm Either an object of \code{\link[base]{class}} \code{"lm"}
#'    (e.g., generated by \code{\link[stats]{lm}}), or
#'    a list of two objects: a response vector and a design matrix. The objects
#'    are assumed to be in that order, unless they are given the names
#'    \code{"X"}  and \code{"y"} to distinguish them. The design matrix passed
#'    in a list must begin with a column of ones if an intercept is to be
#'    included in the linear model. The design matrix passed in a list should
#'    not contain factors, as all columns are treated 'as is'. For tests that
#'    use ordinary least squares residuals, one can also pass a vector of
#'    residuals in the list, which should either be the third object or be
#'    named \code{"e"}.
#' @param auxdesign A \code{\link[base]{data.frame}} or
#'    \code{\link[base]{matrix}} representing an auxiliary design matrix of
#'    containing exogenous variables that (under alternative hypothesis) are
#'    related to error variance, or a character "fitted.values" indicating
#'    that the fitted \eqn{\hat{y}_i} values from OLS should be used.
#'    If set to \code{NA} (the default), the
#'    design matrix of the original regression model is used. An intercept
#'    is included in the auxiliary regression even if the first column of
#'    \code{auxdesign} is not a vector of ones.
#' @param koenker A logical. Should studentising modification of
#'    \insertCite{Koenker81;textual}{skedastic} be implemented? Defaults to
#'    \code{TRUE}; if \code{FALSE}, the original form of the test proposed by
#'    \insertCite{Breusch79;textual}{skedastic} is used.
#' @param statonly A logical. If \code{TRUE}, only the test statistic value
#'    is returned, instead of an object of \code{\link[base]{class}}
#'    \code{"htest"}. Defaults to \code{FALSE}.
#'
#' @return An object of \code{\link[base]{class}} \code{"htest"}. If object
#'    is not assigned, its attributes are displayed in the console as a
#'    \code{\link[tibble]{tibble}} using \code{\link[broom]{tidy}}.
#' @references{\insertAllCited{}}
#' @importFrom Rdpack reprompt
#' @export
#' @seealso \code{\link[lmtest:bptest]{lmtest::bptest}}, which performs exactly
#'    the same test as this function; \code{\link[car:ncvTest]{car::ncvTest}},
#'    which is not the same test but is implemented in
#'    \code{\link{cook_weisberg}}; \code{\link{white}}, which is a special
#'    case of the Breusch-Pagan Test.
#'
#' @examples
#' mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
#' breusch_pagan(mtcars_lm)
#' breusch_pagan(mtcars_lm, koenker = FALSE)
#' # Same as first example
#' mtcars_list <- list("y" = mtcars$mpg, "X" = cbind(1, mtcars$wt, mtcars$qsec, mtcars$am))
#' breusch_pagan(mtcars_list)
#'

breusch_pagan <- function(mainlm, auxdesign = NA, koenker = TRUE,
                           statonly = FALSE) {

  auxfitvals <- ifelse(all(is.na(auxdesign)) | is.null(auxdesign), FALSE,
                                    auxdesign == "fitted.values")
  processmainlm(m = mainlm, needy = auxfitvals, needyhat = auxfitvals,
                needp = FALSE)

  if (all(is.na(auxdesign)) || is.null(auxdesign)) {
    Z <- X
  } else if (is.character(auxdesign)) {
    if (auxdesign == "fitted.values") {
      Z <- t(t(yhat))
    } else stop("Invalid character value for `auxdesign`")
  } else {
    Z <- auxdesign
    if (nrow(auxdesign) != nrow(X)) stop("No. of observations in `auxdesign`
                                         must match\nno. of observations in
                                         original model.")
  }

  hasintercept <- columnof1s(Z)
  if (!hasintercept[[1]]) {
    Z <- cbind(1, Z)
    message("Column of 1's added to `auxdesign`")
  }

  q <- ncol(Z) - 1
  n <- nrow(Z)
  sigma_barsq <- sum(e ^ 2) / n
  w_hat <- e ^ 2 - sigma_barsq

  if (koenker) {
    method <- "Koenker (studentised)"
    teststat <- n * sum(stats::lm.fit(Z, w_hat)$fitted.values ^ 2) / sum(w_hat ^ 2)
  } else {
    method <- "Breusch-Pagan (non-studentised)"
    teststat <- sum(stats::lm.fit(Z, w_hat)$fitted.values ^ 2) / (2 * sigma_barsq ^ 2)
  }
  if (statonly) return(teststat)
  pval <- stats::pchisq(teststat, df = q, lower.tail = FALSE)

  rval <- structure(list(statistic = teststat, parameter = q, p.value = pval,
               null.value = "Homoskedasticity",
               alternative = "greater", method = method), class = "htest")
  broom::tidy(rval)
}

Try the skedastic package in your browser

Any scripts or data that you put into this service are public.

skedastic documentation built on Nov. 10, 2022, 5:43 p.m.