compare_models: Comparison of nested models

View source: R/compare_models.R

compare_modelsR Documentation

Comparison of nested models

Description

Compares nested models using the adjusted likelihood ratio test statistic (ALRTS) described in Section 3.5 of Chandler and Bate (2007). The nesting must result from the simple constraint that a subset of the parameters of the larger model is held fixed.

Usage

compare_models(
  larger,
  smaller = NULL,
  approx = FALSE,
  type = c("vertical", "cholesky", "spectral", "none"),
  fixed_pars = NULL,
  fixed_at = rep_len(0, length(fixed_pars)),
  init = NULL,
  ...
)

Arguments

larger

An object of class "chandwich" returned by adjust_loglik. The larger of the two models.

smaller

An object of class "chandwich" returned by adjust_loglik. The smaller of the two models.

If smaller is supplied then the arguments fixed_pars and fixed_at described below are ignored.

approx

A logical scalar. If approx = TRUE then the approximation detailed by equations (18)-(20) of Chandler and Bate (2007) is used. This option is available only if smaller is supplied. If smaller is not supplied then approx = TRUE is used, with no warning.

The approximation doesn't make sense if type = "none". If type = "none" and approx = TRUE then approx is set to FALSE with no warning.

If approx = FALSE then the adjusted likelihood is maximised under the restriction imposed by delta, that is, equation (17) of Chandler and Bate (2007) is used.

type

A character scalar. The argument type to the function returned by adjust_loglik, that is, the type of adjustment made to the independence loglikelihood function.

fixed_pars

A numeric vector. Indices of the components of the full parameter vector that are restricted to be equal to the value(s) in fixed_at.

fixed_at

A numeric vector of length 1 or length(fixed_pars). If length(fixed_at) = 1 then the components fixed_pars of the parameter vector are all fixed at fixed_at. If length(fixed_at) = length(fixed_pars) then the component fixed_pars[i] is fixed at fixed_at[i] for each i.

init

(Only relevant if approx = FALSE). A numeric vector of initial values for use in the search for the MLE under the smaller model. Must have length equal to the number of parameters in the smaller of the two models being compared. If init is not supplied, or it is of the wrong length, then attr(smaller, "MLE") is used if smaller is supplied and attr(larger, "MLE")[-fixed_pars] otherwise.

...

Further arguments to be passed to optim. These may include gr, method, lower, upper or control.

Details

The smaller of the two models is specified either by supplying smaller or fixed_pars. If both are supplied then smaller takes precedence.

For full details see Section 3.5 of Chandler and Bate (2007). If approx = FALSE then the a likelihood ratio test of the null hypothesis that the smaller model is a valid simplification of the larger model is carried out directly using equation (17) of Chandler and Bate (2007) based on the adjusted loglikelihood under the larger model, returned by adjust_loglik. This adjusted loglikelihood is maximised subject to the constraint that a subset of the parameters in the larger model are fixed. If smaller is supplied then this maximisation can be avoided using an approximation detailed by equations (18)-(20) of Chandler and Bate (2007), which uses the MLE under the smaller model. The same null distribution (chi-squared with degrees of freedom equal to the number of parameters that are fixed) is used in both cases.

Value

An object of class "compmod", a list with components

alrts

the adjusted likelihood ratio test statistic.

df

under the null hypothesis that the smaller model is a valid simplification of the larger model the adjusted likelihood ratio test statistic has a chi-squared distribution with df degrees of freedom.

p_value

the p-value associated with the test of the null hypothesis.

larger_mle

the MLE of the parameters under the larger model.

smaller_mle

the MLE of the parameters under the smaller model.

larger_fixed_pars, smaller_fixed_pars

Numeric vectors of the indices of parameters fixed in the larger and smaller models, respectively.

larger_fixed_at, smaller_fixed_at

Numeric vectors of the values at which the parameters in larger_fixed_pars and smaller_fixed_pars are fixed.

approx

the argument approx.

References

Chandler, R. E. and Bate, S. (2007). Inference for clustered data using the independence loglikelihood. Biometrika, 94(1), 167-183. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1093/biomet/asm015")}

See Also

adjust_loglik to adjust a user-supplied loglikelihood function.

conf_intervals for confidence intervals for individual parameters.

conf_region for a confidence region for pairs of parameters.

print.compmod.

Examples

# -------------------------- GEV model, owtemps data -----------------------
# ------------ following Section 5.2 of Chandler and Bate (2007) -----------

gev_loglik <- function(pars, data) {
  o_pars <- pars[c(1, 3, 5)] + pars[c(2, 4, 6)]
  w_pars <- pars[c(1, 3, 5)] - pars[c(2, 4, 6)]
  if (isTRUE(o_pars[2] <= 0 | w_pars[2] <= 0)) return(-Inf)
  o_data <- data[, "Oxford"]
  w_data <- data[, "Worthing"]
  check <- 1 + o_pars[3] * (o_data - o_pars[1]) / o_pars[2]
  if (isTRUE(any(check <= 0))) return(-Inf)
  check <- 1 + w_pars[3] * (w_data - w_pars[1]) / w_pars[2]
  if (isTRUE(any(check <= 0))) return(-Inf)
  o_loglik <- log_gev(o_data, o_pars[1], o_pars[2], o_pars[3])
  w_loglik <- log_gev(w_data, w_pars[1], w_pars[2], w_pars[3])
  return(o_loglik + w_loglik)
}

# Initial estimates (method of moments for the Gumbel case)
sigma <- as.numeric(sqrt(6 * diag(var(owtemps))) / pi)
mu <- as.numeric(colMeans(owtemps) - 0.57722 * sigma)
init <- c(mean(mu), -diff(mu) / 2, mean(sigma), -diff(sigma) / 2, 0, 0)

# Log-likelihood adjustment of the full model
par_names <- c("mu[0]", "mu[1]", "sigma[0]", "sigma[1]", "xi[0]", "xi[1]")
large <- adjust_loglik(gev_loglik, data = owtemps, init = init,
         par_names = par_names)

# Log-likelihood adjustment of some smaller models: xi[1] = 0 etc

medium <- adjust_loglik(larger = large, fixed_pars = "xi[1]")
small <- adjust_loglik(larger = medium, fixed_pars = c("sigma[1]", "xi[1]"))

# Tests

# Test xi1 = 0 (2 equivalent ways), vertical adjustment
compare_models(large, fixed_pars = "xi[1]")
compare_models(large, medium)
# Test xi1 = 0, using approximation
compare_models(large, medium, approx = TRUE)

# Horizontal adjustments
compare_models(large, medium, type = "cholesky")$p_value
compare_models(large, medium, type = "spectral")$p_value
# No adjustment (independence loglikelihood)
compare_models(large, medium, type = "none")$p_value

# Test sigma1 = 0 for model with xi1 = 0
compare_models(medium, small)
# Test sigma1 = xi1 = 0
compare_models(large, small)

# --------- Misspecified Poisson model for negative binomial data ----------

# ... following Section 5.1 of the "Object-Oriented Computation of Sandwich
# Estimators" vignette of the sandwich package
# https://cran.r-project.org/web/packages/sandwich/vignettes/sandwich-OOP.pdf

# Simulate data
set.seed(123)
x <- rnorm(250)
y <- rnbinom(250, mu = exp(1 + x), size = 1)
# Fit misspecified Poisson model
fm_pois <- glm(y ~ x + I(x^2), family = poisson)
summary(fm_pois)$coefficients

# Contributions to the independence loglikelihood
pois_glm_loglik <- function(pars, y, x) {
  log_mu <- pars[1] + pars[2] * x + pars[3] * x ^ 2
  return(dpois(y, lambda = exp(log_mu), log = TRUE))
}
pars <- c("alpha", "beta", "gamma")
pois_quad <- adjust_loglik(pois_glm_loglik, y = y, x = x, par_names = pars)
summary(pois_quad)

pois_lin <- adjust_loglik(larger = pois_quad, fixed_pars = "gamma")

# Test the significance of the quadratic term
compare_models(pois_quad, pois_lin)$p_value
compare_models(pois_quad, pois_lin, approx = TRUE)$p_value

chandwich documentation built on Aug. 26, 2023, 1:07 a.m.