R/hypotheses.R

Defines functions hypotheses

Documented in hypotheses

#' (Non-)Linear Tests for Null Hypotheses, Joint Hypotheses, Equivalence, Non Superiority, and Non Inferiority
#'
#' @description
#' Uncertainty estimates are calculated as first-order approximate standard errors for linear or non-linear functions of a vector of random variables with known or estimated covariance matrix. In that sense, [`hypotheses`] emulates the behavior of the excellent and well-established [car::deltaMethod] and [car::linearHypothesis] functions, but it supports more models; requires fewer dependencies; expands the range of tests to equivalence and superiority/inferiority; and offers convenience features like robust standard errors.
#'
#' To learn more, read the hypothesis tests vignette, visit the
#' package website, or scroll down this page for a full list of vignettes:
#'
#' * <https://marginaleffects.com/chapters/hypothesis.html>
#' * <https://marginaleffects.com/>
#'
#' Warning #1: Tests are conducted directly on the scale defined by the `type` argument. For some models, it can make sense to conduct hypothesis or equivalence tests on the `"link"` scale instead of the `"response"` scale which is often the default.
#'
#' Warning #2: For hypothesis tests on objects produced by the `marginaleffects` package, it is safer to use the `hypothesis` argument of the original function.  Using `hypotheses()` may not work in certain environments, in lists, or when working programmatically with *apply style functions.
#'
#' Warning #3: The tests assume that the `hypothesis` expression is (approximately) normally distributed, which for non-linear functions of the parameters may not be realistic. More reliable confidence intervals can be obtained using the \code{inferences()} function with `method = "boot"`.
#'
#' @inheritParams comparisons
#' @param model Model object or object generated by the `comparisons()`, `slopes()`, or `predictions()` functions.
#' @param joint Joint test of statistical significance. The null hypothesis value can be set using the `hypothesis` argument.
#' - FALSE: Hypotheses are not tested jointly.
#' - TRUE: All parameters are tested jointly.
#' - String: A regular expression to match parameters to be tested jointly. `grep(joint, perl = TRUE)`
#' - Character vector of parameter names to be tested. Characters refer to the names of the vector returned by `marginaleffects::get_coef(object)`.
#' - Integer vector of indices. Which parameters positions to test jointly.
#' - Note: When using the `joint` argument, the `hypothesis` argument is limited to `NULL` or numeric values. Users can chain multiple `hypotheses()` for complex joint hypothesis tests.
#' @param conf_level NULL or numeric value between 0 and 1. Confidence level to use to build a confidence interval. When `NULL` and `model` was generated by `marginaleffects`, the confidence level is taken from the `conf_level` attribute of the model. Otherwise, the default value is 0.95.
#' @param joint_test A character string specifying the type of test, either "f" or "chisq". The null hypothesis is set by the `hypothesis` argument, with default null equal to 0 for all parameters.
#' @param df Degrees of freedom used to compute p values and confidence intervals. 
#'   - A single numeric value between 1 and `Inf`, or a numeric vector with length equal to the number of rows in the output. When `df` is `Inf`, the normal distribution is used. When `df` is finite, the `t` distribution is used. 
#'   - "residual": Calls [insight::get_df] to extract degrees of freedom from the model automatically.
#'   - When using `joint_test="f"`, the `df` argument should be a numeric vector of length 2.
#' @param multcomp Logical or string. If `TRUE` or string, apply multiple comparison adjustment to the p values and report family-wise confidence intervals. Valid strings: "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "single-step", "Shaffer", "Westfall", "free". When `multcomp` is `TRUE`, the "holm" method is used.
#'
#' @section Joint hypothesis tests:
#' The test statistic for the joint Wald test is calculated as (R * theta_hat - r)' * inv(R * V_hat * R') * (R * theta_hat - r) / Q,
#' where theta_hat is the vector of estimated parameters, V_hat is the estimated covariance matrix, R is a Q x P matrix for testing Q hypotheses on P parameters,
#' r is a Q x 1 vector for the null hypothesis, and Q is the number of rows in R. If the test is a Chi-squared test, the test statistic is not normalized.
#'
#' The p-value is then calculated based on either the F-distribution (for F-test) or the Chi-squared distribution (for Chi-squared test).
#' For the F-test, the degrees of freedom are Q and (n - P), where n is the sample size and P is the number of parameters.
#' For the Chi-squared test, the degrees of freedom are Q.
#'
#' @template equivalence
#' @examples
#' library(marginaleffects)
#' mod <- lm(mpg ~ hp + wt + factor(cyl), data = mtcars)
#'
#' hypotheses(mod)
#'
#' # Test of equality between coefficients
#' hypotheses(mod, hypothesis = "hp = wt")
#'
#' # Non-linear function
#' hypotheses(mod, hypothesis = "exp(hp + wt) = 0.1")
#'
#' # Robust standard errors
#' hypotheses(mod, hypothesis = "hp = wt", vcov = "HC3")
#'
#' # b1, b2, ... shortcuts can be used to identify the position of the
#' # parameters of interest in the output of
#' hypotheses(mod, hypothesis = "b2 = b3")
#'
#' # wildcard
#' hypotheses(mod, hypothesis = "b* / b2 = 1")
#'
#' # term names with special characters have to be enclosed in backticks
#' hypotheses(mod, hypothesis = "`factor(cyl)6` = `factor(cyl)8`")
#'
#' mod2 <- lm(mpg ~ hp * drat, data = mtcars)
#' hypotheses(mod2, hypothesis = "`hp:drat` = drat")
#'
#' # predictions(), comparisons(), and slopes()
#' mod <- glm(am ~ hp + mpg, data = mtcars, family = binomial)
#' cmp <- comparisons(mod, newdata = "mean")
#' hypotheses(cmp, hypothesis = "b1 = b2")
#'
#' mfx <- slopes(mod, newdata = "mean")
#' hypotheses(cmp, hypothesis = "b2 = 0.2")
#'
#' pre <- predictions(mod, newdata = datagrid(hp = 110, mpg = c(30, 35)))
#' hypotheses(pre, hypothesis = "b1 = b2")
#'
#' # The `hypothesis` argument can be used to compute standard errors for fitted values
#' mod <- glm(am ~ hp + mpg, data = mtcars, family = binomial)
#'
#' f <- function(x) predict(x, type = "link", newdata = mtcars)
#' p <- hypotheses(mod, hypothesis = f)
#' head(p)
#'
#' f <- function(x) predict(x, type = "response", newdata = mtcars)
#' p <- hypotheses(mod, hypothesis = f)
#' head(p)
#'
#' # Complex aggregation
#' # Step 1: Collapse predicted probabilities by outcome level, for each individual
#' # Step 2: Take the mean of the collapsed probabilities by group and `cyl`
#' library(dplyr)
#' library(MASS)
#' library(dplyr)
#' library(magrittr)
#'
#' dat <- transform(mtcars, gear = factor(gear))
#' mod <- polr(gear ~ factor(cyl) + hp, dat)
#'
#' aggregation_fun <- function(x) {
#'   predictions(x, vcov = FALSE) %>%
#'     mutate(group = ifelse(group %in% c("3", "4"), "3 & 4", "5")) %>%
#'     summarize(estimate = sum(estimate), .by = c("rowid", "cyl", "group")) %>%
#'     summarize(estimate = mean(estimate), .by = c("cyl", "group")) %>%
#'     rename(term = cyl)
#' }
#'
#' hypotheses(mod, hypothesis = aggregation_fun)
#'
#' # Equivalence, non-inferiority, and non-superiority tests
#' mod <- lm(mpg ~ hp + factor(gear), data = mtcars)
#' p <- predictions(mod, newdata = "median")
#' hypotheses(p, equivalence = c(17, 18))
#'
#' mfx <- avg_slopes(mod, variables = "hp")
#' hypotheses(mfx, equivalence = c(-.1, .1))
#'
#' cmp <- avg_comparisons(mod, variables = "gear", hypothesis = ~pairwise)
#' hypotheses(cmp, equivalence = c(0, 10))
#'
#' # joint hypotheses: character vector
#' model <- lm(mpg ~ as.factor(cyl) * hp, data = mtcars)
#' hypotheses(model, joint = c("as.factor(cyl)6:hp", "as.factor(cyl)8:hp"))
#'
#' # joint hypotheses: regular expression
#' hypotheses(model, joint = "cyl")
#'
#' # joint hypotheses: integer indices
#' hypotheses(model, joint = 2:3)
#'
#' # joint hypotheses: different null hypotheses
#' hypotheses(model, joint = 2:3, hypothesis = 1)
#' hypotheses(model, joint = 2:3, hypothesis = 1:2)
#'
#' # joint hypotheses: marginaleffects object
#' cmp <- avg_comparisons(model)
#' hypotheses(cmp, joint = "cyl")
#'
#' # Multiple comparison adjustment
#' # p values and family-wise confidence intervals
#' cmp <- avg_comparisons(model)
#' hypotheses(cmp, multcomp = "hochberg")
#'
#' @export
hypotheses <- function(
    model,
    hypothesis = NULL,
    vcov = NULL,
    conf_level = NULL,
    df = NULL,
    equivalence = NULL,
    joint = FALSE,
    joint_test = "f",
    multcomp = FALSE,
    numderiv = "fdforward",
    ...
) {
    hypothesis_is_formula <- isTRUE(checkmate::check_formula(hypothesis))

    if (inherits(model, c("predictions", "comparisons", "slopes", "hypotheses"))) {
        if (!is.null(vcov)) {
            msg <- "The `vcov` argument is not available when `model` is a `predictions`, `comparisons`, `slopes`, or `hypotheses` object. Please specify the type of standard errors in the initial `marginaleffects` call."
            stop(msg, call. = FALSE)
        }
    }

    checkmate::assert_number(conf_level, null.ok = TRUE, lower = 0, upper = 1)
    if (is.null(conf_level)) {
        # inherit conf_level from marginaleffects objects
        conf_level <- attr(model, "conf_level")
        if (is.null(conf_level)) {
            conf_level <- 0.95
        }
    }

    # I don't know how to adjust p values for a different null in `multcomp::glht()`
    if (!isFALSE(multcomp) && isTRUE(checkmate::check_number(hypothesis))) {
        msg <- "The `multcomp` argument is not available when `hypothesis` is a number."
        stop(msg, call. = FALSE)
    }

    if (!isFALSE(multcomp) && !isFALSE(joint)) {
        msg <- "The `multcomp` argument cannot be used with the `joint` argument."
        insight::format_error(msg)
    }

    call_attr <- construct_call(model, "hypotheses")

    if ("modeldata" %in% ...names()) {
        call_attr[["modeldata"]] <- ...elt(match("modeldata", ...names())[1L])
    }

    # multiple imputation
    if (inherits(model, c("mira", "amest"))) {
        out <- process_imputation(model, call_attr)
        return(out)
    }

    ###### Joint test
    if (!isFALSE(joint)) {
        out <- joint_test(
            object = model,
            joint_index = joint,
            joint_test = joint_test,
            hypothesis = hypothesis,
            df = df,
            vcov = vcov
        )
        return(out)
    }
    ###### Done with Joint test

    # after joint test, because ftest() can require two values
    df <- if (is.null(df)) Inf else df
    df <- sanitize_df(df, model, vcov = vcov)
    df <- get_degrees_of_freedom(df = df, model = model)

    args <- list(
        conf_level = conf_level,
        vcov = vcov,
        df = df,
        equivalence = equivalence
    )

    # keep this NULL in case `hypothesis` was used in the previous call
    args[["hypothesis"]] <- hypothesis

    if (...length() > 0) {
        args <- utils::modifyList(args, list(...))
    }

    xcall <- substitute(model)

    if (is.symbol(xcall)) {
        model <- eval(xcall, envir = parent.frame())
    } else if (is.call(xcall)) {
        model <- eval(xcall, envir = parent.frame())
    }

    numderiv <- sanitize_numderiv(numderiv)

    # after re-evaluation
    tmp <- sanitize_hypothesis(hypothesis, ...)
    hypothesis <- tmp$hypothesis
    hypothesis_null <- tmp$hypothesis_null
    hypothesis_direction <- tmp$hypothesis_direction

    vcov_false <- isFALSE(vcov)
    if (!isTRUE(checkmate::check_matrix(vcov))) {
        vcov <- get_vcov(model = model, vcov = vcov)
        vcov.type <- get_vcov_label(vcov = vcov)
    } else {
        vcov.type <- "matrix"
    }

    FUNouter <- function(model, hypothesis, newparams = NULL, ...) {
        if (isTRUE(checkmate::check_numeric(model))) {
            out <- data.frame(term = seq_along(out), estimate = out)
        } else if (inherits(model, "data.frame")) {
            out <- model
            if (!"estimate" %in% colnames(out)) {
                msg <- "`hypothesis` function must return a data.frame with a column named `estimate`."
                insight::format_error(msg)
            }
            if (!"term" %in% colnames(out)) {
                n <- tryCatch(names(stats::coef(model)), error = function(e) NULL)
                if (is.null(n)) {
                    n <- paste0("b", seq_len(nrow(out)))
                }
                out$term <- n
            }
            if (!all(c("term", "estimate") %in% colnames(out))) {
                msg <- "`hypothesis` function must return a data.frame with two columns named `term` and `estimate`."
                insight::format_error(msg)
            }

            # unknown model
        } else if (!is.function(hypothesis)) {
            out <- insight::get_parameters(model, ...)
            if ("Component" %in% colnames(out) && !anyNA(out$Component)) {
                out$Parameter <- sprintf("%s_%s", out$Component, out$Parameter)
            } else if ("Response" %in% colnames(out) && !anyNA(out$Response)) {
                out$Parameter <- sprintf("%s_%s", out$Response, out$Parameter)
            }
            idx <- intersect(colnames(model), c("term", "group", "estimate"))
            colnames(out)[1:2] <- c("term", "estimate")

            # glmmTMB
            if (!is.null(newparams)) {
                out$estimate <- newparams
            }
        } else if (hypothesis_is_formula) {
            beta <- get_coef(model)
            out <- data.table::data.table(estimate = beta, term = names(beta))

            # unknown model but user-supplied hypothesis function
        } else {
            out <- model
        }

        tmp <- get_hypothesis(out, hypothesis = hypothesis)
        out <- tmp$estimate
        hypothesis_function_by <- attr(tmp, "hypothesis_function_by")

        # labels
        lab <- c("hypothesis", "term", hypothesis_function_by)
        lab <- intersect(lab, colnames(tmp))
        if (length(lab) > 0) {
            lab <- tmp[, ..lab]
            attr(out, "label") <- lab
        }

        if ("group" %in% colnames(tmp)) {
            attr(out, "grouplab") <- tmp[["group"]]
        }

        attr(out, "hypothesis_function_by") <- hypothesis_function_by
        return(out)
    }

    b <- FUNouter(model = model, hypothesis = hypothesis, ...)

    # bayesian posterior
    if (!is.null(attr(b, "posterior_draws"))) {
        draws <- attr(b, "posterior_draws")
        J <- NULL
        se <- rep(NA, length(b))

        # standard errors via delta method
    } else if (!vcov_false && isTRUE(checkmate::check_matrix(vcov))) {
        args <- list(
            model = model,
            vcov = vcov,
            hypothesis = hypothesis,
            FUN = FUNouter,
            numderiv = numderiv
        )
        if (...length() > 0) {
            args <- utils::modifyList(args, list(...))
        }
        se <- do.call("get_se_delta", args)
        J <- attr(se, "jacobian")
        attr(se, "jacobian") <- NULL
        draws <- NULL

        # no standard error
    } else {
        J <- draws <- NULL
        se <- rep(NA, length(b))
    }

    hyplab <- attr(b, "label")
    if (!is.null(hypothesis)) {
        if (is.null(hyplab)) {
            hyplab <- attr(hypothesis, "label")
        }
        if (!is.null(hyplab)) {
            out <- cbind(
                hyplab,
                data.frame(
                    estimate = b,
                    std.error = se
                )
            )
        } else {
            out <- data.frame(
                term = "custom",
                estimate = b,
                std.error = se
            )
        }
    } else {
        if (!is.null(hyplab) && nrow(hyplab) == length(b)) {
            out <- cbind(
                hyplab,
                data.frame(
                    estimate = b,
                    std.error = se
                )
            )
        } else {
            out <- data.frame(
                term = paste0("b", seq_along(b)),
                estimate = b,
                std.error = se
            )
        }
    }

    # Remove std.error column when not computing st.errors and in bootstrap
    if (vcov_false) {
        out$std.error <- NULL
    }

    out[["group"]] <- attr(b, "grouplab")

    out <- get_ci(
        out,
        conf_level = conf_level,
        vcov = vcov,
        draws = draws,
        estimate = "estimate",
        hypothesis_null = hypothesis_null,
        hypothesis_direction = hypothesis_direction,
        df = df,
        model = model,
        ...
    )

    if (!is.null(equivalence)) {
        out <- equivalence(
            out,
            df = df,
            equivalence = equivalence
        )
    }

    out <- sort_columns(out)

    data.table::setDF(out)
    class(out) <- c("hypotheses", class(out))

    attr(out, "posterior_draws") <- draws
    attr(out, "model") <- model
    attr(out, "model_type") <- class(model)[1L]
    attr(out, "jacobian") <- J
    attr(out, "call") <- call_attr
    attr(out, "vcov") <- vcov
    attr(out, "vcov.type") <- vcov.type
    attr(out, "conf_level") <- conf_level
    attr(out, "hypothesis_function_by") <- attr(b, "hypothesis_function_by")

    # must be after attributes for vcov
    out <- multcomp_test(
        out,
        multcomp = multcomp,
        conf_level = conf_level,
        df = df
    )

    # Issue #1102: hypotheses() should not be called twice on the same object
    attr(out, "hypotheses_call") <- TRUE

    return(out)
}

Try the marginaleffects package in your browser

Any scripts or data that you put into this service are public.

marginaleffects documentation built on June 8, 2025, 12:44 p.m.