R/fast.chisq.test.R

Defines functions fast.chisq.test

Documented in fast.chisq.test

# fast.chisq.test.R
#
# Author: Xuye Luo, Joe Song
# 
# Updated:
#
# December 20, 2025
#   Updated documentation
#
# December 11, 2025

#' @title Fast Zero-Tolerant Pearson's Chi-squared Test of Association
#'
#' @description Performs a fast zero-tolerant 
#' Pearson's chi-squared test 
#' \insertCite{pearson1900}{Upsilon} 
#' to evaluate association between observations
#' from two categorical variables.
#' 
#' @references 
#' \insertRef{pearson1900}{Upsilon}
#'
#' @inheritParams fast.upsilon.test
#' @inherit fast.upsilon.test note
#' 
#' @return A list with class \code{"htest"}
#'   containing the following components:
#' \item{statistic}{the value of the chi-squared test statistic.}
#' \item{parameter}{the degrees of freedom.}
#' \item{p.value}{the \emph{p}-value of the test.}
#' \item{estimate}{Cramér's \emph{V} statistic representing the effect size.}
#' \item{method}{a character string indicating the method used.}
#' \item{data.name}{a character string giving the names of the input data.}
#' \item{observed}{the inputs \code{x} and \code{y} combined column-wise
#'   with \code{cbind}.}
#'
#' @examples
#' library("Upsilon")
#' weather <- c(
#'   "rainy", "sunny", "rainy", "sunny", "rainy"
#' )
#' mood <- c(
#'   "wistful", "upbeat", "upbeat", "upbeat", "wistful"
#' )
#' 
#' fast.chisq.test(weather, mood)
#' 
#' # The result is equivalent to: 
#' modified.chisq.test(table(weather, mood))
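#'
#' # A minimal sketch of inspecting the result; the component names
#' # follow the return value documented above.
#' res <- fast.chisq.test(weather, mood)
#' res$statistic   # chi-squared statistic
#' res$parameter   # degrees of freedom
#' res$estimate    # Cramér's V
#'
#' # log.p = TRUE requests the p-value on the log scale.
#' fast.chisq.test(weather, mood, log.p = TRUE)$p.value
#'
#' # Cramér's V recomputed by hand with base R, as a sanity check.
#' # Assumption: with no empty rows or columns, the uncorrected base R
#' # statistic is comparable to the one computed by this function.
#' tab <- table(weather, mood)
#' x2 <- unname(suppressWarnings(
#'   stats::chisq.test(tab, correct = FALSE)
#' )$statistic)
#' sqrt(x2 / (sum(tab) * (min(dim(tab)) - 1)))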
#' @importFrom stats pchisq
#' @export
fast.chisq.test <- function(x, y, log.p = FALSE) {
  
  method_name <- "Fast Pearson's Chi-squared test of independence"
  dname <- paste(deparse(substitute(x)), "and", deparse(substitute(y)))
  
  # Call C++ function (chisq_cpp.cpp)
  chisq_list <- chisq_cpp(as.factor(x), as.factor(y))
  
  statistic_val <- chisq_list$statistic
  # Convert counts to double so later arithmetic cannot overflow integers
  n  <- as.numeric(chisq_list$n)
  nr <- as.numeric(chisq_list$nr)
  nc <- as.numeric(chisq_list$nc)
  k  <- min(nr, nc)
  
  # Calculate Cramér's V: sqrt(X-squared / (n * (k - 1))), with k = min(nr, nc)
  estimate_val <- sqrt(statistic_val / (n * (k - 1)))
  parameter_val <- (nr - 1L) * (nc - 1L)
  
  p_val <- stats::pchisq(
    statistic_val, parameter_val,
    lower.tail = FALSE, log.p = log.p
  )
  
  # Set names for htest class standards
  names(statistic_val) <- "X-squared"
  names(estimate_val)  <- "Cram\uE9r's V"
  names(parameter_val) <- "df"
  
  structure(
    list(
      statistic = statistic_val,
      estimate  = estimate_val,
      parameter = parameter_val,
      p.value   = p_val,
      method    = method_name,
      data.name = dname,
      observed  = cbind(x, y)
    ),
    class = "htest"
  )
}

Upsilon documentation built on March 7, 2026, 5:07 p.m.