distributions: Probability Distributions as S3 Objects

Documented in cdf.ChiSquare ChiSquare log_pdf.ChiSquare pdf.ChiSquare quantile.ChiSquare random.ChiSquare support.ChiSquare

#' Create a Chi-Square distribution
#'
#' Chi-square distributions show up often in frequentist settings
#' as the sampling distribution of test statistics, especially
#' in maximum likelihood estimation settings.
#'
#' @param df Degrees of freedom. Must be positive.
#'
#' @return A `ChiSquare` object.
#' @export
#'
#' @family continuous distributions
#'
#' @section Transformations:
#'
#' A squared standard [Normal()] distribution is equivalent to a
#' \eqn{\chi^2_1} distribution, i.e. one with a single degree of freedom.
#' The \eqn{\chi^2_k} distribution is a special case of the [Gamma()]
#' distribution with shape parameter \eqn{k / 2} and rate parameter
#' \eqn{1 / 2} (equivalently, scale 2). Sums of independent \eqn{\chi^2}
#' distributions are also distributed as \eqn{\chi^2} distributions, where
#' the degrees of freedom of the contributing distributions get summed.
#' The ratio of two independent \eqn{\chi^2} distributions, each divided
#' by its degrees of freedom, is a [FisherF()] distribution. The ratio of
#' a standard [Normal()] and the square root of an independent [ChiSquare()]
#' divided by its degrees of freedom is a [StudentsT()] distribution.
#'
#' @details
#'
#'   We recommend reading this documentation on
#'   <https://alexpghayes.github.io/distributions3/>, where the math
#'   will render with additional detail and much greater clarity.
#'
#'   In the following, let \eqn{X} be a \eqn{\chi^2} random variable with
#'   `df` = \eqn{k}.
#'
#'   **Support**: \eqn{R^+}, the set of positive real numbers
#'
#'   **Mean**: \eqn{k}
#'
#'   **Variance**: \eqn{2k}
#'
#'   **Probability density function (p.d.f)**:
#'
#'   \deqn{
#'     f(x) = \frac{1}{2^{k/2} \Gamma(k/2)} x^{k/2 - 1} e^{-x/2}
#'   }{
#'     f(x) = 1 / (2^(k/2) \Gamma(k/2)) x^(k/2 - 1) exp(-x / 2)
#'   }
#'
#'   for \eqn{x > 0}, where \eqn{\Gamma} is the gamma function.
#'
#'   **Cumulative distribution function (c.d.f)**:
#'
#'   The cumulative distribution function has the form
#'
#'   \deqn{
#'     F(t) = \int_0^t \frac{1}{2^{k/2} \Gamma(k/2)} x^{k/2 - 1} e^{-x/2} dx
#'   }{
#'     F(t) = integral_0^t 1 / (2^(k/2) \Gamma(k/2)) x^(k/2 - 1) exp(-x / 2) dx
#'   }
#'
#'   for \eqn{t > 0}. This integral is the regularized lower incomplete
#'   gamma function evaluated at \eqn{(k/2, t/2)}. It does not have a closed
#'   form solution for general \eqn{k} and must be approximated numerically.
#'
#'   **Moment generating function (m.g.f)**:
#'
#'   \deqn{
#'     E(e^{tX}) = (1 - 2t)^{-k/2}, \quad t < 1/2
#'   }{
#'     E(e^(tX)) = (1 - 2t)^(-k/2), t < 1/2
#'   }
#'
#' @examples
#'
#' set.seed(27)
#'
#' X <- ChiSquare(5)
#' X
#'
#' mean(X)
#' variance(X)
#' skewness(X)
#' kurtosis(X)
#'
#' random(X, 10)
#'
#' pdf(X, 2)
#' log_pdf(X, 2)
#'
#' cdf(X, 4)
#' quantile(X, 0.7)
#'
#' cdf(X, quantile(X, 0.7))
#' quantile(X, cdf(X, 7))
ChiSquare <- function(df) {
  d <- data.frame(df = df)
  class(d) <- c("ChiSquare", "distribution")
  d
}

#' @export
mean.ChiSquare <- function(x, ...) {
  rlang::check_dots_used()
  rval <- x$df
  setNames(rval, names(x))
}


#' @export
variance.ChiSquare <- function(x, ...) {
  rval <- x$df * 2
  setNames(rval, names(x))
}

#' @export
skewness.ChiSquare <- function(x, ...) {
  rval <- sqrt(8 / x$df)
  setNames(rval, names(x))
}

#' @export
kurtosis.ChiSquare <- function(x, ...) {
  rval <- 12 / x$df
  setNames(rval, names(x))
}

#' Draw a random sample from a chi square distribution
#'
#' @inherit ChiSquare examples
#'
#' @param x A `ChiSquare` object created by a call to [ChiSquare()].
#' @param n The number of samples to draw. Defaults to `1L`.
#' @param drop logical. Should the result be simplified to a vector if possible?
#' @param ... Unused. Unevaluated arguments will generate a warning to
#'   catch misspellings or other possible errors.
#'
#' @return In case of a single distribution object or `n = 1`, either a numeric
#'   vector of length `n` (if `drop = TRUE`, default) or a `matrix` with `n` columns
#'   (if `drop = FALSE`).
#' @export
#'
random.ChiSquare <- function(x, n = 1L, drop = TRUE, ...) {
  n <- make_positive_integer(n)
  if (n == 0L) {
    return(numeric(0L))
  }
  FUN <- function(at, d) rchisq(n = at, df = d$df)
  apply_dpqr(d = x, FUN = FUN, at = n, type = "random", drop = drop)
}

#' Evaluate the probability density function of a chi square distribution
#'
#' @inherit ChiSquare examples
#'
#' @param d A `ChiSquare` object created by a call to [ChiSquare()].
#' @param x A vector of elements whose densities you would like to
#'   determine given the distribution `d`.
#' @param drop logical. Should the result be simplified to a vector if possible?
#' @param elementwise logical. Should each distribution in \code{d} be evaluated
#'   at all elements of \code{x} (\code{elementwise = FALSE}, yielding a matrix)?
#'   Or, if \code{d} and \code{x} have the same length, should the evaluation be
#'   done element by element (\code{elementwise = TRUE}, yielding a vector)? The
#'   default of \code{NULL} means that \code{elementwise = TRUE} is used if the
#'   lengths match and otherwise \code{elementwise = FALSE} is used.
#' @param ... Arguments to be passed to \code{\link[stats]{dchisq}}.
#'   Unevaluated arguments will generate a warning to catch misspellings or other
#'   possible errors.
#'
#' @return In case of a single distribution object, either a numeric
#'   vector of length `length(x)` (if `drop = TRUE`, default) or a `matrix` with
#'   `length(x)` columns (if `drop = FALSE`). In case of a vectorized distribution
#'   object, a matrix with `length(x)` columns containing all possible combinations.
#' @export
#'
pdf.ChiSquare <- function(d, x, drop = TRUE, elementwise = NULL, ...) {
  FUN <- function(at, d) dchisq(x = at, df = d$df, ...)
  apply_dpqr(d = d, FUN = FUN, at = x, type = "density", drop = drop, elementwise = elementwise)
}

#' @rdname pdf.ChiSquare
#' @export
#'
log_pdf.ChiSquare <- function(d, x, drop = TRUE, elementwise = NULL, ...) {
  FUN <- function(at, d) dchisq(x = at, df = d$df, log = TRUE)
  apply_dpqr(d = d, FUN = FUN, at = x, type = "logLik", drop = drop, elementwise = elementwise)
}

#' Evaluate the cumulative distribution function of a chi square distribution
#'
#' @inherit ChiSquare examples
#'
#' @param d A `ChiSquare` object created by a call to [ChiSquare()].
#' @param x A vector of elements whose cumulative probabilities you would
#'   like to determine given the distribution `d`.
#' @param drop logical. Should the result be simplified to a vector if possible?
#' @param elementwise logical. Should each distribution in \code{d} be evaluated
#'   at all elements of \code{x} (\code{elementwise = FALSE}, yielding a matrix)?
#'   Or, if \code{d} and \code{x} have the same length, should the evaluation be
#'   done element by element (\code{elementwise = TRUE}, yielding a vector)? The
#'   default of \code{NULL} means that \code{elementwise = TRUE} is used if the
#'   lengths match and otherwise \code{elementwise = FALSE} is used.
#' @param ... Arguments to be passed to \code{\link[stats]{pchisq}}.
#'   Unevaluated arguments will generate a warning to catch misspellings or other
#'   possible errors.
#'
#' @return In case of a single distribution object, either a numeric
#'   vector of length `length(x)` (if `drop = TRUE`, default) or a `matrix` with
#'   `length(x)` columns (if `drop = FALSE`). In case of a vectorized distribution
#'   object, a matrix with `length(x)` columns containing all possible combinations.
#' @export
#'
cdf.ChiSquare <- function(d, x, drop = TRUE, elementwise = NULL, ...) {
  FUN <- function(at, d) pchisq(q = at, df = d$df, ...)
  apply_dpqr(d = d, FUN = FUN, at = x, type = "probability", drop = drop, elementwise = elementwise)
}

#' Determine quantiles of a chi square distribution
#'
#' `quantile()` is the inverse of `cdf()`.
#'
#' @inherit ChiSquare examples
#' @inheritParams random.ChiSquare
#'
#' @param probs A vector of probabilities.
#' @param drop logical. Should the result be simplified to a vector if possible?
#' @param elementwise logical. Should each distribution in \code{x} be evaluated
#'   at all elements of \code{probs} (\code{elementwise = FALSE}, yielding a matrix)?
#'   Or, if \code{x} and \code{probs} have the same length, should the evaluation be
#'   done element by element (\code{elementwise = TRUE}, yielding a vector)? The
#'   default of \code{NULL} means that \code{elementwise = TRUE} is used if the
#'   lengths match and otherwise \code{elementwise = FALSE} is used.
#' @param ... Arguments to be passed to \code{\link[stats]{qchisq}}.
#'   Unevaluated arguments will generate a warning to catch misspellings or other
#'   possible errors.
#'
#' @return In case of a single distribution object, either a numeric
#'   vector of length `length(probs)` (if `drop = TRUE`, default) or a `matrix` with
#'   `length(probs)` columns (if `drop = FALSE`). In case of a vectorized
#'   distribution object, a matrix with `length(probs)` columns containing all
#'   possible combinations.
#' @export
#'
quantile.ChiSquare <- function(x, probs, drop = TRUE, elementwise = NULL, ...) {
  # TODO: in the documentation, more information on return and
  # how quantiles are calculated
  FUN <- function(at, d) qchisq(at, df = d$df, ...)
  apply_dpqr(d = x, FUN = FUN, at = probs, type = "quantile", drop = drop, elementwise = elementwise)
}


#' Return the support of the ChiSquare distribution
#'
#' @param d A `ChiSquare` object created by a call to [ChiSquare()].
#' @param drop logical. Should the result be simplified to a vector if possible?
#' @param ... Currently not used.
#'
#' @return A vector of length 2 with the minimum and maximum value of the support.
#'
#' @export
support.ChiSquare <- function(d, drop = TRUE, ...) {
  rlang::check_dots_used()
  min <- rep(0, length(d))
  max <- rep(Inf, length(d))
  make_support(min, max, d, drop = drop)
}

#' @exportS3Method
is_discrete.ChiSquare <- function(d, ...) {
  rlang::check_dots_used()
  setNames(rep.int(FALSE, length(d)), names(d))
}

#' @exportS3Method
is_continuous.ChiSquare <- function(d, ...) {
  rlang::check_dots_used()
  setNames(rep.int(TRUE, length(d)), names(d))
}
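
The methods in this file delegate to the base R functions `dchisq()`, `pchisq()`, `qchisq()` and `rchisq()`. The sketch below is not part of the package. It uses only base R to check numerically three standard facts about a chi-square distribution with `k` degrees of freedom: the density is `x^(k/2 - 1) exp(-x/2) / (2^(k/2) Gamma(k/2))`, the distribution coincides with a Gamma distribution with shape `k / 2` and rate `1 / 2`, and the moment generating function is `(1 - 2t)^(-k/2)` for `t < 1/2`. The helper `chisq_pdf()` and the variables `k`, `x` and `t0` are illustrative names chosen for this example.

```r
# Illustrative values only; none of these names exist in the package.
k <- 5
x <- c(0.5, 2, 4, 7)

# Chi-square density written out by hand (hypothetical helper).
chisq_pdf <- function(x, k) {
  x^(k / 2 - 1) * exp(-x / 2) / (2^(k / 2) * gamma(k / 2))
}

# The hand-written density should agree with stats::dchisq().
pdf_matches_dchisq <- isTRUE(all.equal(chisq_pdf(x, k), dchisq(x, df = k)))

# Chi-square with k degrees of freedom is Gamma(shape = k / 2, rate = 1 / 2).
pdf_matches_gamma <- isTRUE(all.equal(
  dchisq(x, df = k),
  dgamma(x, shape = k / 2, rate = 1 / 2)
))
cdf_matches_gamma <- isTRUE(all.equal(
  pchisq(x, df = k),
  pgamma(x, shape = k / 2, rate = 1 / 2)
))

# Moment generating function at a point t0 < 1/2:
# numerical integration of exp(t0 * x) * f(x) versus the closed form.
t0 <- 0.2
mgf_numeric <- integrate(
  function(x) exp(t0 * x) * dchisq(x, df = k),
  lower = 0, upper = Inf
)$value
mgf_closed <- (1 - 2 * t0)^(-k / 2)

# The quantile function inverts the c.d.f. (up to numerical error).
round_trip <- qchisq(pchisq(7, df = k), df = k)
```

If the three `*_matches_*` flags are `TRUE`, `mgf_numeric` is close to `mgf_closed`, and `round_trip` is close to 7, then the formulas are consistent with what base R computes.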

alexpghayes/distributions documentation built on Sept. 23, 2024, 4:49 p.m.


alexpghayes/distributions
Probability Distributions as S3 Objects

R/ChiSquare.R
In alexpghayes/distributions: Probability Distributions as S3 Objects

Defines functions is_continuous.ChiSquare is_discrete.ChiSquare support.ChiSquare quantile.ChiSquare cdf.ChiSquare log_pdf.ChiSquare pdf.ChiSquare random.ChiSquare kurtosis.ChiSquare skewness.ChiSquare variance.ChiSquare mean.ChiSquare ChiSquare
