pnchisqAppr: (Approximate) Probabilities of Non-Central Chi-squared...

pnchisqApprR Documentation

(Approximate) Probabilities of Non-Central Chi-squared Distribution

Description

Compute (approximate) probabilities for the non-central chi-squared distribution.

The non-central chi-squared distribution with df= n degrees of freedom and non-centrality parameter ncp = \lambda has density

f(x) = f_{n,\lambda}(x) = e^{-\lambda / 2} \sum_{r=0}^\infty\frac{(\lambda/2)^r}{r!}\, f_{n + 2r}(x)

for x \ge 0, where f_m(x) (= dchisq(x,m)) is the (central chi-squared density with m degrees of freedom; for more, see R's help page for pchisq.

  • R's own historical and current versions, but with more tuning parameters;

Historical relatively simple approximations listed in Johnson, Kotz, and Balakrishnan (1995):

  • Patnaik(1949)'s approximation to the non-central via central chi-squared. Is also the formula 26.4.27 in Abramowitz & Stegun, p.942. Johnson et al mention that the approximation error is O(1/\sqrt(\lambda)) for \lambda \to \infty.

  • Pearson(1959) is using 3 moments instead of 2 as Patnaik (to approximate via a central chi-squared), and therefore better than Patnaik for the right tail; further (in Johnson et al.), the approximation error is O(1/\lambda) for \lambda \to \infty.

  • Abdel-Aty(1954)'s “first approximation” based on Wilson-Hilferty via Gaussian (pnorm) probabilities, is partly wrongly cited in Johnson et al., p.463, eq.(29.61a).

  • Bol'shev and Kuznetzov (1963) concentrate on the case of small ncp \lambda and provide an “approximation” via central chi-squared with the same degrees of freedom df, but a modified q (‘x’); the approximation has error O(\lambda^3) for \lambda \to 0 and is from Johnson et al., p.465, eq.(29.62) and (29.63).

  • Sankaran(1959, 1963) proposes several further approximations base on Gaussian probabilities, according to Johnson et al., p.463. pnchisqSankaran_d() implements its formula (29.61d).

pnchisq():

an R implementation of R's own C pnchisq_raw(), but almost only up to Feb.27, 2004, long before the log.p=TRUE addition there, including logspace arithmetic in April 2014, its finish on 2015-09-01. Currently for historical reference only.

pnchisqV():

a Vectorize()d pnchisq.

pnchisqRC():

R's C implementation as of Aug.2019; but with many more options. Currently extreme cases tend to hang on Winbuilder (?)

pnchisqIT:

using C code “paralleling” R's own, returns a list containing the full vector of terms computed (and the resulting probability).

pnchisqT93:

pure R implementations of approximations when both q and ncp are large, by Temme(1993), from Johnson et al., p.467, formulas (29.71 a), and (29.71 b), using auxiliary functions pnchisqT93a() and pnchisqT93b() respectively, with adapted formulas for the log.p=TRUE cases.

pnchisq_ss():

based on ss(), shows pure R code to sum up the non-central chi-squared distribution value.

ss:

pure R function providing all the summation terms making up non-central chi-squared distribution values; terms computed carefully only using simple arithmetic plus exp() and log(), via cumsum and cumprod and possibly in log space.

pnchisqTerms:

pure R providing these terms directly, calling (central) pchisq(x, df + 2*(k-1)) for the whole length vector k <- 1:i.max.

ss2():

computes “statistics” on ss(), whereas

ss2.():

uses pnchisqIT()'s C code providing similar stats as ss2().

Usage



pnchisq          (q, df, ncp = 0, lower.tail = TRUE, 
                  cutOffncp = 80, itSimple = 110, errmax = 1e-12, reltol = 1e-11,
                  maxit = 10* 10000, verbose = 0, xLrg.sigma = 5)
pnchisqV(x, ..., verbose = 0)

pnchisqRC        (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE,
                  no2nd.call = FALSE,
                  cutOffncp = 80, small.ncp.logspace = small.ncp.logspaceR2015,
                  itSimple = 110, errmax = 1e-12,
                  reltol = 8 * .Machine$double.eps, epsS = reltol/2, maxit = 1e6,
                  verbose = FALSE)
pnchisqAbdelAty  (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqBolKuz    (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqPatnaik   (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqPearson   (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqSankaran_d(q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisq_ss       (x, df, ncp = 0, lower.tail = TRUE, log.p = FALSE, i.max = 10000,
                  ssr = ss(x=x, df=df, ncp=ncp, i.max = i.max))

pnchisqT93  (q, df, ncp, lower.tail = TRUE, log.p = FALSE, use.a = q > ncp)
pnchisqT93.a(q, df, ncp, lower.tail = TRUE, log.p = FALSE)
pnchisqT93.b(q, df, ncp, lower.tail = TRUE, log.p = FALSE)

pnchisqTerms     (x, df, ncp,     lower.tail = TRUE, i.max = 1000)
ss   (x, df, ncp, i.max = 10000, useLv = !(expMin < -lambda && 1/lambda < expMax))
ss2  (x, df, ncp, i.max = 10000, eps = .Machine$double.eps)
ss2. (q, df, ncp = 0, errmax = 1e-12, reltol = 2 * .Machine$double.eps,
      maxit = 1e+05, eps = reltol, verbose = FALSE)

Arguments

x

numeric vector (of ‘quantiles’, i.e., abscissa values).

q

number ( ‘quantile’, i.e., abscissa value.)

df

degrees of freedom > 0, maybe non-integer.

ncp

non-centrality parameter \delta.

lower.tail, log.p

logical, see, e.g., pchisq().

i.max

number of terms in evaluation ...

ssr

an ss()-like list to be used in pnchisq_ss().

use.a

logical vector for Temme pnchisqT93*() formulas, indicating to use formula ‘a’ over ‘b’. The default is as recommended in the references, but they did not take into account log.p = TRUE situations.

cutOffncp

a positive number, the cutoff value for ncp at which the algorithm branches; has been hard coded to 80 in R.

itSimple

positive integer, the maximal number of “simple” iterations; has been hard coded to 110 in R.

errmax

absolute error tolerance.

reltol

convergence tolerance for relative error.

maxit

maximal number of iterations.

xLrg.sigma

positive number ...

no2nd.call

logical indicating if a 2nd call is made to the internal function pnchisq_raw(), swapping lower.tail.

small.ncp.logspace

logical vector or function, indicating if the logspace computations for “small” ncp (defined to fulfill ncp < cutOffncp !).

epsS

small positive number, the convergence tolerance of the ‘simple’ iterations; has been hard coded to 1e-15 in R.

verbose

logical or integer specifying if or how much the algorithm progress should be monitored.

...

further arguments passed from pnchisqV() to pnchisq().

useLv

logical indicating if logarithmic scale should be used for \lambda computations.

eps

convergence tolerance, a positive number.

Details

pnchisq_ss()

uses si <- ss(x, df, ..) to get the series terms, and returns 2*dchisq(x, df = df +2) * sum(si$s).

ss()

computes the terms needed for the expansion used in pnchisq_ss() only using arithmetic, exp() and log().

ss2()

computes some simple “statistics” about ss(..).

Value

ss()

returns a list with 3 components

s

the series

i1

location (in s[]) of the first change from 0 to positive.

max

(first) location of the maximal value in the series (i.e., which.max(s)).

Author(s)

Martin Maechler, from May 1999; starting from a post to the S-news mailing list by Ranjan Maitra (@ math.umbc.edu) who showed a version of our pchisqAppr.0() thanking Jim Stapleton for providing it.

References

Johnson, N.L., Kotz, S. and Balakrishnan, N. (1995) Continuous Univariate Distributions Vol 2, 2nd ed.; Wiley; chapter 29 Noncentral \chi^2-Distributions; notably Section 8 Approximations, p.461 ff.

Abramowitz, M. and Stegun, I. A. (1972) Handbook of Mathematical Functions. New York: Dover. https://en.wikipedia.org/wiki/Abramowitz_and_Stegun

See Also

pchisq and the wienergerm approximations for it: pchisqW() etc.

r_pois() and its plot function, for an aspect of the series approximations we use in pnchisq_ss().

Examples

## set of quantiles to use :
qq <- c(.001, .005, .01, .05, (1:9)/10, 2^seq(0, 10, by= 0.5))
## Take "all interesting" pchisq-approximation from our pkg :
pkg <- "package:DPQ"
pnchNms <- c(paste0("pchisq", c("V", "W", "W.", "W.R")),
             ls(pkg, pattern = "^pnchisq"))
pnchNms <- pnchNms[!grepl("Terms$", pnchNms)]
pnchF <- sapply(pnchNms, get, envir = as.environment(pkg))
str(pnchF)
ncps <- c(0, 1/8, 1/2)
pnchR <- as.list(setNames(ncps, paste("ncp",ncps, sep="=")))
for(i.n in seq_along(ncps)) {
  ncp <- ncps[i.n]
  pnF <- if(ncp == 0) pnchF[!grepl("chisqT93", pnchNms)] else pnchF
  pnchR[[i.n]] <- sapply(pnF, function(F)
            Vectorize(F, names(formals(F))[[1]])(qq, df = 3, ncp=ncp))
}
str(pnchR, max=2)
		 

## A case where the non-central P[] should be improved :
## First, the central P[] which is close to exact -- choosing df=2 allows
## truly exact values: chi^2 = Exp(1) !
opal <- palette()
palette(c("black", "red", "green3", "blue", "cyan", "magenta", "gold3", "gray44"))
cR  <- curve(pchisq   (x, df=2,        lower.tail=FALSE, log.p=TRUE), 0, 4000, n=2001)
cRC <- curve(pnchisqRC(x, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE),
             add=TRUE, col=adjustcolor(2,1/2), lwd=10, lty=4, n=2001)
cR0 <- curve(pchisq   (x, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE),
             add=TRUE, col=adjustcolor(3,1/2), lwd=12,        n=2001)
## cRC & cR0 jump to -Inf much too early:
pchisq(1492, df=2, ncp=0, lower.tail=FALSE,log.p=TRUE) #--> -Inf *wrongly*
##' list_() := smart "named list" constructor
list_ <- function(...)
   `names<-`(list(...), vapply(sys.call()[-1L], as.character, ""))
JKBfn <-list_(pnchisqPatnaik,
              pnchisqPearson,
              pnchisqAbdelAty,
              pnchisqBolKuz,
              pnchisqSankaran_d)
cl. <- setNames(adjustcolor(3+seq_along(JKBfn), 1/2), names(JKBfn))
lw. <- setNames(2+seq_along(JKBfn),                   names(JKBfn))
cR.JKB <- sapply(names(JKBfn), function(nmf) {
  curve(JKBfn[[nmf]](x, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE),
        add=TRUE, col=cl.[[nmf]], lwd=lw.[[nmf]], lty=lw.[[nmf]], n=2001)$y
})
legend("bottomleft", c("pchisq", "pchisq.ncp=0", "pnchisqRC", names(JKBfn)),
       col=c(palette()[1], adjustcolor(2:3,1/2), cl.),
       lwd=c(1,10,12, lw.), lty=c(1,4,1, lw.))
palette(opal)# revert
##
## the problematic "jump" :
as.data.frame(cRC)[744:750,]
cRall <- cbind(as.matrix(as.data.frame(cR)), R0 = cR0$y, RC = cRC$y, cR.JKB)
cbind(r1492 <- local({ r1 <- cRall[cRall[,"x"] == 1492, ]; r1[1+c(0, order(r1[-1]))] }))
cbind(r1492[-1] - r1492[["y"]])
## ==> Patnaik, Pearson, BolKuz are perfect for this x.
all.equal(cRC, cR0, tol = 1e-15) # TRUE [for now]
if(.Platform$OS.type == "unix")
  ## verbose=TRUE  may reveal which branches of the algorithm are taken:
  pnchisqRC(1500, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE, verbose=TRUE) #
  ## |-->  -Inf currently

## The *two*  principal cases (both lower.tail = {TRUE,FALSE} !), where
##  "2nd call"  happens *and* is currently beneficial :
dfs <- c(1:2, 5, 10, 20)
pL. <- pnchisqRC(.00001, df=dfs, ncp=0, log.p=TRUE, lower.tail=FALSE, verbose = TRUE)
pR. <- pnchisqRC(   100, df=dfs, ncp=0, log.p=TRUE,                   verbose = TRUE)
## R's own non-central version (specifying 'ncp'):
pL0 <- pchisq   (.00001, df=dfs, ncp=0, log.p=TRUE, lower.tail=FALSE)
pR0 <- pchisq   (   100, df=dfs, ncp=0, log.p=TRUE)
## R's *central* version, i.e., *not* specifying 'ncp' :
pL  <- pchisq   (.00001, df=dfs,        log.p=TRUE, lower.tail=FALSE)
pR  <- pchisq   (   100, df=dfs,        log.p=TRUE)
cbind(pL., pL, relEc = signif(1-pL./pL, 3), relE0 = signif(1-pL./pL0, 3))
cbind(pR., pR, relEc = signif(1-pR./pR, 3), relE0 = signif(1-pR./pR0, 3))

DPQ documentation built on July 18, 2025, 3:01 p.m.