pnt: Non-central t Probability Distribution - Algorithms and...

pntR Documentation

Non-central t Probability Distribution - Algorithms and Approximations

Description

Compute different approximations for the non-central t-Distribution cumulative probability distribution function.

Usage



pntR      (t, df, ncp, lower.tail = TRUE, log.p = FALSE,
           use.pnorm = (df > 4e5 ||
                        ncp^2 > 2*log(2)*1021), # .Machine$double.min.exp = -1022
                                          itrmax = 1000, errmax = 1e-12, verbose = TRUE)
pntR1     (t, df, ncp, lower.tail = TRUE, log.p = FALSE,
           use.pnorm = (df > 4e5 ||
                        ncp^2 > 2*log(2)*1021),
                                          itrmax = 1000, errmax = 1e-12, verbose = TRUE)

pntP94    (t, df, ncp, lower.tail = TRUE, log.p = FALSE,
                                          itrmax = 1000, errmax = 1e-12, verbose = TRUE)
pntP94.1  (t, df, ncp, lower.tail = TRUE, log.p = FALSE,
                                          itrmax = 1000, errmax = 1e-12, verbose = TRUE)

pnt3150   (t, df, ncp, lower.tail = TRUE, log.p = FALSE, M = 1000, verbose = TRUE)
pnt3150.1 (t, df, ncp, lower.tail = TRUE, log.p = FALSE, M = 1000, verbose = TRUE)

pntLrg    (t, df, ncp, lower.tail = TRUE, log.p = FALSE)

pntJW39   (t, df, ncp, lower.tail = TRUE, log.p = FALSE)
pntJW39.0 (t, df, ncp, lower.tail = TRUE, log.p = FALSE)







pntVW13 (t, df, ncp, lower.tail = TRUE, log.p = FALSE,
           keepS = FALSE, verbose = FALSE)

pntGST23_T6  (t, df, ncp, lower.tail = TRUE, log.p = FALSE,
              y1.tol = 1e-8, Mterms = 20, alt = FALSE, verbose = TRUE)
pntGST23_T6.1(t, df, ncp, lower.tail = TRUE, log.p = FALSE,
              y1.tol = 1e-8, Mterms = 20, alt = FALSE, verbose = TRUE)

## *Non*-asymptotic, (at least partly much) better version of R's Lenth(1998) algorithm
pntGST23_1(t, df, ncp, lower.tail = TRUE, log.p = FALSE,
           j0max = 1e4, # for now
           IxpqFUN = Ixpq,
           alt = FALSE, verbose = TRUE, ...)

Arguments

t

vector of quantiles (called q in pt(..)).

df

degrees of freedom (> 0, maybe non-integer). df = Inf is allowed.

ncp

non-centrality parameter \delta \ge 0; If omitted, use the central t distribution.

log, log.p

logical; if TRUE, probabilities p are given as log(p).

lower.tail

logical; if TRUE (default), probabilities are P[X \le x], otherwise, P[X > x].

use.pnorm

logical indicating if the pnorm() approximation of Abramowitz and Stegun (26.7.10) should be used, which is available as pntLrg().

The default corresponds to R pt()'s own behaviour (which is suboptimal).

itrmax

number of iterations / terms.

errmax

convergence bound for the iterations.

verbose

logical or integer determining the amount of diagnostic print out to the console.

M

positive integer specifying the number of terms to use in the series.

keepS

logical indicating if the function should return a list with component cdf and other informational elements, or just the CDF values directly (by default).

y1.tol

positive tolerance for warning if y:= t^2/(t^2 + df) is too close to 1 (as the formulas use 1/(1-y)).

Mterms

number of summation terms for pntGST23_T6().

j0max

experimental: large integer limiting the summation terms in pntGST23_1() .

IxpqFUN

the (scaled) incomplete beta function I_x(p,q) to be used; currently, it defaults to the Ixpq function derived from Nico Temme's Maple code for “Table 1” in Gil et al. (2023).

alt

logical specifying if and how log-scale should be used. Experimental and not-yet-tested.

...

further arguments passed to IxpqFUN().

Details

pntR1():

a pure R version of the (C level) code of R's own pt(), additionally giving more flexibility (via arguments use.pnorm, itrmax, errmax whose defaults here have been hard-coded in R's C code called by pt()).

This implements an improved version of the AS 243 algorithm from Lenth(1989);

R's help on non-central pt() says:

This computes the lower tail only, so the upper tail suffers from cancellation and a warning will be given when this is likely to be significant.

and (in ‘Note:’)

The code for non-zero ncp is principally intended to be used for moderate values of ncp: it will not be highly accurate, especially in the tails, for large values.

pntR():

the Vectorize()d version of pntR1().

pntP94(), pntP94.1():

New versions of pntR1(), pntR(); using the Posten (1994) algorithm. pntP94() is the Vectorize()d version of pntP94.1().

pnt3150(), pnt3150.1():

Simple inefficient but hopefully correct version of pntP94..() This is really a direct implementation of formula (31.50), p.532 of Johnson, Kotz and Balakrishnan (1995)

pntLrg():

provides the pnorm() approximation (to the non-central t) from Abramowitz and Stegun (26.7.10), p.949; which should be employed only for large df and/or ncp.

pntJW39.0():

use the Jennett & Welch (1939) approximation see Johnson et al. (1995), p. 520, after (31.26a). This is still fast for huge ncp but has wrong asymptotic tail for |t| \to \infty. Crucially needs b=b_chi(df).

pntJW39():

is an improved version of pntJW39.0(), using 1-b =b_chi(df, one.minus=TRUE) to avoid cancellation when computing 1 - b^2.

pntGST23_T6():

(and pntGST23_T6.1() for informational purposes only) use the Gil et al.(2023)'s approximation of their Theorem 6.

pntGST23_1():

implements Gil et al.(2023)'s direct pbeta() based formula (1), which is very close to Lenth's algorithm.

pntVW13():

use MM's R translation of Viktor Witkowský (2013)'s matlab implementation.

Value

a number for pntJKBf1() and .pntJKBch1().

a numeric vector of the same length as the maximum of the lengths of x, df, ncp for pntJKBf() and .pntJKBch().

Author(s)

Martin Maechler

References

Johnson, N.L., Kotz, S. and Balakrishnan, N. (1995) Continuous Univariate Distributions Vol~2, 2nd ed.; Wiley; chapter 31, Section 5 Distribution Function, p.514 ff

Lenth, R. V. (1989). Algorithm AS 243 — Cumulative distribution function of the non-central t distribution, JRSS C (Applied Statistics) 38, 185–189.

Abramowitz, M. and Stegun, I. A. (1972) Handbook of Mathematical Functions. New York: Dover; formula (26.7.10), p.949

Posten, Harry O. (1994) A new algorithm for the noncentral t distribution function, Journal of Statistical Computation and Simulation 51, 79–87; \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/00949659408811623")}.

– not yet implemented –
Chattamvelli, R. and Shanmugam, R. (1994) An enhanced algorithm for noncentral t-distribution, Journal of Statistical Computation and Simulation 49, 77–83. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/00949659408811561")}

– not yet implemented –
Akahira, Masafumi. (1995). A higher order approximation to a percentage point of the noncentral t distribution, Communications in Statistics - Simulation and Computation 24:3, 595–605; \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/03610919508813261")}

Michael Perakis and Evdokia Xekalaki (2003) On a Comparison of the Efficacy of Various Approximations of the Critical Values for Tests on the Process Capability Indices CPL, CPU, and Cpk, Communications in Statistics - Simulation and Computation 32, 1249–1264; \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1081/SAC-120023888")}

Witkovský, Viktor (2013) A Note on Computing Extreme Tail Probabilities of the Noncentral T Distribution with Large Noncentrality Parameter, Acta Universitatis Palackianae Olomucensis, Facultas Rerum Naturalium, Mathematica 52(2), 131–143.

Gil A., Segura J., and Temme N.M. (2023) New asymptotic representations of the noncentral t-distribution, Stud Appl Math. 151, 857–882; \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1111/sapm.12609")} ; acronym “GST23”.

See Also

pt, for R's version of non-central t probabilities.

Examples

tt <- seq(0, 10, len = 21)
ncp <- seq(0, 6, len = 31)
pt3R   <- outer(tt, ncp, pt, , df = 3)
pt3JKB <- outer(tt, ncp, pntR, df = 3)# currently verbose
stopifnot(all.equal(pt3R, pt3JKB, tolerance = 4e-15))# 64-bit Lnx: 2.78e-16


## Gil et al.(2023) -- Table 1 p.869
str(GST23_tab1 <- read.table(header=TRUE, text = "
 x     pnt_x_delta              Rel.accuracy   l_y   j_max
 5     0.7890745035061528e-20    0.20e-13    0.29178   254
 8     0.1902963697413609e-07    0.40e-12    0.13863   294
11     0.4649258368179092e-03    0.12e-09    0.07845   310
14     0.2912746016055676e-01    0.11e-07    0.04993   317
17     0.1858422833307925e-00    0.41e-06    0.03441   321
20     0.4434882973203470e-00    0.82e-05    0.02510   323"))

x1 <- c(5,8,11,14,17,20)
(p1  <- pt  (x1, df=10.3, ncp=20))
(p1R <- pntR(x1, df=10.3, ncp=20)) # verbose=TRUE  is default
all.equal(p1, p1R, tolerance=0) # 4.355452e-15 {on x86_64} as have *no* LDOUBLE on R level
stopifnot(all.equal(p1, p1R))
## NB: According to Gil et al., the first value (x=5) is really wrong
## p1.23 <- .. Gil et al., Table 1:
p1.23.11 <- pntGST23_T6(x1, df=10.3, ncp=20, Mterms = 11)
p1.23.20 <- pntGST23_T6(x1, df=10.3, ncp=20, Mterms = 20, verbose=TRUE)
                                        # ==> Mterms = 11 is good only for x=5
p1.23.50 <- pntGST23_T6(x1, df=10.3, ncp=20, Mterms = 50, verbose=TRUE)

x <- 4:40 ; df <- 10.3
ncp <- 20
p1     <- pt        (x, df=df, ncp=ncp)
pG1    <- pntGST23_1(x, df=df, ncp=ncp)
pG1.bR <- pntGST23_1(x, df=df, ncp=ncp,
                     IxpqFUN = \(x, l_x=.5-x+.5, p, q) Ixpq(x,l_x, p,q))
pG1.BR <- pntGST23_1(x, df=df, ncp=ncp,
                     IxpqFUN = \(x, l_x, p, q)   pbeta(x, p,q))
cbind(x, p1, pG1, pG1.bR, pG1.BR)
all.equal(pG1, p1,     tolerance=0) # 1.034 e-12
all.equal(pG1, pG1.bR, tolerance=0) # 2.497031 e-13
all.equal(pG1, pG1.BR, tolerance=0) # 2.924698 e-13
all.equal(pG1.BR,pG1.bR,tolerance=0)# 1.68644  e-13
stopifnot(exprs = {
    all.equal(pG1, p1,     tolerance = 4e-12)
    all.equal(pG1, pG1.bR, tolerance = 1e-12)
    all.equal(pG1, pG1.BR, tolerance = 1e-12)
  })

ncp <- 40 ## is  > 37.62 = "critical" for Lenth' algorithm

### --------- pntVW13() --------------------------------------------------
## length 1 arguments:
str(rr <- pntVW13(t = 1, df = 2, ncp = 3, verbose=TRUE, keepS=TRUE))
all.equal(rr$cdf, pt(1,2,3), tol = 0)#  "Mean relative difference: 4.956769e-12"
stopifnot( all.equal(rr$cdf, pt(1,2,3)) )

str(rr <- pntVW13(t = 1:19, df = 2,    ncp = 3,    verbose=TRUE, keepS=TRUE))
str(r2 <- pntVW13(t = 1,    df = 2:20, ncp = 3,    verbose=TRUE, keepS=TRUE))
str(r3 <- pntVW13(t = 1,    df = 2:20, ncp = 3:21, verbose=TRUE, keepS=TRUE))

pt1.10.5_T <- 4.34725285650591657e-5 # Ex. 7 of Witkovsky(2013)
pt1.10.5 <- pntVW13(1, 10, 5)
all.equal(pt1.10.5_T, pt1.10.5, tol = 0)# TRUE! (Lnx Fedora 40; 2024-07-04);
			# 3.117e-16 (Macbuilder R 4.4.0, macOS Ventura 13.3.1)
stopifnot(exprs = {
    identical(rr$cdf, r1 <- pntVW13(t = 1:19, df = 2, ncp = 3))
    identical(r1[1], pntVW13(1, 2, 3))
    identical(r1[7], pntVW13(7, 2, 3))
    all.equal(pt1.10.5_T, pt1.10.5, tol = 9e-16)# NB even tol=0 (64 Lnx)
})
## However, R' pt() is only equal for the very first
cbind(t = 1:19, pntVW = r1, pt = pt(1:19, 2,3))


DPQ documentation built on Sept. 11, 2024, 8:37 p.m.