dnt: Non-central t-Distribution Density - Algorithms and...

dntR Documentation

Non-central t-Distribution Density - Algorithms and Approximations

Description

dntJKBf1 implements the summation formulas of Johnson, Kotz and Balakrishnan (1995), (31.15) on page 516 and (31.15') on p.519, the latter being typo-corrected for a missing factor 1 / j!.

dntJKBf() is Vectorize(dntJKBf1, c("x","df","ncp")), i.e., works vectorized in all three main arguments x, df and ncp.

The functions .dntJKBch1() and .dntJKBch() are only there for didactical reasons allowing to check that indeed formula (31.15') in the reference is missing a j! factor in the denominator.

The dntJKBf*() functions are written to also work with arbitrary precise numbers of class "mpfr" (from package Rmpfr) as arguments.

Usage


dntJKBf1(x, df, ncp, log = FALSE, M = 1000)
dntJKBf (x, df, ncp, log = FALSE, M = 1000)

## The "checking" versions, only for proving correctness of formula:
.dntJKBch1(x, df, ncp, log = FALSE, M = 1000, check=FALSE, tol.check = 1e-7)
.dntJKBch (x, df, ncp, log = FALSE, M = 1000, check=FALSE, tol.check = 1e-7)

Arguments

x, df, ncp

see R's dt(); note that each can be of class "mpfr".

log

as in dt(), a logical indicating if \log(f(x,*)) should be returned instead of f(x,*).

M

the number of terms to be used, a positive integer.

check

logical indicating if checks of the formula equalities should be done.

tol.check

tolerance to be used for all.equal() when check is true.

Details

How to choose M optimally has not been investigated yet and is probably also a function of the precision of the first three arguments (see getPrec from Rmpfr).

Note that relatedly, R's source code ‘R/src/nmath/dnt.c’ has claimed from 2003 till 2014 but wrongly that the noncentral t density f(x, *) was

    f(x, df, ncp) =
 	df^(df/2) * exp(-.5*ncp^2) /
 	(sqrt(pi)*gamma(df/2)*(df+x^2)^((df+1)/2)) *
 	   sum_{k=0}^Inf  gamma((df + k + df)/2)*ncp^k / prod(1:k)*(2*x^2/(df+x^2))^(k/2) .

These functions (and this help page) prove that it was wrong.

Value

a number for dntJKBf1() and .dntJKBch1().

a numeric vector of the same length as the maximum of the lengths of x, df, ncp for dntJKBf() and .dntJKBch().

Author(s)

Martin Maechler

References

Johnson, N.L., Kotz, S. and Balakrishnan, N. (1995) Continuous Univariate Distributions Vol~2, 2nd ed.; Wiley; chapter 31, Section 5 Distribution Function, p.514 ff

See Also

R's dt; (an improved version of) Viechtbauer's proposal: dtWV.

Examples

tt <-  seq(0, 10, length.out = 21)
ncp <- seq(0,  6, length.out = 31)
dt3R   <- outer(tt, ncp, dt,     df = 3)
dt3JKB <- outer(tt, ncp, dntJKBf, df = 3)
all.equal(dt3R, dt3JKB) # Lnx(64-b): 51 NA's in dt3R

x <- seq(-1,12, by=1/16)
fx <- dt(x, df=3, ncp=5)
re1 <- 1 - .dntJKBch(x, df=3, ncp=5) / fx ; summary(warnings()) # slow, with warnings
op <- options(warn = 2) # (=> warning == error, for now)
re2 <- 1 -  dntJKBf (x, df=3, ncp=5) / fx  # faster, no warnings
stopifnot(all.equal(re1[!is.na(re1)], re2[!is.na(re1)], tol=1e-6))
head( cbind(x, fx, re1, re2) , 20)
matplot(x, log10(abs(cbind(re1, re2))), type = "o", cex = 1/4)

## One of the numerical problems in "base R"'s non-central t-density:
options(warn = 0) # (factory def.)
x <- 2^seq(-12, 32, by=1/8) ; df <- 1/10
dtm <- cbind(dt(x, df=df,           log=TRUE),
             dt(x, df=df, ncp=df/2, log=TRUE),
             dt(x, df=df, ncp=df,   log=TRUE),
             dt(x, df=df, ncp=df*2, log=TRUE)) #.. quite a few warnings:
summary(warnings())
matplot(x, dtm, type="l", log = "x", xaxt="n",
        main = "dt(x, df=1/10, log=TRUE) central and noncentral")
sfsmisc::eaxis(1)
legend("right", legend=c("", paste0("ncp = df",c("/2","","*2"))),
       lty=1:4, col=1:4, bty="n")

(doExtras <- DPQ:::doExtras()) # TRUE e.g. if interactive()
(ncp <- seq(0, 12, by = if(doExtras) 3/4 else 2))
names(ncp) <- nnMs <- paste0("ncp=", ncp)
tt  <- seq(0, 5, by =  1)
dt3R <- outer(tt, ncp, dt,   df = 3)
if(requireNamespace("Rmpfr")) withAutoprint({
   mt  <- Rmpfr::mpfr(tt , 128)
   mcp <- Rmpfr::mpfr(ncp, 128)
   system.time(
       dt3M <- outer(mt, mcp, dntJKBf, df = 3,
                     M = if(doExtras) 1024 else 256)) # M=1024: 7 sec [10 sec on Winb]
   relE <- Rmpfr::asNumeric(sfsmisc::relErrV(dt3M, dt3R))
   relE[tt != 0, ncp != 0]
})

## all.equal(dt3R, dt3V, tol=0) # 1.2e-12

 # ---- using MPFR high accuracy arithmetic (too slow for routine testing) ---
## no such kink here:
x. <- if(requireNamespace("Rmpfr")) Rmpfr::mpfr(x, 256) else x
system.time(dtJKB <- dntJKBf(x., df=df, ncp=df, log=TRUE)) # 43s, was 21s and only 7s ???
lines(x, dtJKB, col=adjustcolor(3, 1/2), lwd=3)
options(op) # reset to prev.

## Relative Difference / Approximation errors :
plot(x, 1 - dtJKB / dtm[,3], type="l", log="x")
plot(x, 1 - dtJKB / dtm[,3], type="l", log="x", xaxt="n", ylim=c(-1,1)*1e-3); sfsmisc::eaxis(1)
plot(x, 1 - dtJKB / dtm[,3], type="l", log="x", xaxt="n", ylim=c(-1,1)*1e-7); sfsmisc::eaxis(1)
plot(x, abs(1 - dtJKB / dtm[,3]), type="l", log="xy", axes=FALSE, main =
     "dt(*, 1/10, 1/10, log=TRUE) relative approx. error",
     sub= paste("Copyright (C) 2019  Martin Maechler  --- ", R.version.string))
for(j in 1:2) sfsmisc::eaxis(j)


DPQ documentation built on Nov. 3, 2024, 3 a.m.