crossGeneral: Computations of Boundary Crossing Probabilities for the...
In STAR: Spike Train Analysis with R

Description Usage Arguments Details Value Warning Note Author(s) References See Also Examples

Computes the distribution of the first passage time through an arbitrary (crossGeneral) or a "tight" (crossTight) boundary for a Wiener process. The method of Loader and Deely (1987) is used. A tight boundary is a boundary generating the tighest confidence band for the process (Kendall et al, 2007). Utility function and methods: mkTightBMtargetFct, print, summary, plot, lines, are also provided to use and explore the results.

crossGeneral(tMax = 1, h = 0.001, cFct, cprimeFct, bFct, withBounds = FALSE, Lplus)
crossTight(tMax = 1, h = 0.001, a = 0.3, b = 2.35, withBounds = TRUE, logScale = FALSE)
mkTightBMtargetFct(ci = 0.95, tMax = 1, h = 0.001, logScale = FALSE)
## S3 method for class 'FirstPassageTime'
print(x, ...)
## S3 method for class 'FirstPassageTime'
summary(object, digits, ...)
## S3 method for class 'FirstPassageTime'
plot(x, y, which = c("Distribution", "density"), xlab, ylab, ...)
## S3 method for class 'FirstPassageTime'
lines(x, which = c("Distribution", "density"), ...)

`tMax`	A positive `numeric`. The "time" during which the Wiener process is followed.
`h`	A positive `numeric`. The integration time step used for the numerical solution of the Volterra integral equation (see `details`).
`cFct`	A `function` defining the boundary to be crossed. The first argument of the function should be a "time" argument. If the first argument is a vector, the function should return a vector of the same length.
`cprimeFct`	A `function` defining time derivative of the boundary to be crossed. Needs to be specified only if a check of the sign of the kernel derivative (see `details`) is requested. The first argument of the function should be a "time" argument. If the first argument is a vector, the function should return a vector of the same length.
`bFct`	A `function`. The "b" function of Loader and Deely (1987). Does not need to be specified (i.e., can be missing) but can be used to improve convergence. The first argument of the function should be a "time" argument. If the first argument is a vector, the function should return a vector of the same length.
`withBounds`	A `logical`. Should bounds on the distribution be calculated? If yes, set it to `TRUE`, leave it to its default value, `FALSE`, otherwise.
`Lplus`	A `logical`. If bounds are requested (`withBounds=TRUE`) and if the sign of the time derivative of the kernel is known to be positive or null, set to `TRUE`, if it is known to be negative, set it to `FALSE`. If the sign is unknown, leave `Lplus` unspecified and provide a `cprimeFct` function.
`logScale`	A `logical`. Should intermediate calculations in `crossTight` be carried out on the log scale for numerical precision? If yes, set it to `TRUE`, leave it to its default, `FALSE`, otherwise.
`a,b`	`numerics`, the two parameters of the "tight" boundary: `c(t) = a + b*sqrt(t)`. See details.
`ci`	A `numeric` larger than 0 and smaller than 1. The nominal coverage probability desired for a "tight" confidence band (see `details`).
`x,object`	A `FirstPassageTime` object returned by `crossGeneral` or `crossTight`.
`y`	Not used but required for a `plot` method.
`which`	A `character` string, "`Distribution`" or "`density`", specifying if a probability distribution or a probability density should be graphed.
`xlab,ylab`	See `plot`.
`digits`	A positive `integer`. The number of digits to print in `summary`. If bounds were computed, the value of `digits` is computed internally based on the bounds width.
`...`	Used in `plot` and `lines` to pass further arguments (see `plot` and `lines`), not used in `print` and `summary`.

The Loader and Deely (1987) method to compute the probability G(t) that the first passage of a Wiener process / Brownian motion occurs between 0 and t (argument tMax of crossGeneral and crossTight) through a boundary defined by c(t) is based on the numerical solution of a Volterra integral equation of the first kind satisfied by G() and defined by their Eq. 2.2:

F(t) = \int_0^t K(t,u) dG(u)

where, F(t) is defined by:

F(t) = pnorm(-c(t)/sqrt(t)) + exp(-2*b(t)*(c(t)-t*b(t)))*pnorm((-c(t)+2*t*b(t))/sqrt(t))

K(t,u) is defined by:

K(t,u)=pnorm((c(u)-c(t))/sqrt(t-u)) + exp(-2*b(t)*(c(t)-c(u)-(t-u)*b(t)))*pnorm((c(u)-c(t)+2*(t-u)*b(t))/(sqrt(t-u)))

and b(t) is an additional function (that can be uniformly 0) that is chosen to improve convergence speed and to get error bounds. Argument h is the step size used in the numerical solution of the above Volterra integral equation. The mid-point method (Eq. 3.1 and 3.2 of Loader and Deely (1987)) is implemented. If tMax is not a multiple of h it is modified as follows: tMax <- round(tMax/h)*h.

crossGeneral generates functions F() and K(,) internally given c() (argument cFct) and b() (argument bFct). If bFct is not given (i.e., missing(bFct) returns TRUE) it is taken as uniformly 0. If a numeric is given for cFct then c() is defined as a uniform function returning the first element of the argument (cFct).

Function crossTight assumes the following functional form for c(): a + b * sqrt(t). b() is set to c'() (the derivative of c()). Arguments a and b of crossTight correspond to the 2 parameters of c().

If argument withBounds is set to TRUE then bounds on G() are computed. Function crossTight uses Eq. 3.6 and 3.7 of Loader and Deely (1987) to compute these bounds, Gl(t) and Gu(t). Function crossGeneral uses Eq. 3.6 and 3.7 (if argument Lplus is set to TRUE) or Eq. 3.10 and 3.11 (if argument Lplus is set to FALSE). Here Lplus stands for the sign of the partial derivative of the kernel K(,) with respect to its second argument. If the sign is not known the user can provide the derivative c'() of c() as argument cprimeFct. A (slow) numerical check is then performed to decide wether Lplus should be TRUE or FALSE or if it changes sign (in which case bounds cannot be obtained and an error is returned).

In function crossTight argument logScale controls the way some intermediate computations of the mid-point method are implemented. If set to FALSE (the default) a literal implementation of Eq. 3.2 of Loader and Deely (1987) is used. If set to TRUE then additions subtractions are computed on the log scale using functions logspace_add and logspace_sub of the R API. The computation is then slightly slower and it turns out that the gain in numerical precision is not really significant, so you can safely leave this argument to its default value.

crossGeneral and crossTight return a FirstPassageTime object which is a list with the following components:

`time`	A `numeric` vector of "times" at which the first passage time probability has been evaluated.
`G`	A `numeric` vector of first passage probability.
`Gu`	A `numeric` vector with the upper bound of first passage probability. Only if `withBounds` was set to `TRUE`.
`Gl`	A `numeric` vector with the lower bound of first passage probability. Only if `withBounds` was set to `TRUE`.
`mids`	A `numeric` vector of "times" at which the first passage time probability density has been evaluated. Mid points of component `time`.
`g`	A `numeric` vector of first passage probability density.
`h`	A `numeric`. The value of argument `h` of `crossGeneral` or `crossTight`.
`call`	The matched call.

mkTightBMtargetFct returns a function which can be used in optim. This function returns the square of the difference between (1-ci)/2 (remember the "symmetry" of the Wiener processes paths, that is, for every path there is a symmetric one with respect to the abscissa having with the same probability) and the probability to have the first passage time of the Wiener process smaller or equal to 1 when the boundary is the "tight" boundary defined by: a + b*sqrt(t). The function takes a single vector argument containing the log of the parameters a (vector's first element) and b (vector's second element).

Methods print.FirstPassageTime and summary.FirstPassageTime output the probability to observe the first exit between 0 and tMax. If bounds were computed, the precision on the probability is also returned (as an attribute for print.FirstPassageTime). summary.FirstPassageTime also gives the integration time step, h, used.

crossGeneral with withBounds = TRUE and a negative kernel derivative is presently poorly tested, so be careful and let me know if mistakes show up.

Using logScale = TRUE in crossTight seems to be an overkill, that is, it doubles computation's time without bringing significant numerical improvement.

crossGeneral is for now pure R code. The first passage probability is obtained by solving the lower triangular system (Eq. 3.1 of Loader and Deely (1987)) with forwardsolve and is therefore rather fast (but can be memory "hungry"). The bounds are computed by 2 nested loops and can therefore be long to compute.

crossTight is calling a C code and is fast.

Loader and Deely paper also describes a method where G(t) is solution of a Volterra integral equation of the second kind (their Eq. 2.7). This approach is presently not implemented in the above functions.

Christophe Pouzat christophe.pouzat@gmail.com

C. R. Loader and J. J. Deely (1987) Computations of Boundary Crossing Probabilities for the Wiener Process. J. Statist. Comput. Simul. 27: 95–105.

W. S. Kendall, J.- M. Marin and C. P. Robert (2007) Brownian Confidence Bands on Monte Carlo Output. Statistics and Computing 17: 1–10. Preprint available at: http://www.ceremade.dauphine.fr/%7Exian/kmr04.rev.pdf

print, summary, plot, lines, pinvgauss

## Not run: 
## Reproduce Table 1 (p 101) of Loader and Deely (1987)
## define a vector of n values
nLD <- c(8,16,32,64,128)

## Part 1: c(t) = sqrt(1+t) and tMax=1
## define cFct
cFT1p1 <- function(t) sqrt(1+t)
## define the different bFct
bFT1p1.ii <- function(t) 0.5/sqrt(1+t)
bFT1p1.iii <- function(t) (cFT1p1(t)-cFT1p1(0))/t 
bFT1p1.iv <- function(t) 0.5*(bFT1p1.ii(t)+bFT1p1.iii(t)) 
bFT1p1.v <- function(t) (2*t-4/5*((1+t)^2.5-1))/t^3+3*cFT1p1(t)/2/t
## Do the calculations
round(t(sapply(nLD,
               function(n) {
                 c(n=n,
                   i=crossGeneral(tMax=1,h=1/n,cFct=cFT1p1)$G[n],
                   ii=crossGeneral(tMax=1,h=1/n,cFct=cFT1p1,bFct=bFT1p1.ii)$G[n],
                   iii=crossGeneral(tMax=1,h=1/n,cFct=cFT1p1,bFct=bFT1p1.iii)$G[n],
                   iv=crossGeneral(tMax=1,h=1/n,cFct=cFT1p1,bFct=bFT1p1.iv)$G[n],
                   v=crossGeneral(tMax=1,h=1/n,cFct=cFT1p1,bFct=bFT1p1.v)$G[n])})),
      digits=6)

## Part 2: c(t) = exp(-t) and tMax=1
## define cFct
cFT1p2 <- function(t) exp(-t)
## define the different bFct
cFT1p2 <- function(t) exp(-t)
bFT1p2.ii <- function(t) -exp(-t)
bFT1p2.iii <- function(t) (cFT1p2(t)-cFT1p2(0))/t 
bFT1p2.iv <- function(t) 0.5*(bFT1p2.ii(t)+bFT1p2.iii(t)) 
bFT1p2.v <- function(t) 3*(1-t-exp(-t))/t^3+3*cFT1p2(t)/2/t
## Do the calculations
round(t(sapply(nLD,
               function(n) {
                 c(n=n,
                   i=crossGeneral(tMax=1,h=1/n,cFct=cFT1p2)$G[n],
                   ii=crossGeneral(tMax=1,h=1/n,cFct=cFT1p2,bFct=bFT1p2.ii)$G[n],
                   iii=crossGeneral(tMax=1,h=1/n,cFct=cFT1p2,bFct=bFT1p2.iii)$G[n],
                   iv=crossGeneral(tMax=1,h=1/n,cFct=cFT1p2,bFct=bFT1p2.iv)$G[n],
                   v=crossGeneral(tMax=1,h=1/n,cFct=cFT1p2,bFct=bFT1p2.v)$G[n])})),
      digits=6)

## Part 3: c(t) = t^2 + 3*t + 1 and tMax=1
## define cFct
cFT1p3 <- function(t) t^2+3*t+1
## define the different bFct
bFT1p3.ii <- function(t) 2*t+3
bFT1p3.iii <- function(t) (cFT1p3(t)-cFT1p3(0))/t 
bFT1p3.v <- function(t) 5*t/4+3
bFT1p3.vi <- function(t) rep(3,length(t))
round(t(sapply(nLD,
               function(n) {
                 c(n=n,
                   i=crossGeneral(tMax=1,h=1/n,cFct=cFT1p3)$G[n],
                   ii=crossGeneral(tMax=1,h=1/n,cFct=cFT1p3,bFct=bFT1p3.ii)$G[n],
                   iii=crossGeneral(tMax=1,h=1/n,cFct=cFT1p3,bFct=bFT1p3.iii)$G[n],
                   v=crossGeneral(tMax=1,h=1/n,cFct=cFT1p3,bFct=bFT1p3.v)$G[n],
                   vi=crossGeneral(tMax=1,h=1/n,cFct=cFT1p3,bFct=bFT1p3.vi)$G[n])})),
      digits=6)

## Part 3: c(t) = t^2 + 3*t + 1 and tMax=1
## define cFct
cFT1p4 <- function(t) 1/t
## Here only column (i) and (vii) are reproduced.
## If one attempts to reproduce (ii) directly with crossGeneral
## a NaN appears (when a -Inf is the correct value) in functions
## F() and K(,) (see details) for instance when t=0 in F.
## Then as crossGeneral is presently written R attempts to
## compute t*b(t), where b(t) is c'(t), that is, t*(-1/t^2) which is
## NaN (for R) when t=0.
bFT1p4.vii <- function(t) rep(-1,length(t))
round(t(sapply(nLD,
               function(n) {
                 c(n=n,
                   i=crossGeneral(tMax=1,h=1/n,cFct=cFT1p4)$G[n],
                   vii=crossGeneral(tMax=1,h=1/n,cFct=cFT1p4,bFct=bFT1p4.vii)$G[n])})),
      digits=6)
## The last 3 rows of column (vii) are not the same as in the paper

## Reproduce Table 4 (p 104) of Loader and Deely (1987)
## As before the probability of first passage between
## 0 and 1 is computed. This time only three boundary
## functions are considered. The error bounds are
## obtained

## Part 1: c(t) = sqrt(1+t)
## Left columns pair: b(t) = c'(t)
round(t(sapply(nLD,
               function(n) {
                 res <- crossGeneral(tMax=1,h=1/n,cFct=cFT1p1,bFct=bFT1p1.ii,withBounds=TRUE,Lplus=TRUE)
                 c(Gl=res$Gl[n],Gu=res$Gu[n])
               }
               )
         ),
       digits=5)

## Right columns pair: b(t) = 0
round(t(sapply(nLD,
               function(n) {
                 res <- crossGeneral(tMax=1,h=1/n,cFct=cFT1p1,withBounds=TRUE,Lplus=TRUE)
                 c(n=n,Gl=res$Gl[n],Gu=res$Gu[n])
               }
               )
         ),
       digits=5)

## Part 2: c(t) = t^2 + 3*t + 1
## Left columns pair: b(t) = 3 - 2*t
round(t(sapply(nLD,
               function(n) {
                 res <- crossGeneral(tMax=1,h=1/n,cFct=cFT1p3,bFct=function(t) 3-2*t,withBounds=TRUE,Lplus=TRUE)
                 c(n=n,Gl=res$Gl[n],Gu=res$Gu[n])
               }
               )
         ),
       digits=5)

## Right columns pair: b(t) = 3 - t
round(t(sapply(nLD,
               function(n) {
                 res <- crossGeneral(tMax=1,h=1/n,cFct=cFT1p3,bFct=function(t) 3-2*t,withBounds=TRUE,Lplus=TRUE)
                 c(n=n,Gl=res$Gl[n],Gu=res$Gu[n])
               }
               )
         ),
       digits=5)

## Part 3: c(t) = 1 + sin(t)
## Left columns pair: b(t) = c'(t)
round(t(sapply(nLD,
               function(n) {
                 res <- crossGeneral(tMax=1,h=1/n,cFct=function(t) 1+sin(t),bFct=function(t) cos(t),withBounds=TRUE,Lplus=TRUE)
                 c(n=n,Gl=res$Gl[n],Gu=res$Gu[n])
               }
               )
        ),
      digits=5)

## Left columns pair: b(t) = 0.5
round(t(sapply(nLD,
               function(n) {
                 res <- crossGeneral(tMax=1,h=1/n,cFct=function(t) 1+sin(t),bFct=function(t) rep(0.5,length(t)),withBounds=TRUE,Lplus=TRUE)
                 c(n=n,Gl=res$Gl[n],Gu=res$Gu[n])
               }
               )
        ),
      digits=5)


## Check crossGeneral against an analytical inverse Gaussian
## distribution
## Define inverse Gaussian parameters
mu.true <- 0.075
sigma2.true <- 3
## Define a function transforming the "drift" (mu.true) and
## "noise variance" (sigma2.true) of the default inverse
## Gaussian parametrization of STAR into a
## linear boundary of an equivalent Wiener process first
## passage time problem
star2ld <- function(mu,sigma2) c(sqrt(1/sigma2),-sqrt(1/sigma2)/mu)
## Get the "equivalent" boundary parameters (y intercept and slope)
parB1 <- star2ld(mu.true,sigma2.true)
## Plot the "target" inverse Gaussian density
xx <- seq(0.001,0.3,0.001)
plot(xx,dinvgauss(xx,mu=mu.true,sigma2=sigma2.true),type="l")
## Get the numerical estimate of the density using Loader and
## Deely Volterra integral equation method
igB1 <- crossGeneral(tMax=0.3,h=0.001,cFct=function(t) parB1[1]+parB1[2]*t,withBounds=FALSE)
## superpose the numerical estimate to the exact solution
## use lines method to do that
lines(igB1,"density",col=2)

## Use of crossTight and associated function
## Get the paramters, a and b, of the "approximate"
## tightest boundary: c(t) = a + b*sqrt(t), giving a 
## 0.05 probability of exit between 0 and 1
## (in fact we are discussing here a pair of symmetrical
## bounaries, c(t) and -c(t)). See Kendall et al (2007)
## for details
## Start by defining the target function
target95 <- mkTightBMtargetFct(ci=0.95)
## get the optimal log(a) and log(b) using
## the values of table 1 of Kendall et al as initial
## guesses
p95 <- optim(log(c(0.3,2.35)),target95,method="BFGS")
## check the convergence of BFGS
p95$convergence
## check if the parameters changed a lot
exp(p95$par)
## Get the bounds on G(1) for these optimal parameters
d95 <- crossTight(a=exp(p95$par[1]),b=exp(p95$par[2]),withBound=TRUE,logScale=FALSE)
## print out the summary
summary(d95)
## Do the same for the 0.01 probability of first passage
target99 <- mkTightBMtargetFct(ci=0.99)
p99 <- optim(p95$par,target99,method="BFGS")
p99$convergence
exp(p99$par)
d99 <- crossTight(a=exp(p99$par[1]),b=exp(p99$par[2]),withBound=TRUE,logScale=FALSE)
summary(d99) 

## End(Not run)