In adokter/intRvals: Analysis of Time-Ordered Event Data with Missed Observations

intervalpdf

R Documentation

Probability density function of an observed interval distribution

Description

Observed intervals are assumed to be sampled through observation of continuous distinct arrivals in time. Two consecutively observed arrivals mark the start and end of an interval. The probability that an arrival is not observed can be nonzero, in which case some observed intervals span several true intervals and cluster around integer multiples of the true interval.
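The sampling process described above can be illustrated with a small base-R simulation. This is only a sketch under stated assumptions: true intervals are drawn from a normal distribution, each arrival is missed independently with probability p, and all variable names are hypothetical rather than part of the package.

```r
set.seed(1)
mu <- 200; sigma <- 40; p <- 0.3
n <- 1e5

# Assumed for illustration: normally distributed true intervals
true_intervals <- rnorm(n, mean = mu, sd = sigma)
arrivals <- cumsum(true_intervals)       # arrival times
observed <- runif(n) >= p                # each arrival is missed with probability p

# Observed intervals run between consecutive observed arrivals
obs_intervals <- diff(arrivals[observed])

# Number of true intervals spanned by each observed interval
spans <- diff(which(observed))

prop.table(table(spans))[1:3]            # close to (1 - p) * p^(0:2)
mean(obs_intervals)                      # close to mu / (1 - p)
```

Observed intervals that span two or more true intervals are the ones that produce the additional peaks at multiples of the true mean interval.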

Usage

intervalpdf(
  data = seq(0, 1000),
  mu = 200,
  sigma = 40,
  p = 0.3,
  N = 5L,
  fun = "gamma",
  trunc = c(0, Inf),
  fpp = 0,
  sigma.within = NA
)

Arguments

`data`	A numeric vector of intervals for which to calculate the probability density
`mu`	The mean of the true interval distribution
`sigma`	The standard deviation of the true interval distribution
`p`	The probability that an arrival that marks the start or end of an interval is not observed
`N`	The maximum number of consecutive missed arrivals to take into consideration
`fun`	Assumed distribution family of the true interval distribution, one of "`normal`" or "`gamma`", corresponding to the Normal and GammaDist distributions
`trunc`	Use a truncated probability density function with range `trunc`
`fpp`	Baseline proportion of intervals distributed as a random Poisson process with mean arrival rate `mu`
`sigma.within`	Within-subject standard deviation, only available when `fun` is "`normal`"

Details

General

Intervals x are assumed to follow a standard distribution (either a normal or a gamma distribution) with probability density function φ(x|μ,σ), where μ is the mean arrival interval and σ its associated standard deviation. When the probability of not observing an arrival is nonzero, the probability density function φ_{obs} of observed arrival intervals is a superposition of several standard distributions, located at multiples of the fundamental mean arrival interval. Standard distribution i corresponds to those intervals where i arrivals have been missed consecutively. If p is the probability of not observing an arrival, then the probability P(i) of missing exactly i consecutive arrivals equals

P(i)=p^i-p^{i+1}
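As a quick numerical check of this expression, the following base-R sketch evaluates P(i) for a few values of i and compares it with the geometric distribution, since p^i - p^{i+1} = (1-p) p^i. The helper name `miss_prob` is hypothetical and not part of the package.

```r
# Probability of missing exactly i consecutive arrivals: P(i) = p^i - p^(i + 1)
miss_prob <- function(i, p) p^i - p^(i + 1)

p <- 0.3
i <- 0:5
P <- miss_prob(i, p)

# Identical to a geometric distribution with success probability 1 - p
all.equal(P, dgeom(i, prob = 1 - p))

# The terms i = 0, ..., 5 sum to 1 - p^6; the remainder p^6 is the
# probability of missing more than 5 arrivals in a row
sum(P)
```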

The width of standard distribution i will be broadened relative to the fundamental, according to standard uncertainty propagation in the case of addition. Both in the case of normal and gamma-distributed intervals (see next subsections) we may write for the observed probability density function, φ_{obs}:

φ_{obs}(x | μ, σ, p)=∑_{i=1}^∞ φ_{obs}(x,i | μ,σ,p)

with

φ_{obs}(x,i | μ, σ, p)= P(i-1) φ(x | i μ, √i σ)

In practice, this probability density function is well approximated when the infinite sum is capped at a finite integer N. By default the sum is run up to N=5.
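The capped sum can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's implementation: it uses the normal family, ignores truncation, `fpp` and `sigma.within`, and the function name `obs_pdf_sketch` is hypothetical.

```r
# Hypothetical helper: superposition of N normal components, where
# component i has weight P(i - 1), mean i * mu and sd sqrt(i) * sigma
obs_pdf_sketch <- function(x, mu, sigma, p, N = 5L) {
  dens <- numeric(length(x))
  for (i in seq_len(N)) {
    weight <- p^(i - 1) - p^i            # P(i - 1): i - 1 arrivals missed
    dens <- dens + weight * dnorm(x, mean = i * mu, sd = sqrt(i) * sigma)
  }
  dens
}

x <- seq(0, 2000, by = 1)
d <- obs_pdf_sketch(x, mu = 200, sigma = 40, p = 0.3)

# Probability mass captured by the capped sum (unit grid spacing),
# which is close to 1 - p^N; the infinite sum would integrate to 1
sum(d)
```

The mass left out by capping the sum at N is p^N, so for small p a modest N already captures nearly all of the density.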

Gamma-distributed intervals

By default intervals x are assumed to follow a Gamma (GammaDist) distribution Gamma(μ,σ)~dgamma(shape=μ^2/σ^2, scale=σ^2/μ) with a probability density function φ(x):

φ(x|μ,σ)~Gamma(μ,σ)

which has a mean μ and standard deviation σ.
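The parameterisation can be verified directly, using the standard gamma moments (mean = shape × scale, variance = shape × scale²). The sketch below also checks that applying the same mapping to mean iμ and standard deviation √i σ gives shape i × shape with an unchanged scale, which is the distribution of a sum of i independent gamma intervals. Variable names are illustrative only.

```r
mu <- 200; sigma <- 40
shape <- mu^2 / sigma^2                  # 25
scale <- sigma^2 / mu                    # 8

# Analytical moments of the gamma distribution
c(mean = shape * scale, sd = sqrt(shape) * scale)

# Numerical check of the mean against dgamma
m <- integrate(function(x) x * dgamma(x, shape = shape, scale = scale),
               lower = 0, upper = 1000)$value

# Component i of the observed pdf: mean i * mu, sd sqrt(i) * sigma
i <- 3
shape_i <- (i * mu)^2 / (sqrt(i) * sigma)^2
scale_i <- (sqrt(i) * sigma)^2 / (i * mu)
all.equal(c(shape_i, scale_i), c(i * shape, scale))
```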

Normal-distributed intervals

Intervals x may also be assumed to follow a Normal distribution N(μ,σ)~dnorm(mean=μ,sd=σ), with a probability density function φ(x):

φ(x|μ,σ)~N(μ,σ)

which also has a mean μ and standard deviation σ. Because intervals are by definition non-negative, the Normal distribution is always truncated at zero. In the limit μ≫σ the gamma distribution tends to the normal distribution.
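Both statements can be illustrated in base R. The sketch below defines a normal density truncated at zero by rescaling with the probability mass above zero, and compares it with the gamma density of the same mean and standard deviation for a wide and a narrow distribution. The helper names `dnorm_trunc0` and `gamma_vs_normal` are hypothetical, and the package may implement truncation differently.

```r
# Normal density truncated to [0, Inf): rescale by the mass retained above zero
dnorm_trunc0 <- function(x, mu, sigma) {
  keep <- 1 - pnorm(0, mean = mu, sd = sigma)
  ifelse(x >= 0, dnorm(x, mean = mu, sd = sigma) / keep, 0)
}

# Largest absolute difference between the gamma and truncated normal
# densities, relative to the peak height of the normal density
gamma_vs_normal <- function(mu, sigma, x = seq(0, 3 * mu, length.out = 601)) {
  dg <- dgamma(x, shape = mu^2 / sigma^2, scale = sigma^2 / mu)
  dn <- dnorm_trunc0(x, mu, sigma)
  max(abs(dg - dn)) / max(dn)
}

wide   <- gamma_vs_normal(mu = 200, sigma = 100)  # mu only 2 * sigma
narrow <- gamma_vs_normal(mu = 200, sigma = 20)   # mu = 10 * sigma

# The discrepancy shrinks as mu becomes large relative to sigma
c(wide = wide, narrow = narrow)
```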

Within and between-subject variation

To account for within-subject and between-subject differences in mean interval length, we define σ_w as the within-subject standard deviation in interval length and σ_b as the between-subject standard deviation in interval length, with σ^2 = σ^2_b + σ^2_w. In the normal limit (μ≫σ) the population pdf will be a convolution between φ(x|μ,σ_b) and φ(x|μ,σ_w) equal to:

φ_{obs}(x | μ,σ,p)=∑_{i=1}^∞ P(i-1) φ(x | i μ,√{i {σ_w}^2 + σ^2})

Value

This function returns a list of points describing the interval distribution

Examples

# a low probability of not observing an arrival
# results in an observed PDF with primarily
# a single peak, with a mean and standard
# deviation almost identical to the true interval
# distribution:
plot(intervalpdf(mu=200,sigma=40,p=0.01),type='l',col='red')

# a higher probability to miss an arrival
# results in an observed PDF with multiple
# peaks at integer multiples of the mean of the true
# interval distribution
plot(intervalpdf(mu=200,sigma=40,p=0.4),type='l',col='red')

adokter/intRvals documentation built on May 4, 2022, 10:40 p.m.