estinterval: Estimate interval model accounting for missed arrival...

View source: R/DroppingInterval.R

estinterval    R Documentation

Estimate interval model accounting for missed arrival observations

Description

Estimate interval mean and variance accounting for missed arrival observations, by fitting the probability density function intervalpdf to the interval data.

Usage

estinterval(
  data,
  mu = median(data),
  sigma = sd(data)/2,
  p = 0.2,
  N = 5L,
  fun = "gamma",
  trunc = c(0, Inf),
  fpp = (if (fpp.method == "fixed") 0 else 0.1),
  fpp.method = "auto",
  p.method = "auto",
  conf.level = 0.9,
  group = NA,
  sigma.within = NA,
  iter = 10,
  tol = 0.001,
  silent = F,
  ...
)

Arguments

data

A numeric list of intervals.

mu

Start value for the numeric optimization for the mean arrival interval.

sigma

Start value for the numeric optimization for the standard deviation of the arrival interval.

p

Start value for the numeric optimization for the probability to not observe an arrival.

N

Maximum number of missed observations to be taken into account (default N=5).

fun

Assumed distribution for the intervals, one of "normal" or "gamma", corresponding to the Normal and GammaDist distributions.

trunc

Use a truncated probability density function with range trunc.

fpp

Baseline proportion of intervals distributed as a random Poisson process with mean arrival interval mu.

fpp.method

A string equal to 'fixed' or 'auto'. When 'auto', fpp is optimized as a free model parameter, and the value of fpp is used as the start value in the optimization.

p.method

A string equal to 'fixed' or 'auto'. When 'auto', p is optimized as a free model parameter, and the value of p is used as the start value in the optimization.

conf.level

Confidence level for the deviance test that checks whether a model with a nonzero missed event probability p significantly outperforms a model without a missed event probability (p=0).

group

Optional vector of the same length as data, indicating the group or subject in which each interval was observed.

sigma.within

Optional within-subject standard deviation. When equal to the default 'NA', no additional between-subject effect is assumed and sigma.within equals sigma. When equal to 'auto', an estimate is obtained by iteratively calling partition.

iter

Maximum number of iterations in the numerical iteration for sigma.within.

tol

Tolerance in the iteration. When sigma.within changes by less than this value in one iteration step, the optimization is considered converged.

silent

Logical. When TRUE, no information is printed to the console.

...

Additional arguments to be passed to optim.
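
To illustrate what truncating and renormalizing a density over the range trunc means, here is a minimal base-R sketch for a gamma density. The helper dgamma_trunc and the numeric values are hypothetical and chosen for illustration only; the package's own implementation may differ in detail.

```r
# Hypothetical helper (not part of the package): gamma density truncated to
# the range [a, b] and rescaled so that it integrates to one over that range.
dgamma_trunc <- function(x, shape, rate, a = 0, b = Inf) {
  mass <- pgamma(b, shape = shape, rate = rate) -
    pgamma(a, shape = shape, rate = rate)
  ifelse(x >= a & x <= b, dgamma(x, shape = shape, rate = rate) / mass, 0)
}

# Assumed example values: mean 100, sd 20, truncation range c(60, 300)
shape <- (100 / 20)^2
rate  <- 100 / 20^2
area <- integrate(
  function(x) dgamma_trunc(x, shape, rate, a = 60, b = 300),
  60, 300
)$value
area  # close to 1 up to numerical integration error
```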

Details

The probability density function for observed intervals intervalpdf is fit to data by maximization of the associated log-likelihood using optim.
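
The following base-R sketch illustrates the general idea of such a maximum-likelihood fit on simulated data. The density mixture_pdf is a simplified stand-in written for this example (a gamma mixture in which an observed interval spans i + 1 fundamental intervals when i arrivals are missed); it is an assumption and not the package's intervalpdf. The simulated parameter values are likewise arbitrary.

```r
# Simplified stand-in density (an assumption, not intervalpdf): with i missed
# arrivals, an observed interval is the sum of i + 1 gamma-distributed intervals.
mixture_pdf <- function(x, mu, sigma, p, N = 5L) {
  shape <- (mu / sigma)^2          # gamma shape from mean and sd
  rate  <- mu / sigma^2            # gamma rate from mean and sd
  w <- p^(0:N) * (1 - p)           # probability of i consecutive misses
  w <- w / sum(w)                  # renormalize after capping at N misses
  dens <- vapply(0:N, function(i) {
    w[i + 1] * dgamma(x, shape = (i + 1) * shape, rate = rate)
  }, numeric(length(x)))
  rowSums(matrix(dens, nrow = length(x)))
}

# Negative log-likelihood; parameters are transformed so that mu and sigma
# stay positive and p stays between 0 and 1 during the optimization.
negloglik <- function(par, x) {
  -sum(log(mixture_pdf(x, mu = exp(par[1]), sigma = exp(par[2]),
                       p = plogis(par[3]))))
}

# Simulate intervals with mu = 100, sigma = 20 and miss probability 0.3
set.seed(1)
n <- 500
missed <- rgeom(n, prob = 1 - 0.3)
x <- rgamma(n, shape = (missed + 1) * 25, rate = 0.25)

# Start values mirror the defaults in Usage: median, sd / 2 and p = 0.2
start <- c(log(median(x)), log(sd(x) / 2), qlogis(0.2))
fit <- optim(start, negloglik, x = x)
est <- c(mu = exp(fit$par[1]), sigma = exp(fit$par[2]), p = plogis(fit$par[3]))
est
```

Optimizing on a transformed scale is a design choice of this sketch; it avoids the need for box constraints in optim.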

Within-group variation sigma.within may be separated from the total variation sigma in an iterative fit of intervalpdf on the interval data. In each iteration, partition is used (1) to determine which intervals are fundamental intervals according to the fit, at a confidence level conf.level, and (2) to partition the within-group variation from the total variation in interval length.

Within- and between-group variation is estimated on the subset of fundamental intervals with repeated measures only. As the set of fundamental intervals depends on the precise value of sigma.within, the fit of intervalpdf and the subsequent estimation of sigma.within using partition are iterated until both converge to a stable solution. Parameters tol and iter set the threshold for convergence and the maximum number of iterations.

We note that an exponential interval model can be fitted by setting fpp=1 and fpp.method='fixed'.
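
As background for the exponential case: intervals from a random Poisson process with mean arrival interval mu follow an exponential distribution with rate 1/mu. The base-R sketch below, which does not use the package, fits that model by maximum likelihood to simulated intervals; the mean of 150 and the search range are assumed values for illustration.

```r
# Simulated intervals from a Poisson process with mean interval 150 (assumed)
set.seed(42)
x <- rexp(1000, rate = 1 / 150)

# Negative log-likelihood of an exponential interval model with mean mu
negloglik <- function(mu) -sum(dexp(x, rate = 1 / mu, log = TRUE))

# One-dimensional maximum-likelihood fit over an assumed search range
fit <- optimize(negloglik, interval = c(1, 1000))

# For the exponential model the ML estimate of mu equals the sample mean
c(mle = fit$minimum, sample_mean = mean(x))
```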

Value

This function returns an object of class intRvals, which is a list containing the following:

data

the interval data

mu

the modelled mean interval

mu.se

the modelled mean interval standard error

sigma

the modelled interval standard deviation

p

the modelled probability to not observe an arrival

fpp

the modelled fraction of arrivals following a random Poisson process, see intervalpdf

N

the highest number of consecutive missed arrivals taken into account, see intervalpdf

convergence

convergence field of optim

counts

counts field of optim

loglik

vector of length 2, with first element the log-likelihood of the fitted model, and second element the log-likelihood of the model without a missed event probability (i.e. p=0)

df.residual

degrees of freedom, a 2-vector (1, number of intervals - n.param)

n.param

number of optimized model parameters

p.chisq

p value for a likelihood-ratio test of a model including a miss probability against a model without a miss probability

distribution

assumed interval distribution, one of 'gamma' or 'normal'

trunc

interval range over which the interval pdf was truncated and normalized

fpp.method

A string equal to 'fixed' or 'auto'. When 'auto' fpp has been optimized as a free model parameter

p.method

A string equal to 'fixed' or 'auto'. When 'auto' p has been optimized as a free model parameter
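
The loglik and p.chisq fields listed above are related through a likelihood-ratio test. A standard version of that test can be sketched in base R as follows. The log-likelihood values are hypothetical, and the single degree of freedom (one extra parameter, p) is an assumption of this sketch; the package may compute p.chisq differently in detail.

```r
# Hypothetical log-likelihoods in the order of the loglik field:
# fitted model first, model without missed events (p = 0) second.
loglik <- c(-1502.3, -1530.8)

# Likelihood-ratio statistic: twice the gain in log-likelihood
lr_stat <- 2 * (loglik[1] - loglik[2])

# Upper-tail chi-squared probability, assuming one degree of freedom
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Compare against the significance level implied by conf.level = 0.9
conf.level <- 0.9
prefer_missed_event_model <- p_value < 1 - conf.level
```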

Examples

data(goosedrop)
# calculate mean and standard deviation of arrival intervals, accounting for missed observations:
dr=estinterval(goosedrop$interval)
# print some summary information
summary(dr)
# plot a histogram of the intervals and fit:
plot(dr)
# test whether the mean arrival interval is greater than 200 seconds:
ttest(dr,mu=200,alternative="greater")

# let's estimate mean and variance of dropping intervals by site
# (schiermonnikoog vs terschelling) for time period 5.
# first prepare the two datasets:
set1=goosedrop[goosedrop$site=="schiermonnikoog" & goosedrop$period == 5,]
set2=goosedrop[goosedrop$site=="terschelling"  & goosedrop$period == 5,]
# allowing a fraction of intervals to be distributed randomly (fpp.method='auto')
dr1=estinterval(set1$interval,fpp.method='auto')
dr2=estinterval(set2$interval,fpp.method='auto')
# plot the fits:
plot(dr1,xlim=c(0,1000))
plot(dr2,xlim=c(0,1000))
# the mean dropping intervals are not significantly different
# at the two sites (on a 0.95 confidence level):
ttest(dr1,dr2)
# now compare this test with a t-test not accounting for unobserved intervals:
t.test(set1$interval,set2$interval)
# not accounting for missed observations leads to a (spurious)
# larger difference in means, which also increases
# the apparent statistical significance of the difference between means

intRvals documentation built on May 3, 2022, 1:07 a.m.