bandwidth: Calculate the optimal bandwidth matrix of movement data

bandwidth    R Documentation

Description

This function calculates the optimal bandwidth matrix (kernel covariance) for a two-dimensional animal tracking dataset, given an autocorrelated movement model (Fleming et al., 2015). This optimal bandwidth fully accounts for the autocorrelation in the data, provided that the movement model captures it.

Usage

bandwidth(data,CTMM,VMM=NULL,weights=FALSE,fast=NULL,dt=NULL,PC="Markov",error=0.01,
          precision=1/2,verbose=FALSE,trace=FALSE,dt.plot=TRUE,...)

Arguments

data

2D time-series telemetry data represented as a telemetry object.

CTMM

A ctmm movement-model object, such as the output of ctmm.fit.

VMM

An optional vertical ctmm object for 3D bandwidth calculation.

weights

By default, the weights are taken to be uniform, whereas weights=TRUE will optimize the weights.

fast

Use FFT algorithms for weight optimization. fast=NULL will attempt to intelligently decide between the fast and exact algorithms based on computational complexity.

dt

Optional lag bin width for the FFT algorithm.

PC

Preconditioner to use: can be "Markov", "circulant", "IID", or "direct".

error

Maximum grid error for FFT algorithm, if dt is not specified.

precision

Fraction of maximum possible digits of precision to target in weight optimization. precision=1/2 results in about 7 decimal digits of precision if the preconditioner is stable.

verbose

Optionally return the optimal weights, effective sample size DOF.H, and other information along with the bandwidth matrix H.

trace

Produce tracing information on the progress of weight optimization.

dt.plot

Execute a diagnostic dt.plot with a red line at dt, if weights=TRUE.

...

Additional arguments not currently used.

Details

The weights=TRUE argument can be used to correct temporal sampling bias caused by autocorrelation. weights=TRUE will optimize n=length(data$t) weights via constrained, preconditioned conjugate-gradient algorithms. These algorithms have a few options that should be considered if the data are very irregular.

fast=TRUE is an approximation that discretizes the data with timestep dt and applies FFT algorithms, for a computational cost as low as O(n log n) with only O(n) function evaluations. If no dt is specified, then dt is chosen automatically and reported in a message. If the data contain some very tiny time intervals, say 1 second among hourly sampled data, then the default dt setting can create an excessively high-resolution discretization of time, which will cause slowdown. In this case, CTMM should contain a location-error model and dt should be increased to a larger fraction of the most frequent sampling interval. If the data are irregular (permitting gaps), then dt may need to be several times smaller than the median sampling interval to avoid slowdown. In this case, try setting trace=TRUE and decreasing dt below the median until the iterations speed up and the number of feasibility assessments becomes less than O(n).
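As a rough, package-independent illustration of the tiny-interval situation described above, the following base-R sketch builds a hypothetical vector of sampling times (hourly fixes plus one fix taken 1 second after another) and compares the minimum and median sampling intervals. The times and the one-quarter fraction are invented for illustration and are not ctmm defaults.

```r
# Hypothetical sampling times in seconds: 24 hourly fixes plus one extra fix
# taken 1 second after the fifth fix (all values are made up).
t <- sort(c(seq(0, by = 3600, length.out = 24), 4 * 3600 + 1))

# Sampling intervals between consecutive fixes
intervals <- diff(t)

min_dt <- min(intervals)        # the tiny 1-second interval
median_dt <- median(intervals)  # the typical (hourly) interval

# A time grid at the minimum interval needs far more cells than a grid at a
# coarser, user-chosen fraction of the typical interval.
dt_coarse <- median_dt / 4      # illustrative choice, not a ctmm default
cells_fine <- diff(range(t)) / min_dt
cells_coarse <- diff(range(t)) / dt_coarse

print(c(min = min_dt, median = median_dt))
print(c(fine = cells_fine, coarse = cells_coarse))
```

A value chosen after this kind of inspection could then be supplied through the dt argument.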

fast=FALSE uses exact time spacing and has a computational cost as low as O(n^2), including O(n^2) function evaluations. With PC="direct" this method will produce a result that is exact to within machine precision, but with a computational cost of O(n^3). fast=FALSE,PC="direct" is often the fastest method with small datasets, where n ≤ O(1,000), but scales poorly with larger datasets.
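The options discussed above might be exercised as in the following sketch. It assumes the ctmm package is installed and uses the buffalo example data distributed with ctmm; the choice of individual and the particular argument combinations are illustrative only, and the names of the elements returned with verbose=TRUE are printed rather than assumed.

```r
# Runs only if the ctmm package is installed.
if (requireNamespace("ctmm", quietly = TRUE)) {
  library(ctmm)

  # Example tracking data shipped with ctmm
  data(buffalo)
  DATA <- buffalo$Cilla

  # Fit an autocorrelated movement model to pass as the CTMM argument
  GUESS <- ctmm.guess(DATA, interactive = FALSE)
  FIT <- ctmm.fit(DATA, GUESS)

  # Default call: uniform weights
  H <- bandwidth(DATA, FIT)

  # Optimized weights via the FFT approximation; verbose = TRUE returns
  # additional information alongside the bandwidth matrix.
  # dt.plot = FALSE skips the diagnostic plot.
  INFO <- bandwidth(DATA, FIT, weights = TRUE, fast = TRUE,
                    verbose = TRUE, dt.plot = FALSE)
  print(names(INFO))
}
```

Setting trace=TRUE in the second call would additionally report the progress of the weight optimization.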

Value

Returns a bandwidth matrix object, which is the optimal covariance matrix of the individual kernels in the kernel density estimate.

Note

To obtain a bandwidth scalar representing the variance of each kernel, a ctmm object with isotropic=TRUE is required. In this case, bandwidth will return a bandwidth matrix with identical variances along its diagonal. Note that forcing isotropic=TRUE will provide an inaccurate estimate for very eccentric distributions.

In v1.0.1 the default fast, dt, and PC arguments depend on the sample size, with fast=FALSE, PC="direct" for small sample sizes, fast=FALSE, PC="Markov" for moderate sample sizes, and fast=TRUE, PC="Markov" for large sample sizes, where dt is taken to be the integer fraction of the median sampling interval closest to the minimum sampling interval.
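One possible reading of the dt rule in the note above is sketched below in base R: dt = median/k for a positive integer k, with k chosen so that dt lies as close as possible to the minimum sampling interval. The helper default_dt_sketch() is hypothetical and is not part of ctmm; the package's actual implementation may differ in its details.

```r
# Hypothetical helper (not a ctmm function): returns median/k for the
# positive integer k that brings median/k closest to the minimum interval.
default_dt_sketch <- function(t) {
  intervals <- diff(sort(t))
  intervals <- intervals[intervals > 0]  # ignore repeated timestamps
  med <- median(intervals)
  mn <- min(intervals)
  # median/k decreases with k, so only the two integers bracketing med/mn
  # can give the fraction closest to the minimum interval.
  k <- unique(pmax(1, c(floor(med / mn), ceiling(med / mn))))
  candidates <- med / k
  candidates[which.min(abs(candidates - mn))]
}

# Made-up times in seconds, with intervals 3600, 3600, 1000, 3600, 3600
t <- cumsum(c(0, 3600, 3600, 1000, 3600, 3600))
print(default_dt_sketch(t))
```

For these made-up times the median interval is 3600 and the minimum is 1000, so the candidates are 3600/3 = 1200 and 3600/4 = 900, and the sketch returns 900.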

In v0.6.2 the default dt was increased from the minimum time difference to a small quantile no less than error times the median.

Author(s)

C. H. Fleming.

References

T. F. Chan, “An Optimal Circulant Preconditioner for Toeplitz Systems”, SIAM Journal on Scientific and Statistical Computing, 9:4, 766-771 (1988) doi: 10.1137/0909051.

D. Marcotte, “Fast variogram computation with FFT”, Computers and Geosciences 22:10, 1175-1186 (1996) doi: 10.1016/S0098-3004(96)00026-X.

C. H. Fleming, W. F. Fagan, T. Mueller, K. A. Olson, P. Leimgruber, J. M. Calabrese, “Rigorous home-range estimation with movement data: A new autocorrelated kernel-density estimator”, Ecology, 96:5, 1182-1188 (2015) doi: 10.1890/14-2010.1.

C. H. Fleming, D. Sheldon, W. F. Fagan, P. Leimgruber, T. Mueller, D. Nandintsetseg, M. J. Noonan, K. A. Olson, E. Setyawan, A. Sianipar, J. M. Calabrese, “Correcting for missing and irregular data in home-range estimation”, Ecological Applications, 28:4, 1003-1010 (2018) doi: 10.1002/eap.1704.

See Also

akde, ctmm.fit


ctmm documentation built on Nov. 4, 2022, 5:06 p.m.