# bivariate.density: Bivariate kernel density/intensity estimation In tilmandavies/sparr: Spatial and Spatiotemporal Relative Risk

 bivariate.density R Documentation

## Bivariate kernel density/intensity estimation

### Description

Provides an isotropic adaptive or fixed bandwidth kernel density/intensity estimate of bivariate/planar/2D data.

### Usage

bivariate.density(
pp,
h0,
hp = NULL,
resolution = 128,
gamma.scale = "geometric",
edge = c("uniform", "diggle", "none"),
weights = NULL,
intensity = FALSE,
trim = 5,
xy = NULL,
pilot.density = NULL,
leaveoneout = FALSE,
parallelise = NULL,
verbose = TRUE
)

### Details

Given a data set x_1,\dots,x_n in 2D, the isotropic kernel estimate of its probability density function, \hat{f}(x), is given by

\hat{f}(y)=n^{-1}\sum_{i=1}^{n}h(x_i)^{-2}K((y-x_i)/h(x_i))

where h(x) is the bandwidth function, and K(.) is the bivariate standard normal smoothing kernel. Edge-correction factors (not shown above) are also implemented.

Fixed

The classic fixed bandwidth kernel estimator is used when adapt = FALSE. This amounts to setting h(u)=h0 for all u. Further details can be found in the documentation for density.ppp.

Setting adapt = TRUE requests computation of Abramson's (1982) variable-bandwidth estimator. Under this framework, we have h(u)=h0*min[\tilde{f}(u)^{-1/2},G*trim]/\gamma, where \tilde{f}(u) is a fixed-bandwidth kernel density estimate computed using the pilot bandwidth hp.

• Global smoothing of the variable bandwidths is controlled with the global bandwidth h0.

• In the above statement, G is the geometric mean of the “bandwidth factors” \tilde{f}(x_i)^{-1/2}; i=1,\dots,n. By default, the variable bandwidths are rescaled by \gamma=G, which is set with gamma.scale = "geometric". This allows h0 to be considered on the same scale as the smoothing parameter in a fixed-bandwidth estimate i.e. on the scale of the recorded data. You can use any other rescaling of h0 by setting gamma.scale to be any scalar positive numeric value; though note this only affects \gamma – see the next bullet. When using a scale-invariant h0, set gamma.scale = 1.

• The variable bandwidths must be trimmed to prevent excessive values (Hall and Marron, 1988). This is achieved through trim, as can be seen in the equation for h(u) above. The trimming of the variable bandwidths is universally enforced by the geometric mean of the bandwidth factors G independent of the choice of \gamma. By default, the function truncates bandwidth factors at five times their geometric mean. For stricter trimming, reduce trim, for no trimming, set trim = Inf.

• For even moderately sized data sets and evaluation grid resolution, adaptive kernel estimation can be rather computationally expensive. The argument davies.baddeley is used to approximate an adaptive kernel estimate by a sum of fixed bandwidth estimates operating on appropriate subsets of pp. These subsets are defined by “bandwidth bins”, which themselves are delineated by a quantile step value 0<\delta<1. E.g. setting \delta=0.05 will create 20 bandwidth bins based on the 0.05th quantiles of the Abramson variable bandwidths. Adaptive edge-correction also utilises the partitioning, with pixel-wise bandwidth bins defined using the value 0<\beta<1, and the option to decrease the resolution of the edge-correction surface for computation to a [L \times L] grid, where 0 <L \le resolution. If davies.baddeley is supplied as a vector of length 3, then the values [1], [2], and [3] correspond to the parameters \delta, \beta, and L_M=L_N in Davies and Baddeley (2018). If the argument is simply a single numeric value, it is used for both \delta and \beta, with L_M=L_N=resolution (i.e. no edge-correction surface coarsening).

• Computation of leave-one-out values (when leaveoneout = TRUE) is done by brute force, and is therefore very computationally expensive for adaptive smoothing. This is because the leave-one-out mechanism is applied to both the pilot estimation and the final estimation stages. Experimental code to do this via parallel processing using the foreach routine is implemented. Fixed-bandwidth leave-one-out can be performed directly in density.ppp.

### Value

If leaveoneout = FALSE, an object of class "bivden". This is effectively a list with the following components:

 z The resulting density/intensity estimate, a pixel image object of class im. h0 A copy of the value of h0 used. hp A copy of the value of hp used. h A numeric vector of length equal to the number of data points, giving the bandwidth used for the corresponding observation in pp. him A pixel image (class im), giving the ‘hypothetical’ Abramson bandwidth at each pixel coordinate conditional upon the observed data. NULL for fixed-bandwidth estimates. q Edge-correction weights; a pixel image if edge = "uniform", a numeric vector if edge = "diggle", and NULL if edge = "none". gamma The value of \gamma used in scaling the bandwidths. NA if a fixed bandwidth estimate is computed. geometric The geometric mean G of the untrimmed bandwidth factors \tilde{f}(x_i)^{-1/2}. NA if a fixed bandwidth estimate is computed. pp A copy of the ppp.object initially passed to the pp argument, containing the data that were smoothed.

Else, if leaveoneout = TRUE, simply a numeric vector of length equal to the number of data points, giving the leave-one-out value of the function at the corresponding coordinate.

### Author(s)

T.M. Davies and J.C. Marshall

### References

Abramson, I. (1982). On bandwidth variation in kernel estimates — a square root law, Annals of Statistics, 10(4), 1217-1223.

Davies, T.M. and Baddeley A. (2018), Fast computation of spatially adaptive kernel estimates, Statistics and Computing, 28(4), 937-956.

Davies, T.M. and Hazelton, M.L. (2010), Adaptive kernel estimation of spatial relative risk, Statistics in Medicine, 29(23) 2423-2437.

Davies, T.M., Jones, K. and Hazelton, M.L. (2016), Symmetric adaptive smoothing regimens for estimation of the spatial relative risk function, Computational Statistics & Data Analysis, 101, 12-28.

Diggle, P.J. (1985), A kernel method for smoothing point process data, Journal of the Royal Statistical Society, Series C, 34(2), 138-147.

Hall P. and Marron J.S. (1988) Variable window width kernel density estimates of probability densities. Probability Theory and Related Fields, 80, 37-49.

Marshall, J.C. and Hazelton, M.L. (2010) Boundary kernels for adaptive density estimators on regions with irregular boundaries, Journal of Multivariate Analysis, 101, 949-963.

Silverman, B.W. (1986), Density Estimation for Statistics and Data Analysis, Chapman & Hall, New York.

Wand, M.P. and Jones, C.M., 1995. Kernel Smoothing, Chapman & Hall, London.

### Examples

data(chorley) # Chorley-Ribble data from package 'spatstat'

# Fixed bandwidth kernel density; uniform edge correction
chden1 <- bivariate.density(chorley,h0=1.5)

# Fixed bandwidth kernel density; diggle edge correction; coarser resolution
chden2 <- bivariate.density(chorley,h0=1.5,edge="diggle",resolution=64)

# Adaptive smoothing; uniform edge correction