kernelDeviance: Kernel Density Deviance


View source: R/distanceFunctions.R

Description

Calculates the Bayesian deviance (-2 * log-likelihood) under the same kernel density model used by kernelDist(), for each of a range of bandwidths. Can be used to estimate the optimal (maximum-likelihood) bandwidth to use in the kernelDist() function (see Examples). Data are subset prior to calculating distances (see Details).

Usage

kernelDeviance(dfv, column.nums = 1:ncol(dfv), subset = 1:nrow(dfv),
  bandwidth = seq(0.1, 1, 0.1), S = NULL, reportProgress = FALSE)

Arguments

dfv

a data frame containing observations in rows and statistics in columns.

column.nums

indexes the columns of the data frame that will be used to calculate kernel log-likelihood (all other columns are ignored).

subset

indexes the rows of the data frame that will be used to calculate the covariance matrix (unless S is specified manually).

bandwidth

a vector containing the range of bandwidths to be explored.

S

the covariance matrix that the bandwidth is multiplied by. Leave as NULL to use the ordinary covariance matrix calculated using cov(dfv[subset,column.nums]).

reportProgress

whether to report current progress of the algorithm to the console (TRUE/FALSE).
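As a small illustration of the default for S described above, the following base R sketch computes the ordinary covariance matrix from a chosen set of rows and columns. The data frame and the names rows and cols are hypothetical stand-ins for dfv, subset and column.nums.

```r
# Toy data frame: observations in rows, statistics in columns (made-up values).
dfv <- data.frame(a = c(1, 2, 4, 7), b = c(2, 1, 5, 6), c = c(0, 3, 3, 9))

cols <- 1:2  # plays the role of column.nums: only columns a and b are used
rows <- 1:3  # plays the role of subset: covariance estimated from rows 1 to 3

# Ordinary covariance matrix of the selected rows and columns
S <- cov(dfv[rows, cols])
S
```

Passing a different matrix as S replaces this default; it should be square, with one row and column per selected statistic.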

Details

Uses the same input and model structure as kernelDist(). The log-likelihood is calculated by the leave-one-out method: the likelihood of point i is its kernel density evaluated over every point j in the chosen subset with j != i. This avoids the infinite likelihood that an ordinary kernel density model would give at zero bandwidth, where each point sits exactly on its own kernel.
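The leave-one-out calculation can be sketched in base R as follows. This is an illustrative reimplementation, not the package source: the function name looKernelDeviance is hypothetical, and it assumes a multivariate Gaussian kernel whose covariance is bandwidth^2 * S, which may differ from the scaling kernelDist() actually uses.

```r
# Leave-one-out Gaussian kernel deviance (illustrative sketch only).
# Assumption: each kernel is multivariate normal with covariance bandwidth^2 * S.
looKernelDeviance <- function(x, bandwidth, S = cov(x)) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- ncol(x)
  S_inv <- solve(S)
  log_det_S <- as.numeric(determinant(S, logarithm = TRUE)$modulus)

  # Squared Mahalanobis distance between every pair of rows (n x n matrix)
  xs <- x %*% S_inv
  q <- rowSums(xs * x)
  m2 <- outer(q, q, "+") - 2 * xs %*% t(x)
  m2[m2 < 0] <- 0  # guard against tiny negative values from rounding

  sapply(bandwidth, function(h) {
    # Log kernel density contributed by point j (column) at point i (row)
    log_k <- -0.5 * m2 / h^2 -
      0.5 * (d * log(2 * pi) + 2 * d * log(h) + log_det_S)
    diag(log_k) <- -Inf  # leave-one-out: drop the j == i term

    # Log of the mean density over the other n - 1 points, via log-sum-exp
    row_max <- apply(log_k, 1, max)
    log_lik <- row_max + log(rowSums(exp(log_k - row_max))) - log(n - 1)

    -2 * sum(log_lik)  # deviance at this bandwidth
  })
}

# Usage on simulated data: deviance over a grid of bandwidths
set.seed(1)
obs <- data.frame(x = rnorm(100), y = rnorm(100))
lambda <- seq(0.1, 2, 0.1)
dev <- looKernelDeviance(obs, lambda)
lambda[which.min(dev)]  # minimum-deviance bandwidth under these assumptions
```

Setting the diagonal to -Inf is what removes each point's own kernel. Without that step, the j == i term would grow without bound as the bandwidth shrinks and would dominate the likelihood.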

Author(s)

Robert Verity r.verity@imperial.ac.uk

Examples

## Not run: 
# create a data frame of observations
df <- data.frame(x=rnorm(100),y=rnorm(100))

# create a vector of bandwidths to explore
lambda <- seq(0.1,2,0.1)

# obtain deviance at each of these bandwidths
deviance <- kernelDeviance(df,bandwidth=lambda,reportProgress=TRUE)

# find the maximum-likelihood (minimum-deviance) bandwidth
lambda_ML <- lambda[which.min(deviance)]

# use this value when calculating kernel density distances
distances <- kernelDist(df,bandwidth=lambda_ML)

## End(Not run)

NESCent/MINOTAUR documentation built on May 7, 2019, 6:01 p.m.