bw.CV: Bandwidth Selectors for Kernel Density Estimation

Description

Finds the optimal bandwidth by minimising the density cross-validation (DCV) or least-squares cross-validation (LSCV) criterion. Note that because the CV function is usually highly non-linear, the return value should be taken with a grain of salt. With non-smooth kernels (such as the uniform), the optimiser will often return a local minimum near a reasonable starting value. The user may want to standardise the input matrix x by column (divide by some estimator of scale, such as sd or IQR) and examine the behaviour of the CV criterion as a function of a single common bandwidth (the same argument). If the optimum appears to be unique, they can then multiply that bandwidth by the scale measures and use the result as the starting point for the search over the full vector of bandwidths.
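A minimal sketch of this workflow (the objects below are illustrative; any column-wise scale estimator such as IQR can replace sd):

X  <- matrix(rnorm(600), ncol = 3)
s  <- apply(X, 2, sd)                       # column-wise scale estimates
Xs <- sweep(X, 2, s, "/")                   # standardised input
b0 <- bw.CV(Xs, same = TRUE)                # one common bandwidth; check uniqueness here
bw.CV(X, start.bw = rep(b0, ncol(X)) * s)   # multi-dimensional search from a scaled start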

Usage

bw.CV(
  x,
  y = NULL,
  weights = NULL,
  kernel = "gaussian",
  order = 2,
  PIT = FALSE,
  chunks = 0,
  robust.iterations = 0,
  degree = 0,
  start.bw = NULL,
  same = FALSE,
  tol = 1e-04,
  try.grid = TRUE,
  ndeps = 1e-05,
  verbose = FALSE,
  attach.attributes = FALSE,
  control = list()
)

Arguments

x

A numeric vector, matrix, or data frame containing observations. For density, the points used to compute the density. For kernel regression, the points corresponding to explanatory variables.

y

A numeric vector of responses (dependent variable) if the user wants least-squares cross-validation.

weights

A numeric vector of observation weights (typically counts) for weighted operations. If NULL, rep(1, NROW(x)) is used. In all calculations, the total number of observations is taken to be the sum of the weights.

kernel

Character describing the desired kernel type. NB: due to limited machine precision, even Gaussian has finite support.

order

An integer: 2, 4, or 6. Order-2 kernels are the standard kernels that are positive everywhere. Orders 4 and 6 produce some negative values, which reduces bias but may hamper density estimation.

PIT

If TRUE, the Probability Integral Transform (PIT) is applied to all columns of x via ecdf in order to map all values into the [0, 1] range (see the sketch after this argument list). May also be an integer vector of indices of the columns to which the PIT should be applied.

chunks

Integer: the number of chunks to split the task into (limits RAM usage but increases overhead). 0 = auto-select (making sure that no matrix has more than 2^27 elements).

robust.iterations

Passed to kernelSmooth if y is not NULL (for least-squares CV).

degree

Passed to kernelSmooth if y is not NULL (for least-squares CV).

start.bw

Numeric vector: initial value for bandwidth search.

same

Logical: use the same bandwidth for all columns of x?

tol

Relative tolerance used by the optimiser as the stopping criterion.

try.grid

Logical: if TRUE, 10 bandwidths around the rule-of-thumb value are tried first, using the multipliers 1.2^(-3:6).

ndeps

Numerical-difference step size (epsilon). It also puts a lower bound on the result: the estimated optimal bandwidth cannot be less than this value.

verbose

Logical: print out the optimiser return code for diagnostics?

attach.attributes

Logical: if TRUE, attaches the output of optim() to the result as an attribute for diagnostics.

control

List: extra arguments to pass to the control-argument list of 'optim'.
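
As a small illustration of the PIT option mentioned above (illustrative objects only): applying ecdf to a column maps its values into the unit interval.

x1 <- rexp(100)         # a skewed variable
u1 <- ecdf(x1)(x1)      # the PIT: empirical CDF evaluated at the data
range(u1)               # all values lie within [0, 1]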

Details

If y is NULL and only x is supplied, the density-cross-validated (DCV) bandwidth is returned. If y is supplied, the least-squares-cross-validated (LSCV) bandwidth is returned.
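A hedged sketch of the two modes (simulated data, purely illustrative):

x <- rnorm(300)
y <- sin(x) + rnorm(300)
bw.CV(x)       # DCV: only x is supplied, density cross-validation
bw.CV(x, y)    # LSCV: y is supplied, least-squares cross-validation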

Value

Numeric vector or scalar of the optimal bandwidth.

Examples

set.seed(1)  # Creating a data set with many duplicates
n.uniq <- 200
n <- 500
inds <- sort(ceiling(runif(n, 0, n.uniq)))
x.uniq <- sort(rnorm(n.uniq))
y.uniq <- 1 + 0.1*x.uniq + sin(x.uniq) + rnorm(n.uniq)
x <- x.uniq[inds]
y <- y.uniq[inds]
w <- 1 + runif(n, 0, 2) # Relative importance
data.table::setDTthreads(1) # For measuring the pure gains and overhead
RcppParallel::setThreadOptions(numThreads = 1)
bw.grid <- seq(0.1, 1.3, 0.2)
CV <- LSCV(x, y, bw.grid, weights = w)
bw.init <- bw.grid[which.min(CV)]
bw.opt <- bw.CV(x, y, w) # 0.49, very close
g <- seq(-3.5, 3.5, 0.05)
yhat <- kernelSmooth(x, y, g, w, bw.opt, deduplicate.xout = FALSE)
oldpar <- par(mfrow = c(2, 1), mar = c(2, 2, 2, 0)+.1)
plot(bw.grid, CV, bty = "n", xlab = "", ylab = "", main = "Cross-validation")
points(bw.opt, LSCV(x, y, bw.opt, w), col = 2, pch = 15)
plot(x.uniq, y.uniq, bty = "n", xlab = "", ylab = "", main = "Optimal fit")
points(g, yhat, pch = 16, col = 2, cex = 0.5)
par(oldpar)
