DCV: Density cross-validation

DCV    R Documentation

Density cross-validation

Description

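Computes the density cross-validation criterion, which is based on the leave-one-out kernel density estimator, for each candidate bandwidth.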

Usage

DCV(
  x,
  bw,
  weights = NULL,
  same = FALSE,
  kernel = "gaussian",
  order = 2,
  PIT = FALSE,
  chunks = 0,
  no.dedup = FALSE
)

Arguments

x

A numeric vector, matrix, or data frame containing observations. For density, the points used to compute the density. For kernel regression, the points corresponding to explanatory variables.

bw

Candidate bandwidth values: scalar, vector, or a matrix (with columns corresponding to columns of x).
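For multivariate x, a grid of candidate bandwidths can be laid out as a matrix with one row per candidate and one column per column of x. The sketch below uses base R only; the variable names (bw.grid, bw1, bw2) are illustrative, and the commented-out DCV call is shown only to indicate where the matrix would be passed.

```r
# Two-column data: each row of the bandwidth matrix is one candidate pair
set.seed(1)
x <- cbind(rnorm(50), rexp(50))

# All combinations of 3 bandwidths for column 1 and 2 bandwidths for column 2
bw.grid <- as.matrix(expand.grid(bw1 = c(0.2, 0.4, 0.8), bw2 = c(0.1, 0.3)))

ncol(bw.grid) == ncol(x)  # columns of bw match columns of x
nrow(bw.grid)             # number of candidates, one CV value expected per row

# cv <- DCV(x, bw.grid)   # would return a vector of length nrow(bw.grid)
```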

weights

A numeric vector of observation weights (typically counts) to perform weighted operations. If NULL, rep(1, NROW(x)) is used. In all calculations, the total number of observations is assumed to be the sum of weights.
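The equivalence between count weights and repeated observations can be checked with a small base-R sketch. The kde helper below is a hand-written Gaussian estimator for illustration only; it is not the package's implementation, and its name and arguments are assumptions.

```r
# Repeated observations collapsed into unique values and their counts
x <- c(1.5, 2.0, 2.0, 3.1, 2.0)
x.uniq <- sort(unique(x))
w <- tabulate(match(x, x.uniq))  # counts of each unique value

# Illustrative weighted Gaussian kernel density estimator:
# the effective sample size is sum(wts), as described above
kde <- function(grid, obs, wts, h)
  vapply(grid, function(g) sum(wts * dnorm((g - obs) / h)) / (sum(wts) * h),
         numeric(1))

grid <- seq(0, 5, 0.5)
f.full <- kde(grid, x, rep(1, length(x)), h = 0.4)  # raw data, unit weights
f.wtd  <- kde(grid, x.uniq, w, h = 0.4)             # unique values, count weights

sum(w) == length(x)       # total number of observations is preserved
all.equal(f.full, f.wtd)  # both estimates agree
```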

same

Logical: use the same bandwidth for all columns of x?

Note: since DCV requires computing the leave-one-out estimator, repeated observations are combined first; the de-duplication is therefore forced in cross-validation. The only situation where de-duplication can be skipped is passing de-duplicated data sets from outside (e.g. inside optimisers).

kernel

Character describing the desired kernel type. NB: due to limited machine precision, even Gaussian has finite support.
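The remark on finite support can be seen directly in base R: far enough in the tails, the Gaussian density underflows to exactly zero in double precision. This is a property of floating-point arithmetic, shown here with dnorm; how the package truncates its own kernels is not covered by this sketch.

```r
# In double precision, the Gaussian density is exactly zero far in the tails
dnorm(30) > 0   # still representable
dnorm(40) == 0  # underflows to zero

# Consequence: observations further than 40 bandwidths from a point
# contribute nothing to a Gaussian kernel sum evaluated at that point
h <- 0.1
x0 <- 0
xi <- 5                      # (xi - x0) / h = 50
dnorm((xi - x0) / h) == 0
```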

order

An integer: 2, 4, or 6. Order-2 kernels are the standard kernels that are positive everywhere. Orders 4 and 6 produce some negative values, which reduces bias but may hamper density estimation.
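To illustrate why higher-order kernels take negative values, the sketch below uses a textbook fourth-order kernel built from the Gaussian, K4(u) = 0.5 * (3 - u^2) * dnorm(u). Whether DCV constructs its order-4 kernels in exactly this way is an assumption; the function names k2 and k4 are illustrative.

```r
# Second-order (standard) and a textbook fourth-order Gaussian-based kernel
k2 <- function(u) dnorm(u)
k4 <- function(u) 0.5 * (3 - u^2) * dnorm(u)

# Both integrate to one
int.k2 <- integrate(k2, -Inf, Inf)$value
int.k4 <- integrate(k4, -Inf, Inf)$value

# The second moment is 1 for k2 but 0 for k4, which is what lowers the bias
m2.k2 <- integrate(function(u) u^2 * k2(u), -Inf, Inf)$value
m2.k4 <- integrate(function(u) u^2 * k4(u), -Inf, Inf)$value

# The price: k4 is negative wherever u^2 > 3
k4(2) < 0
```

Because such a kernel dips below zero, the resulting density estimate can itself be negative in sparse regions of the data.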

PIT

If TRUE, the Probability Integral Transform (PIT) is applied to all columns of x via ecdf in order to map all values into the [0, 1] range. Alternatively, an integer vector of indices of the columns to which the PIT should be applied.
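The transform itself can be reproduced with base R's ecdf. The sketch below is a minimal illustration under the assumption that the PIT amounts to evaluating each column's empirical CDF at its own values; any further rescaling the package may apply is not shown. The helper name pit and the objects X, cols, X.pit are illustrative.

```r
set.seed(1)
X <- cbind(a = rlnorm(100), b = rnorm(100))

# Empirical-CDF transform of a vector: each value is mapped to rank / n
pit <- function(v) ecdf(v)(v)

u <- pit(X[, "a"])
range(u)  # smallest value is 1/n, largest is 1

# Transforming only selected columns, mirroring an integer-vector PIT argument
cols <- 1
X.pit <- X
X.pit[, cols] <- apply(X[, cols, drop = FALSE], 2, pit)
```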

chunks

Integer: the number of chunks to split the task into (limits RAM usage but increases overhead). 0 = auto-select (making sure that no matrix has more than 2^27 elements).
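For a sense of scale, a matrix of doubles with 2^27 elements occupies 2^30 bytes (1 GiB). The arithmetic below sketches how a chunk count could be derived under the assumption that the largest intermediate object is an n-by-n matrix that is split by rows; the rule DCV actually applies when chunks = 0 is not described here and may differ.

```r
n <- 50000         # number of observations (illustrative)
max.elem <- 2^27   # cap on the number of matrix elements

# Memory of one maximal block of doubles, in GiB
max.elem * 8 / 2^30

# Number of row blocks needed so that each block has at most max.elem elements
n.chunks <- ceiling(n * n / max.elem)
rows.per.chunk <- ceiling(n / n.chunks)

# Row indices for each block
idx <- split(seq_len(n), ceiling(seq_len(n) / rows.per.chunk))
length(idx)                        # number of blocks
max(lengths(idx)) * n <= max.elem  # every block respects the cap
```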

no.dedup

Logical: if TRUE, sets deduplicate.x and deduplicate.xout to FALSE (shorthand).

Value

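A numeric vector of cross-validation criterion values, one per candidate bandwidth: its length is length(bw) if bw is a vector, or nrow(bw) if bw is a matrix.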

Examples

set.seed(1)
x <- rlnorm(100); x <- c(x[1], x)  # x with 1 duplicate
bws <- exp(seq(-3, 0.5, 0.1))
plot(bws, DCV(x, bws), log = "x", bty = "n", main = "Density CV")
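For readers without the package at hand, the idea of leave-one-out density cross-validation can be sketched in base R. The function below implements the textbook least-squares CV criterion for a Gaussian kernel with unit weights and no duplicates; it is an assumption that this matches the criterion DCV computes, and lscv.gauss is a hypothetical helper, not part of smoothemplik.

```r
# Textbook least-squares CV for a univariate Gaussian KDE:
# LSCV(h) = integral of fhat^2  -  2 * mean of leave-one-out fhat at the data
lscv.gauss <- function(x, h) {
  n <- length(x)
  d <- outer(x, x, "-") / h
  # Integral of fhat^2: the Gaussian kernel convolved with itself is N(0, 2)
  int.f2 <- sum(dnorm(d, sd = sqrt(2))) / (n^2 * h)
  # Leave-one-out estimates: drop each point's own contribution
  K <- dnorm(d)
  diag(K) <- 0
  loo <- rowSums(K) / ((n - 1) * h)
  int.f2 - 2 * mean(loo)
}

set.seed(1)
x <- rlnorm(100)                 # no duplicates in this sketch
bws <- exp(seq(-3, 0.5, 0.1))
cv <- vapply(bws, function(h) lscv.gauss(x, h), numeric(1))
bw.opt <- bws[which.min(cv)]     # bandwidth minimising the criterion
```

The sketch stores full n-by-n matrices, so it is only suitable for small samples; the chunks argument described above exists to keep such intermediate objects within memory.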

smoothemplik documentation built on Aug. 22, 2025, 1:11 a.m.