bw_cv_polysph: Cross-validation bandwidth selection for polyspherical kernel...

bw_cv_polysph    R Documentation

Cross-validation bandwidth selection for polyspherical kernel density estimator

Description

Likelihood Cross-Validation (LCV) and Least Squares Cross-Validation (LSCV) bandwidth selection for the polyspherical kernel density estimator.
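
To make the LCV criterion concrete, the following base-R sketch selects a bandwidth on a single circle by maximizing the leave-one-out log-likelihood. It is only an illustration under stated assumptions: the kernel is taken to be a von Mises–Fisher density with concentration 1 / h^2, and the names lcv_loss and h_lcv are hypothetical. It does not reproduce the internals of bw_cv_polysph.

```r
# Sketch of likelihood cross-validation (LCV) on the circle S^1, base R only.
# Assumption: von Mises-Fisher kernel with concentration kappa = 1 / h^2.
set.seed(1)
n <- 50
theta <- rnorm(n, mean = 0, sd = 0.5)   # angles concentrated around 0
X <- cbind(cos(theta), sin(theta))      # n x 2 matrix of unit vectors

# Negative leave-one-out log-likelihood as a function of the bandwidth h
lcv_loss <- function(h) {
  kappa <- 1 / h^2
  # Log-normalizing constant of the kernel on S^1; the exponentially scaled
  # Bessel function avoids overflow when h is small
  log_norm <- -log(2 * pi) -
    log(besselI(kappa, nu = 0, expon.scaled = TRUE)) - kappa
  K <- exp(kappa * tcrossprod(X) + log_norm)  # kernel at all pairs (X_i, X_j)
  diag(K) <- 0                                # leave the i-th point out
  -sum(log(rowSums(K) / (n - 1)))
}

# One-dimensional search for the bandwidth minimizing the loss
h_lcv <- optimize(lcv_loss, interval = c(0.05, 2))$minimum
h_lcv
```

The LSCV criterion replaces the log-likelihood with an estimate of the integrated squared error, which additionally requires the integral of the squared density estimate.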

Usage

bw_cv_polysph(X, d, kernel = 1, kernel_type = 1, k = 10,
  intrinsic = FALSE, type = c("LCV", "LSCV")[1], M = 10000, bw0 = NULL,
  na.rm = FALSE, h_min = 0, upscale = FALSE, deriv = 0,
  imp_mc = TRUE, seed_mc = NULL, exact_vmf = FALSE, common_h = FALSE,
  spline = FALSE, opt = c("optim", "nlm")[1], ncores = 1, ...)

Arguments

X

a matrix of size c(n, sum(d) + r) with the sample.

d

vector of size r with the dimensions of the spheres.

kernel

kernel employed: 1 for von Mises–Fisher (default); 2 for Epanechnikov; 3 for softplus.

kernel_type

type of kernel employed: 1 for product kernel (default); 2 for spherically symmetric kernel.

k

softplus kernel parameter. Defaults to 10.0.

intrinsic

use the intrinsic distance, instead of the extrinsic-chordal distance, in the kernel? Defaults to FALSE.

type

cross-validation type, either "LCV" (default) or "LSCV".

M

number of Monte Carlo samples used to approximate the integral in the LSCV loss. Defaults to 10000.

bw0

initial bandwidth vector for minimizing the CV loss. If NULL, it is computed internally by magnifying the bw_rot_polysph bandwidths by 50%. Can also be a matrix of initial bandwidth vectors, one per row.

na.rm

remove NAs in the objective function? Defaults to FALSE.

h_min

minimum h enforced (componentwise). Defaults to 0.

upscale

rescale the resulting bandwidths to work for derivative estimation? Defaults to FALSE.

deriv

derivative order to perform the upscaling. Defaults to 0.

imp_mc

use importance sampling in the Monte Carlo approximation of the integral in the LSCV loss? It is more accurate but also more time consuming. Defaults to TRUE.

seed_mc

seed for the Monte Carlo simulations used to estimate the integral in the LSCV loss. Defaults to NULL (no seed is fixed for different bandwidths).

exact_vmf

use the closed-form expression for the LSCV loss with the von Mises–Fisher kernel? Defaults to FALSE.

common_h

use the same bandwidth for all dimensions? Defaults to FALSE.

spline

use a faster spline approximation to compute Bessel functions? Defaults to FALSE.

opt

optimizer to use; either "optim" (default) or "nlm".

ncores

number of cores used during the optimization. Defaults to 1.

...

further arguments passed to optim or nlm (if ncores = 1) or optimParallel (if ncores > 1).
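
As an illustration of how X and d relate, the base-R sketch below builds a sample on the product of a circle and a sphere (d = c(1, 2)) and checks its shape. It assumes that each sphere of dimension d[j] is stored as d[j] + 1 Cartesian coordinates in consecutive columns, which is consistent with the stated size c(n, sum(d) + r); the helper names blocks, starts, and ends are hypothetical.

```r
# Layout of a sample on S^1 x S^2 (r = 2, d = c(1, 2)), base R only
set.seed(1)
n <- 5
d <- c(1, 2)
r <- length(d)

# One block per sphere: d[j] + 1 columns, rows normalized to unit length
blocks <- lapply(d, function(dj) {
  Z <- matrix(rnorm(n * (dj + 1)), nrow = n)
  Z / sqrt(rowSums(Z^2))
})
X <- do.call(cbind, blocks)

dim(X)  # n rows and sum(d) + r columns

# Column ranges of each sphere within X
ends <- cumsum(d + 1)
starts <- ends - d

# Squared norms per block; every entry should equal 1 up to rounding
sapply(seq_len(r), function(j) {
  rowSums(X[, starts[j]:ends[j], drop = FALSE]^2)
})
```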

Details

If bw0 is a matrix, the CV loss is evaluated at each of its rows and the optimization is started at the most promising row, i.e., the row of bandwidths with the smallest CV loss.
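
The selection of the starting row can be sketched in base R as follows. The objective cv_loss is a hypothetical stand-in for the LCV or LSCV loss, and the optimization over log-bandwidths is a choice of this sketch to keep the bandwidths positive, not a description of what bw_cv_polysph does internally.

```r
# Toy illustration of choosing the starting row of bw0, base R only.
# cv_loss is a hypothetical stand-in for the LCV/LSCV objective; its
# minimizer is c(0.3, 0.6) by construction.
cv_loss <- function(h) sum((log(h) - log(c(0.3, 0.6)))^2)

# Candidate starting bandwidths, one per row
bw0 <- rbind(c(0.10, 0.10),
             c(0.25, 0.50),
             c(1.00, 1.00))

losses <- apply(bw0, 1, cv_loss)    # loss at each candidate row
start <- bw0[which.min(losses), ]   # most promising row

# Optimize over log-bandwidths so that the bandwidths stay positive
opt <- optim(par = log(start), fn = function(lh) cv_loss(exp(lh)))
exp(opt$par)
```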

Value

A list with entries bw (optimal bandwidth) and opt, the latter containing the output of nlm, optim, or optimParallel.

Examples

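library(polykde)

# Small von Mises–Fisher sample on the product of a circle and a sphere
n <- 20
d <- 1:2
kappa <- rep(10, 2)
X <- r_vmf_polysph(n = n, d = d, mu = r_unif_polysph(n = 1, d = d),
                   kappa = kappa)

# LCV bandwidths
bw_cv_polysph(X = X, d = d, type = "LCV")$bw

# LSCV bandwidths using the closed-form loss for the von Mises–Fisher kernel
bw_cv_polysph(X = X, d = d, type = "LSCV", exact_vmf = TRUE)$bw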
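
A further sketch, supplying a matrix of starting bandwidths through bw0 and inspecting both entries of the returned list. It assumes the polykde package is installed and uses only the functions and arguments shown on this page; the starting values in bw0 are arbitrary and chosen purely for illustration.

```r
# Assumes the polykde package is installed; skipped otherwise
if (requireNamespace("polykde", quietly = TRUE)) {
  library(polykde)
  set.seed(42)
  d <- 1:2
  X <- r_vmf_polysph(n = 20, d = d, mu = r_unif_polysph(n = 1, d = d),
                     kappa = rep(10, 2))

  # Illustrative starting bandwidths, one candidate per row (arbitrary values)
  bw0 <- rbind(c(0.2, 0.2),
               c(0.5, 0.5))
  fit <- bw_cv_polysph(X = X, d = d, type = "LCV", bw0 = bw0)

  print(fit$bw)  # selected bandwidths
  str(fit$opt)   # raw optimizer output
}
```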

polykde documentation built on April 16, 2025, 1:11 a.m.