In ElliptCopulas: Inference of Elliptical Distributions and Copulas

EllDistrEst.adapt

R Documentation

Estimation of the generator of the elliptical distribution by kernel smoothing with adaptive choice of the bandwidth

Description

A continuous elliptical distribution has a density of the form

f_X(x) = {|\Sigma|}^{-1/2} g\left( (x-\mu)^\top \, \Sigma^{-1} \, (x-\mu) \right),

where x \in \mathbb{R}^d, \mu \in \mathbb{R}^d is the mean, \Sigma is a d \times d positive-definite matrix, and g: \mathbb{R}_+ \rightarrow \mathbb{R}_+ is a function called the density generator of X. The goal is to estimate g at some point \xi, by

\widehat{g}_{n,h,a}(\xi) := \dfrac{\xi^{\frac{-d+2}{2}} \psi_a'(\xi)}{n h s_d} \sum_{i=1}^n \left[ K\left( \dfrac{ \psi_a(\xi) - \psi_a(\xi_i) }{h} \right) + K\left( \dfrac{ \psi_a(\xi) + \psi_a(\xi_i) }{h} \right) \right],

where s_d := \pi^{d/2} / \Gamma(d/2), \Gamma is the Gamma function, h and a are tuning parameters (respectively the bandwidth and a parameter controlling the bias at \xi = 0), \psi_a(\xi) := -a + (a^{d/2} + \xi^{d/2})^{2/d}, \xi \in \mathbb{R}, K is a kernel function and \xi_i := (X_i - \mu)^\top \, \Sigma^{-1} \, (X_i - \mu), for a sample X_1, \dots, X_n. This function computes "optimal asymptotic" values for the bandwidth h and the tuning parameter a, starting from a first-step bandwidth that the user must provide.
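
To make the formula above concrete, here is a minimal base-R sketch of the estimator for fixed h and a. It is not the package's implementation: `g_hat_sketch` is a hypothetical helper, it assumes a Gaussian kernel, and it performs none of the adaptive selection of h and a that `EllDistrEst.adapt` carries out. It reads the formula with both kernel terms inside the sum, and it uses the identity \xi^{(-d+2)/2} \psi_a'(\xi) = (a^{d/2} + \xi^{d/2})^{2/d - 1}, obtained by differentiating \psi_a, so that the prefactor stays finite at \xi = 0 when a > 0.

```r
# Hypothetical helper (not part of ElliptCopulas): fixed-(h, a) estimator
# of the generator g at a single point xi, with a Gaussian kernel.
g_hat_sketch <- function(xi, X, mu, Sigma_m1, h, a) {
  d <- ncol(X)
  n <- nrow(X)
  s_d <- pi^(d / 2) / gamma(d / 2)

  # psi_a as defined in the Description
  psi_a <- function(t) -a + (a^(d / 2) + t^(d / 2))^(2 / d)

  # xi_i = (X_i - mu)^T Sigma^{-1} (X_i - mu), one value per row of X
  Xc <- sweep(X, 2, mu)
  xi_i <- rowSums((Xc %*% Sigma_m1) * Xc)

  # xi^((-d+2)/2) * psi_a'(xi), written in its simplified form
  prefactor <- (a^(d / 2) + xi^(d / 2))^(2 / d - 1)

  # Both kernel terms are summed over the sample
  kernel_sum <- sum(
    dnorm((psi_a(xi) - psi_a(xi_i)) / h) +
      dnorm((psi_a(xi) + psi_a(xi_i)) / h)
  )
  prefactor * kernel_sum / (n * h * s_d)
}

# Illustration on standard Gaussian data; h = 0.2 and a = 1 are arbitrary
# values chosen for this sketch, not recommended defaults.
set.seed(1)
n <- 2000
d <- 3
X <- matrix(rnorm(n * d), ncol = d)
xi_grid <- c(0, 0.5, 1, 2)
est <- vapply(
  xi_grid,
  function(t) g_hat_sketch(t, X, mu = rep(0, d), Sigma_m1 = diag(d), h = 0.2, a = 1),
  numeric(1)
)
truth <- exp(-xi_grid / 2) / (2 * pi)^(d / 2)
print(cbind(xi = xi_grid, estimate = est, truth = truth))
```

For the standard Gaussian, the true generator is g(\xi) = exp(-\xi/2) / (2\pi)^{d/2}, so the last two columns can be compared directly.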

Usage

EllDistrEst.adapt(
  X,
  mu = 0,
  Sigma_m1 = diag(NCOL(X)),
  grid,
  h_firstStep,
  grid_a = NULL,
  Kernel = "gaussian",
  mpfr = FALSE,
  precBits = 100,
  dopb = TRUE
)

Arguments

`X`	a matrix of size `n \times d`, assumed to be `n` i.i.d. observations (rows) of a `d`-dimensional elliptical distribution.
`mu`	mean of X. This can be the true value or an estimate. It must be a vector of dimension `d`.
`Sigma_m1`	inverse of the covariance matrix of X. This can be the true value or an estimate. It must be a matrix of dimension `d \times d`.
`grid`	vector containing the values at which we want the generator to be estimated.
`h_firstStep`	a vector of size `2` containing first-step bandwidths to be used. The first one is used for the estimation of the asymptotic mean-squared error. The second one is used for the first step estimation of `g`. From these two estimators, a final value of the bandwidth `h` is determined, which is used for the final estimator of `g`. If `h_firstStep` is of length `1`, its value is reused for both purposes (estimation of the AMSE and first-step estimation of `g`).
`grid_a`	the grid of possible values of `a` to be used. If missing, a default sequence is used.
`Kernel`	name of the kernel. Possible choices are `"gaussian"`, `"epanechnikov"`, `"triangular"`.
`mpfr`	if `mpfr = TRUE`, multiple precision floating point is used via the package Rmpfr. This allows for a higher (numerical) accuracy, at the expense of computing time. It is recommended to use this option for higher dimensions.
`precBits`	number of bits of precision used for floating-point computations (only used if `mpfr = TRUE`).
`dopb`	a Boolean value. If `dopb = TRUE`, a progress bar is displayed.
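
Since `mu` and `Sigma_m1` may be estimates, the sketch below shows one way to build them in base R and pass them with a length-2 `h_firstStep`. Using the empirical mean and the inverse empirical covariance is an assumption made for illustration, not a recommendation from the package, and the bandwidth values `c(0.1, 0.05)` are arbitrary. The call is guarded so that the block still runs when ElliptCopulas is not installed.

```r
set.seed(1)
n <- 500
d <- 3
X <- matrix(rnorm(n * d), ncol = d)

# Plug-in estimates (illustrative choice): a vector of dimension d and
# a d x d matrix, as required by the arguments `mu` and `Sigma_m1`.
mu_hat <- colMeans(X)
Sigma_m1_hat <- solve(cov(X))

grid <- seq(0, 5, by = 1)

if (requireNamespace("ElliptCopulas", quietly = TRUE)) {
  result <- ElliptCopulas::EllDistrEst.adapt(
    X = X,
    mu = mu_hat,
    Sigma_m1 = Sigma_m1_hat,
    grid = grid,
    h_firstStep = c(0.1, 0.05),  # (AMSE estimation, first-step estimate of g)
    Kernel = "gaussian",
    dopb = FALSE                 # no progress bar
  )
  str(result)
}
```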

Value

a list with the following elements:

g a vector of size n1 = length(grid). Each component of this vector is an estimator of g(x[i]) where x[i] is the i-th element of the grid.
best_a a vector of the same size as grid indicating for each value of the grid what is the optimal choice of a found by our algorithm (which is used to estimate g).
best_h a vector of the same size as grid indicating for each value of the grid what is the optimal choice of h found by our algorithm (which is used to estimate g).
first_step_g first step estimator of g, computed using the tuning parameters best_a and h_firstStep[2].
AMSE_estimated an estimator of the part of the asymptotic MSE that only depends on a.

Author(s)

Alexis Derumigny, Victor Ryan

References

Ryan, V., & Derumigny, A. (2024). On the choice of the two tuning parameters for nonparametric estimation of an elliptical distribution generator. arXiv:2408.17087.

Examples

n = 500
d = 3
X = matrix(rnorm(n * d), ncol = d)
grid = seq(0, 5, by = 0.1)

result = EllDistrEst.adapt(X = X, grid = grid, h_firstStep = 0.05)
plot(grid, result$g, type = "l")
lines(grid, result$first_step_g, col = "blue")

# Computation of true values
g = exp(-grid/2) / (2*pi)^(d/2)
lines(grid, g, type = "l", col = "red")

plot(grid, result$best_a, type = "l", col = "red")
plot(grid, result$best_h, type = "l", col = "red")

# Does the adaptive estimator have a smaller squared error than the first-step estimator?
sum((g - result$g)^2, na.rm = TRUE) < sum((g - result$first_step_g)^2, na.rm = TRUE)

ElliptCopulas documentation built on Sept. 11, 2024, 6:50 p.m.