phiellip: phiellip
In VecDep: Measuring Copula-Based Dependence Between Random Vectors

Description

Given a q-dimensional random vector \mathbf{X} = (\mathbf{X}_{1},...,\mathbf{X}_{k}) with \mathbf{X}_{i} a d_{i}-dimensional random vector, i.e., q = d_{1} + ... + d_{k}, this function estimates the \Phi-dependence between \mathbf{X}_{1},...,\mathbf{X}_{k} by estimating the joint and marginal meta-elliptical copula generators via the MECIP.

Usage

phiellip(sample, dim, phi, grid, params, normalize = 1)

Arguments

`sample`	A sample from a `q`-dimensional random vector `\mathbf{X}` (`n \times q` matrix with observations in rows, variables in columns).
`dim`	The vector of dimensions `(d_{1},...,d_{k})`.
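`phi`	The function `\Phi` defining the dependence measure, passed as an R function of one argument (e.g., `function(t){t * log(t)}` for the mutual information).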
`grid`	The grid of values on which to estimate the density generators.
`params`	The tuning parameters to be used when estimating the density generators.
`normalize`	A value in `\{1,2\}` indicating the normalization procedure that is applied to the estimated generator (default = 1).
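
The relation between `sample` and `dim` can be sketched as follows, assuming (as the notation `\mathbf{X} = (\mathbf{X}_{1},...,\mathbf{X}_{k})` suggests) that the marginal random vectors occupy consecutive column blocks of `sample`. The helper `block_columns` is hypothetical and not part of VecDep.

```r
# Hypothetical helper (not a VecDep function): column indices of each block
block_columns = function(dim){
  ends = cumsum(dim)
  starts = ends - dim + 1
  Map(seq, starts, ends)
}

# Three marginal random vectors of dimensions 2, 3 and 1, so q = 6
dim = c(2,3,1)
n = 20
sample = matrix(rnorm(n * sum(dim)), nrow = n)

# The number of columns of the sample must equal q = d_1 + ... + d_k
stopifnot(ncol(sample) == sum(dim))

blocks = block_columns(dim)

# Observations of the second marginal random vector (columns 3 to 5)
X2 = sample[, blocks[[2]], drop = FALSE]
```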

Details

When \mathbf{X} = (\mathbf{X}_{1}, \dots, \mathbf{X}_{k}) has a meta-elliptical copula with generator g_{\mathcal{R}}, marginal generators g_{\mathcal{R}_{i}} of \mathbf{X}_{i} for i = 1, \dots, k, and scale matrix

\mathbf{R} = \begin{pmatrix} \mathbf{R}_{11} & \mathbf{R}_{12} & \cdots & \mathbf{R}_{1k} \\ \mathbf{R}_{12}^{\text{T}} & \mathbf{R}_{22} & \cdots & \mathbf{R}_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{R}_{1k}^{\text{T}} & \mathbf{R}_{2k}^{\text{T}} & \cdots & \mathbf{R}_{kk} \end{pmatrix},

the \Phi-dependence between \mathbf{X}_{1}, \dots, \mathbf{X}_{k} equals

\mathcal{D}_{\Phi}\left (\mathbf{X}_{1}, \dots, \mathbf{X}_{k} \right ) = \mathbb{E} \left \{\frac{\prod_{i = 1}^{k} g_{\mathcal{R}_{i}}\left (\mathbf{Z}_{i}^{\text{T}} \mathbf{R}_{ii}^{-1} \mathbf{Z}_{i} \right ) \left | \mathbf{R} \right |^{1/2}}{g_{\mathcal{R}}\left (\mathbf{Z}^{\text{T}} \mathbf{R}^{-1} \mathbf{Z} \right ) \prod_{i = 1}^{k} \left | \mathbf{R}_{ii} \right |^{1/2}} \Phi \left (\frac{g_{\mathcal{R}} \left (\mathbf{Z}^{\text{T}} \mathbf{R}^{-1} \mathbf{Z} \right ) \prod_{i = 1}^{k} \left |\mathbf{R}_{ii} \right |^{1/2}}{\prod_{i = 1}^{k} g_{\mathcal{R}_{i}} \left (\mathbf{Z}_{i}^{\text{T}}\mathbf{R}_{ii}^{-1} \mathbf{Z}_{i} \right ) \left |\mathbf{R} \right |^{1/2} } \right )\right \},

where (recall that \mathbf{X}_{i} = (X_{i1}, \dots, X_{id_{i}}) for i = 1, \dots, k)

\mathbf{Z}_{i} = (Z_{i1}, \dots, Z_{id_{i}}) = \left(\left (Q \circ F_{i1} \right ) \left (X_{i1} \right ), \dots, \left (Q \circ F_{id_{i}} \right ) \left (X_{id_{i}} \right ) \right ),

and \mathbf{Z} = (\mathbf{Z}_{1}, \dots, \mathbf{Z}_{k}), with Q the quantile function corresponding to g_{\mathcal{R}}.
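
For intuition, take \Phi(t) = t \log(t). The leading fraction then cancels against the factor t inside \Phi, and the expression reduces to the expectation of the logarithm of the ratio, i.e., the mutual information. In the Gaussian special case (all generators proportional to \exp(-t/2)), this has the closed form -\frac{1}{2} \log(|\mathbf{R}| / \prod_{i = 1}^{k} |\mathbf{R}_{ii}|). The following base R sketch checks this by Monte Carlo; it does not call phiellip, and the scale matrix is an arbitrary choice for illustration.

```r
set.seed(1)

# Two blocks of dimension 2; equicorrelated scale matrix (arbitrary illustration)
dim = c(2,2) ; q = sum(dim)
R = matrix(0.5, q, q) ; diag(R) = 1
idx = list(1:2, 3:4)

# Sum of the log-determinants of the diagonal blocks R_ii
logdet_blocks = sum(sapply(idx, function(i){log(det(R[i,i,drop = FALSE]))}))

# Closed form in the Gaussian case
mi_exact = -0.5 * (log(det(R)) - logdet_blocks)

# Monte Carlo version of the expectation, with rows of Z drawn from N(0,R)
n = 20000
Z = matrix(rnorm(n * q), n, q) %*% chol(R)

# Row-wise quadratic forms z^T S^{-1} z
quad = function(Z,S){rowSums((Z %*% solve(S)) * Z)}
quad_blocks = Reduce(`+`, lapply(idx, function(i){
  quad(Z[,i,drop = FALSE], R[i,i,drop = FALSE])}))

# Logarithm of the argument of Phi; the (2*pi)^(-d/2) constants cancel
log_ratio = -0.5 * quad(Z,R) + 0.5 * quad_blocks +
             0.5 * logdet_blocks - 0.5 * log(det(R))
ratio = exp(log_ratio)

# Empirical mean of (1/ratio) * Phi(ratio)
phi = function(t){t * log(t)}
mi_mc = mean(phi(ratio) / ratio)
```

Up to Monte Carlo error, mi_mc should agree with mi_exact.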

The expectation \mathbb{E} is replaced by the empirical mean using the estimated sample \widehat{\mathbf{Z}}^{(1)}, \dots, \widehat{\mathbf{Z}}^{(n)} with \widehat{\mathbf{Z}}^{(\ell)} = (\widehat{\mathbf{Z}}_{1}^{(\ell)}, \dots, \widehat{\mathbf{Z}}_{k}^{(\ell)}) for \ell = 1, \dots, n, where

\widehat{\mathbf{Z}}_{i}^{(\ell)} = \left (\widehat{Z}_{i1}^{(\ell)}, \dots, \widehat{Z}_{id_{i}}^{(\ell)} \right ) = \left ( \left (\widehat{Q} \circ \widehat{F}_{i1} \right ) \left (X_{i1}^{(\ell)} \right ), \dots, \left (\widehat{Q} \circ \widehat{F}_{id_{i}} \right ) \left (X_{id_{i}}^{(\ell)} \right ) \right ),

for i = 1, \dots, k. Here, \widehat{Q} is the quantile function corresponding to the final estimator of g_{\mathcal{R}}, and

\widehat{F}_{ij}(x_{ij}) = \frac{1}{n+1} \sum_{\ell = 1}^{n} 1 \left (X_{ij}^{(\ell)} \leq x_{ij} \right )

is the (rescaled) empirical cdf of X_{ij} based on a sample X_{ij}^{(1)}, \dots, X_{ij}^{(n)} for i = 1, \dots, k and j = 1, \dots, d_{i}.
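
Evaluated at the observations themselves, the rescaled empirical cdf is the rank divided by n + 1. A minimal base R sketch follows; the helper name pseudo_obs is hypothetical, and the standard normal quantile function is used only as a stand-in for \widehat{Q}, which in phiellip comes from the estimated generator.

```r
# Hypothetical helper (not a VecDep function): rescaled empirical cdf
# of each column, evaluated at the observations of that column
pseudo_obs = function(sample){
  n = nrow(sample)
  # ties.method = "max" counts all observations <= the current one
  apply(sample, 2, rank, ties.method = "max") / (n + 1)
}

# Toy sample with n = 3 observations of 2 variables
X = cbind(c(3,1,2), c(10,30,20))
U = pseudo_obs(X)

# Stand-in for the estimated quantile function (an assumption:
# the standard normal quantile function replaces Q-hat here)
Z_hat = qnorm(U)
```

Dividing by n + 1 rather than n keeps all values strictly inside (0,1), so that the quantile transform stays finite.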

\mathbf{R} is estimated via its relation to the Kendall's tau matrix; see the function “KTMatrixEst.R” in the R package ‘ElliptCopulas’ of Derumigny et al. (2024).
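
For elliptical copulas, the entries of \mathbf{R} and the Kendall's tau matrix are related entrywise by R_{jk} = \sin(\pi \tau_{jk} / 2). The sketch below illustrates this relation with base R only; it is not the ElliptCopulas implementation, whose additional options are not reproduced here, and the simulated Gaussian data are an arbitrary illustration.

```r
set.seed(2)

# Simulated Gaussian sample with a known correlation matrix
n = 500 ; q = 3
R_true = matrix(0.6, q, q) ; diag(R_true) = 1
X = matrix(rnorm(n * q), n, q) %*% chol(R_true)

# Pairwise Kendall's tau, then the sine transformation
tau_hat = cor(X, method = "kendall")
R_hat = sin(pi * tau_hat / 2)
```

Note that a matrix obtained by this entrywise transformation is not guaranteed to be positive definite in small samples.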

For estimating g_{\mathcal{R}} and g_{\mathcal{R}_{i}} for i = 1, \dots, k, the function ellcopest is used. This function requires certain tuning parameters (a bandwidth h, a parameter a, and a parameter \delta for the shrinkage function). Suppose that there are m marginal random vectors (among \mathbf{X}_{1}, \dots, \mathbf{X}_{k}) that are of dimension strictly larger than one. Then, all tuning parameters should be given as

\text{params} = \text{list}(\text{"h"} = (h,h_{1},\dots,h_{m}), \text{"a"} = (a,a_{1}, \dots, a_{m}), \text{"p"} = (\delta, \delta_{1}, \dots, \delta_{m})),

i.e., (h,a,\delta) will be used for estimating g_{\mathcal{R}}, and (h_{i},a_{i},\delta_{i}) will be used for estimating the generator of the i-th marginal random vector of dimension strictly larger than one, for i = 1, \dots, m.
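
As a sketch of this layout, consider k = 3 marginal random vectors of dimensions 1, 3 and 2, so that m = 2 and each component of params has length m + 1 = 3. The numerical values below are arbitrary placeholders, not recommended settings.

```r
# Dimensions of the marginal random vectors; the first one is univariate
dim = c(1,3,2)

# Number of marginals of dimension strictly larger than one
m = sum(dim > 1)

# Placeholder tuning parameters (arbitrary values for illustration only):
# first entry for the joint generator, then one entry per multivariate marginal
h = c(0.10, 0.12, 0.15)
a = c(0.80, 0.80, 0.80)
p = c(0.80, 0.80, 0.80)

params = list("h" = h, "a" = a, "p" = p)

# Each component must have length m + 1
stopifnot(all(lengths(params) == m + 1))
```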

When d_{i} = 1 for a certain i \in \{1, \dots, k \}, the function “Convert_gd_To_g1.R” from the R package ‘ElliptCopulas’ is used to estimate g_{\mathcal{R}_{i}}.

In order to make g_{\mathcal{R}} identifiable, an extra normalization procedure is implemented in line with an extra constraint on g_{\mathcal{R}}. When normalize = 1, this corresponds to \mathbf{R} being the correlation matrix of \mathbf{Z}. When normalize = 2, this corresponds to the identifiability condition of Derumigny & Fermanian (2022).

Value

The estimated \Phi-dependence between \mathbf{X}_{1}, \dots, \mathbf{X}_{k}.

References

Derumigny, A., Fermanian, J.-D., Ryan, V., van der Spek, R. (2024). ElliptCopulas, R package version 0.1.4.1. url: https://CRAN.R-project.org/package=ElliptCopulas.

Derumigny, A. & Fermanian, J.-D. (2022). Identifiability and estimation of meta-elliptical copula generators. Journal of Multivariate Analysis 190:104962. doi: https://doi.org/10.1016/j.jmva.2022.104962.

De Keyser, S. & Gijbels, I. (2024). Hierarchical variable clustering via copula-based divergence measures between random vectors. International Journal of Approximate Reasoning 165:109090. doi: https://doi.org/10.1016/j.ijar.2023.109090.

Examples


library(VecDep)

# Total dimension and dimensions of the two marginal random vectors
q = 4
dim = c(2,2)

# Sample size
n = 1000

# Grid on which to evaluate the elliptical generator
grid = seq(0.005,100,by = 0.005)

# Degrees of freedom
nu = 7

# Student-t generator with 7 degrees of freedom, evaluated on the grid
# (true generator, kept for reference; it is not passed to phiellip below)
g_q = ((nu/(nu-2))^(q/2))*(gamma((q+nu)/2)/(((pi*nu)^(q/2))*gamma(nu/2))) *
                          ((1+(grid/(nu-2)))^(-(q+nu)/2))

# Density of squared radius
R2 = function(t,q){(gamma((q+nu)/2)/(((nu-2)^(q/2))*gamma(nu/2)*gamma(q/2))) *
                   (t^((q/2)-1)) * ((1+(t/(nu-2)))^(-(q+nu)/2))}

# Sample from 4-dimensional Student-t distribution with 7 degrees of freedom
# and identity covariance matrix
sample = ElliptCopulas::EllDistrSim(n,q,diag(q),density_R2 = function(t){R2(t,q)})

# Tuning parameter selection for g_R
opt_parameters_joint = elliptselect(n,q,seq((3/4)-(1/q)+0.01,1-0.01,length.out = 200),
                                        seq(0.01,2,length.out = 200))

# Optimal tuning parameters for g_R
a = opt_parameters_joint$Opta
p = opt_parameters_joint$Optp
h = opt_parameters_joint$Opth

# Tuning parameter selection for g_R_1 (same for g_R_2)
opt_parameters_marg = elliptselect(n,2,seq((3/4)-(1/2)+0.01,1-0.01,length.out = 200),
                                       seq(0.01,2,length.out = 200))

# Optimal tuning parameters for g_R_1 (same for g_R_2)
a1 = opt_parameters_marg$Opta
p1 = opt_parameters_marg$Optp
h1 = opt_parameters_marg$Opth

a2 = a1 ; p2 = p1 ; h2 = h1
params = list("h" = c(h,h1,h2), "a" = c(a,a1,a2), "p" = c(p,p1,p2))

# Mutual information between two random vectors of size 2
est_phi = phiellip(sample, dim, function(t){t * log(t)}, grid, params)
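
As a sanity check on the squared-radius density used in the example above, the following base R sketch verifies numerically that it integrates to one and has mean q, which is what an identity covariance matrix requires (the expected squared radius equals the trace of the covariance matrix). The definitions of nu, q and R2 are repeated so that the snippet runs on its own.

```r
nu = 7 ; q = 4

# Density of the squared radius, as defined in the example
R2 = function(t,q){(gamma((q+nu)/2)/(((nu-2)^(q/2))*gamma(nu/2)*gamma(q/2))) *
                   (t^((q/2)-1)) * ((1+(t/(nu-2)))^(-(q+nu)/2))}

# Total mass of the density (should be close to 1)
total_mass = integrate(function(t){R2(t,q)}, 0, Inf)$value

# Mean of the squared radius (should be close to q)
mean_R2 = integrate(function(t){t * R2(t,q)}, 0, Inf)$value
```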

VecDep documentation built on April 4, 2025, 5:14 a.m.