phinp | R Documentation |
Given a $q$-dimensional random vector $\mathbf{X} = (\mathbf{X}_{1}, \dots, \mathbf{X}_{k})$ with $\mathbf{X}_{i}$ a $d_{i}$-dimensional random vector, i.e., $q = d_{1} + \dots + d_{k}$, this function estimates the $\Phi$-dependence between $\mathbf{X}_{1}, \dots, \mathbf{X}_{k}$ by estimating the joint and marginal copula densities via fully non-parametric copula kernel density estimation.
phinp(sample, cop = NULL, dim, phi, estimator, bw_method)
sample | A sample from a $q$-dimensional random vector $\mathbf{X}$. |
cop | A fitted reference hac object, used when bw_method = 0 (default = NULL). |
dim | The vector of dimensions $(d_{1}, \dots, d_{k})$. |
phi | The function $\Phi$. |
estimator | Either "beta" or "trans" for the beta kernel or the Gaussian transformation kernel copula density estimator. |
bw_method | A number in $\{0, 1, 2\}$ selecting the bandwidth method described in the details. |
When $\mathbf{X}$ has copula density $c$ with marginal copula densities $c_{i}$ of $\mathbf{X}_{i}$ for $i = 1, \dots, k$, the $\Phi$-dependence between $\mathbf{X}_{1}, \dots, \mathbf{X}_{k}$ equals
$$\mathcal{D}_{\Phi} \left (\mathbf{X}_{1}, \dots, \mathbf{X}_{k} \right ) = \mathbb{E} \left \{ \frac{\prod_{i = 1}^{k} c_{i}(\mathbf{U}_{i})}{c \left ( \mathbf{U} \right )} \Phi \left (\frac{c(\mathbf{U})}{\prod_{i = 1}^{k}c_{i}(\mathbf{U}_{i})} \right ) \right \},$$
for a certain continuous, convex function $\Phi : (0,\infty) \rightarrow \mathbb{R}$, and with $\mathbf{U} = (\mathbf{U}_{1}, \dots, \mathbf{U}_{k}) \sim c$.
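As a sanity check on the formula above, the following base R sketch evaluates the expectation by Monte Carlo in a case where the answer is known in closed form. It assumes $k = 2$ univariate components (so both marginal copula densities equal 1), a Gaussian copula with an illustrative correlation of 0.5, and $\Phi(t) = t \log(t)$, for which the $\Phi$-dependence is the mutual information $-\frac{1}{2}\log(1 - \rho^{2})$. It uses the true copula density rather than a kernel estimate, so it does not reproduce what phinp computes.

```r
# Monte Carlo check of the Phi-dependence formula for k = 2 univariate
# components (d_1 = d_2 = 1), where the marginal copula densities equal 1.
set.seed(1)
n   <- 1e5
rho <- 0.5  # illustrative correlation of the Gaussian copula (assumption)

# Draw from a bivariate standard normal with correlation rho
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# Gaussian copula density at U = (pnorm(z1), pnorm(z2)),
# written directly in terms of the normal scores z1 and z2
cop_dens <- exp(-0.5 * log(1 - rho^2) -
  (rho^2 * (z1^2 + z2^2) - 2 * rho * z1 * z2) / (2 * (1 - rho^2)))

phi <- function(t) t * log(t)  # Phi for mutual information

ratio       <- cop_dens                 # c(U) / (c_1(U_1) * c_2(U_2)) with c_i = 1
mc_estimate <- mean(phi(ratio) / ratio) # empirical version of the expectation
exact_value <- -0.5 * log(1 - rho^2)    # closed-form mutual information

print(c(monte_carlo = mc_estimate, exact = exact_value))
```

The two printed numbers should agree up to Monte Carlo error.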
The expectation $\mathbb{E}$ is replaced by the empirical mean using the estimated copula sample $\widehat{\mathbf{U}}^{(1)}, \dots, \widehat{\mathbf{U}}^{(n)}$ with $\widehat{\mathbf{U}}^{(\ell)} = (\widehat{\mathbf{U}}_{1}^{(\ell)}, \dots, \widehat{\mathbf{U}}_{k}^{(\ell)})$ for $\ell = 1, \dots, n$, where (recall that $\mathbf{X}_{i} = (X_{i1}, \dots, X_{id_{i}})$ for $i = 1, \dots, k$)
$$\widehat{\mathbf{U}}_{i}^{(\ell)} = \left (\widehat{U}_{i1}^{(\ell)}, \dots, \widehat{U}_{id_{i}}^{(\ell)} \right ) = \left (\widehat{F}_{i1} \left (X_{i1}^{(\ell)} \right ), \dots, \widehat{F}_{id_{i}} \left (X_{id_{i}}^{(\ell)} \right ) \right ).$$
Hereby, $\widehat{F}_{ij}(x_{ij}) = \frac{1}{n+1} \sum_{\ell = 1}^{n} 1 \left (X_{ij}^{(\ell)} \leq x_{ij} \right )$ is the (rescaled) empirical cdf of $X_{ij}$ based on a sample $X_{ij}^{(1)}, \dots, X_{ij}^{(n)}$ for $i = 1, \dots, k$ and $j = 1, \dots, d_{i}$.
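The rescaled empirical cdf evaluated at the sample points is simply the column-wise rank divided by $n + 1$. A minimal base R sketch follows; the helper name pseudo_obs is hypothetical and not part of the package, and ties are handled with ties.method = "max" so that the rank counts the observations less than or equal to each value, as in the formula.

```r
# Hypothetical helper (not a package function): pseudo-observations
# obtained by applying the rescaled empirical cdf column by column.
pseudo_obs <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  # rank(..., ties.method = "max") counts observations <= each value
  apply(x, 2, rank, ties.method = "max") / (n + 1)
}

# Small illustrative data set with n = 3 observations of q = 2 variables
x <- cbind(c(3.2, 1.5, 2.7), c(10, 30, 20))
u <- pseudo_obs(x)
print(u)
```

Dividing by $n + 1$ instead of $n$ keeps every pseudo-observation strictly inside $(0, 1)$, which matters for estimators that transform the data with a quantile function.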
The joint copula density $c$ and marginal copula densities $c_{i}$ for $i = 1, \dots, k$ are estimated via fully non-parametric copula kernel density estimation.
When estimator = "beta", the beta kernel copula density estimator is used.
When estimator = "trans", the Gaussian transformation kernel copula density estimator is used.
Bandwidth selection is done locally by using the function hamse.
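To make the plug-in idea concrete, here is a simplified base R sketch of both kernel types and of the resulting empirical mean. It is not the package implementation: it assumes one global bandwidth h shared by the joint and marginal estimates (phinp selects bandwidths locally via hamse), it uses the textbook product-kernel forms of the beta and Gaussian transformation estimators, and all function names (pseudo_obs, beta_kde, trans_kde, phi_plugin) are hypothetical.

```r
# Hypothetical helpers, written for illustration only.

# Pseudo-observations via the rescaled empirical cdf
pseudo_obs <- function(x) {
  x <- as.matrix(x)
  apply(x, 2, rank, ties.method = "max") / (nrow(x) + 1)
}

# Beta kernel copula density estimate at one point u (global bandwidth h)
beta_kde <- function(u, U, h) {
  w <- rep(1, nrow(U))
  for (j in seq_len(ncol(U))) {
    w <- w * dbeta(U[, j], u[j] / h + 1, (1 - u[j]) / h + 1)
  }
  mean(w)
}

# Gaussian transformation kernel copula density estimate at one point u:
# a Gaussian product-kernel estimate on the normal scores, mapped back
trans_kde <- function(u, U, h) {
  z <- qnorm(u)
  w <- rep(1, nrow(U))
  for (j in seq_len(ncol(U))) {
    w <- w * dnorm((z[j] - qnorm(U[, j])) / h) / h
  }
  mean(w) / prod(dnorm(z))
}

# Plug-in estimate: empirical mean of Phi(ratio) / ratio over the sample
phi_plugin <- function(x, dim, phi, kde, h) {
  U <- pseudo_obs(x)
  # Column indices belonging to each of the k sub-vectors
  blocks <- split(seq_len(ncol(U)), rep(seq_along(dim), dim))
  ratio <- vapply(seq_len(nrow(U)), function(l) {
    joint <- kde(U[l, ], U, h)
    marg  <- prod(vapply(blocks, function(b) {
      kde(U[l, b], U[, b, drop = FALSE], h)
    }, numeric(1)))
    joint / marg
  }, numeric(1))
  mean(phi(ratio) / ratio)
}

# Illustrative data: two correlated univariate components (k = 2)
set.seed(1)
n  <- 300
z1 <- rnorm(n)
z2 <- 0.7 * z1 + sqrt(1 - 0.7^2) * rnorm(n)
x  <- cbind(z1, z2)

mi <- function(t) t * log(t)  # Phi for mutual information
mi_beta  <- phi_plugin(x, dim = c(1, 1), phi = mi, kde = beta_kde,  h = 0.05)
mi_trans <- phi_plugin(x, dim = c(1, 1), phi = mi, kde = trans_kde, h = 0.3)
print(c(beta = mi_beta, trans = mi_trans))
```

The bandwidth values 0.05 and 0.3 are arbitrary choices for the illustration. The sketch costs on the order of $n^{2}$ kernel evaluations and evaluates each density at points that are also in the sample, so it is only meant to show the structure of the estimator.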
When bw_method = 0, the given fitted hac object (hierarchical Archimedean copula) cop is used as reference copula; it can be fitted, e.g., via MLE using mlehac.
When bw_method = 1, a non-parametric (beta or Gaussian transformation) kernel copula density estimator based on the pseudo-observations is used as pivot. This pivot is computed using the big O bandwidth (i.e., $n^{-2/(q+4)}$ in case of the beta estimator, and $n^{-1/(q+4)}$ for the transformation estimator, with $q$ the total dimension).
When bw_method = 2, the big O bandwidths are taken.
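The big O bandwidths are simple functions of the sample size and the total dimension. A small base R sketch of the two rates stated above follows; the function name big_o_bandwidth is hypothetical and not part of the package.

```r
# Hypothetical helper: rule-of-thumb ("big O") bandwidth for sample size n
# and total dimension q, following the two rates given in the text.
big_o_bandwidth <- function(n, q, estimator = c("beta", "trans")) {
  estimator <- match.arg(estimator)
  if (estimator == "beta") {
    n^(-2 / (q + 4))   # beta kernel estimator
  } else {
    n^(-1 / (q + 4))   # Gaussian transformation kernel estimator
  }
}

# Sample size and dimension matching the example settings (n = 500, q = 4)
h_beta  <- big_o_bandwidth(500, 4, "beta")
h_trans <- big_o_bandwidth(500, 4, "trans")
print(c(beta = h_beta, trans = h_trans))
```

For any $n > 1$ the beta rate gives the smaller value, since its exponent is twice as negative.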
The estimated $\Phi$-dependence between $\mathbf{X}_{1}, \dots, \mathbf{X}_{k}$.
De Keyser, S. & Gijbels, I. (2024). Hierarchical variable clustering via copula-based divergence measures between random vectors. International Journal of Approximate Reasoning 165:109090. doi: https://doi.org/10.1016/j.ijar.2023.109090.
betakernelestimator for the computation of the beta kernel copula density estimator,
transformationestimator for the computation of the Gaussian transformation kernel copula density estimator,
hamse for local bandwidth selection for the beta kernel or Gaussian transformation kernel copula density estimator.
# Load the package providing gethac, mlehac and phinp (assumed here to be VecDep)
library(VecDep)
# Total dimension and dimensions of the two random vectors
q = 4
dim = c(2,2)
# Sample size
n = 500
# Four dimensional hierarchical Gumbel copula
# with parameters (theta_0,theta_1,theta_2) = (2,3,4)
HAC = gethac(dim,c(2,3,4),type = 1)
# Sample
sample = suppressWarnings(HAC::rHAC(n,HAC))
# Maximum pseudo-likelihood estimator to be used as reference copula for bw_method = 0
est_cop = mlehac(sample,dim,1,c(2,3,4))
# Estimate mutual information between two random vectors of size 2 in different ways
# bw_method = 0: fitted hac object as reference copula
est_phi_1 = phinp(sample,cop = est_cop,dim = dim,phi = function(t){t * log(t)},
                  estimator = "beta",bw_method = 0)
est_phi_2 = phinp(sample,cop = est_cop,dim = dim,phi = function(t){t * log(t)},
                  estimator = "trans",bw_method = 0)
# bw_method = 1: non-parametric pivot based on the big O bandwidth
est_phi_3 = phinp(sample,dim = dim,phi = function(t){t * log(t)},
                  estimator = "beta",bw_method = 1)
est_phi_4 = phinp(sample,dim = dim,phi = function(t){t * log(t)},
                  estimator = "trans",bw_method = 1)
# bw_method = 2: big O bandwidths used directly
est_phi_5 = phinp(sample,dim = dim,phi = function(t){t * log(t)},
                  estimator = "beta",bw_method = 2)
est_phi_6 = phinp(sample,dim = dim,phi = function(t){t * log(t)},
                  estimator = "trans",bw_method = 2)