
Computes the Gini distance covariance statistic, a measure of dependence between quantitative data `x` and a categorical variable `y`, where `alpha` is the exponent on the Euclidean distance.

```
gCov(x, y, alpha)
```

| Argument | Description |
|---|---|
| `x` | data |
| `y` | label of data or univariate response variable |
| `alpha` | exponent on Euclidean distance, in (0,2] |

`gCov` computes the Gini distance covariance statistic. It is a self-contained R function that returns a measure of dependence.

The sample size (number of rows) of the data must agree with the length of the label vector, and the samples must not contain missing values. Arguments `x` and `y` are treated as data and labels, respectively. If `alpha` is missing, it defaults to 1; otherwise it is the exponent on the Euclidean distance.
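The input requirements above can be expressed as explicit checks. The following is a minimal sketch only: `check_gcov_inputs` is a hypothetical helper written for illustration, and its name and error messages are assumptions, not part of `gCov` itself.

```r
# Hypothetical helper (not part of gCov): illustrates the documented
# requirements on x, y and alpha.
check_gcov_inputs <- function(x, y, alpha = 1) {
  x <- as.matrix(x)  # a plain vector becomes a one-column matrix
  if (nrow(x) != length(y)) {
    stop("number of rows of x must equal the length of y")
  }
  if (anyNA(x) || anyNA(y)) {
    stop("x and y must not contain missing values")
  }
  if (!is.numeric(alpha) || length(alpha) != 1 || alpha <= 0 || alpha > 2) {
    stop("alpha must be a single number in (0, 2]")
  }
  invisible(TRUE)
}

# Usage: passes silently for well-formed inputs.
check_gcov_inputs(matrix(1:6, nrow = 3), c("a", "b", "a"))
```

How `gCov` itself reports violations of these requirements is not stated here, so the behaviour of the sketch should not be read as the function's actual behaviour.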

Gini distance covariance is a new measure of dependence between a random vector and its label. For all distributions with finite first moments, the Gini distance covariance gCov has the following fundamental properties:

(1) gCov(X,Y) is defined for a quantitative variable *X* of arbitrary dimension and a univariate categorical variable *Y*.

(2) gCov(X,Y)=0 characterizes independence of *X* and *Y*.

Gini distance covariance satisfies *0 ≤ gCov(X,Y)*, and *gCov = 0* only if *X* and *Y* are independent. Gini distance covariance gCov provides a new approach to the problem of testing the joint independence of random vectors. The formal definition of the population coefficient gCov is given in (DNCZ 2018). The empirical Gini distance covariance *gCov_n(X,Y; alpha)* is the nonnegative number computed as follows.

Suppose a sample $\mathcal{D} = \{(\mathbf{x}_i, y_i)\}$, $i = 1, \dots, n$, is available. The sample counterparts can be easily computed. Let $\mathcal{I}_k$ be the index set of sample points with $y_i = L_k$; then $p_k$ is estimated by the sample proportion of that category, that is, $\hat{p}_k = n_k / n$, where $n_k$ is the number of elements in $\mathcal{I}_k$. With a given $\alpha \in (0,2)$, a point estimator of the Gini distance covariance is given as follows.

$$\hat{\Delta}_k(\alpha) = \binom{n_k}{2}^{-1} \sum_{i<j \in \mathcal{I}_k} \|\mathbf{x}_i - \mathbf{x}_j\|^{\alpha},$$

$$\hat{\Delta}(\alpha) = \binom{n}{2}^{-1} \sum_{1 \le i<j \le n} \|\mathbf{x}_i - \mathbf{x}_j\|^{\alpha},$$

$$\mathrm{gCov} = \hat{\Delta}(\alpha) - \sum_{k=1}^{K} \hat{p}_k \hat{\Delta}_k(\alpha).$$
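The three formulas above translate fairly directly into base R. The sketch below is an illustration under stated assumptions, not the implementation behind `gCov`: the function name `gcov_sketch` and the helper `mean_pair_dist` are hypothetical, and every category is assumed to contain at least two observations so that each within-category average is defined.

```r
# Illustrative sketch of the sample Gini distance covariance.
# Assumption: each category of y has at least two observations.
gcov_sketch <- function(x, y, alpha = 1) {
  x <- as.matrix(x)
  y <- factor(y)  # also drops unused factor levels
  n <- nrow(x)
  stopifnot(length(y) == n, !anyNA(x), !anyNA(y))

  # Average of ||x_i - x_j||^alpha over all unordered pairs i < j
  mean_pair_dist <- function(m) mean(as.numeric(dist(m))^alpha)

  delta_all <- mean_pair_dist(x)       # overall average distance
  p_hat <- as.numeric(table(y)) / n    # sample proportion of each category
  delta_k <- sapply(levels(y), function(l) {
    mean_pair_dist(x[y == l, , drop = FALSE])  # within-category average
  })

  delta_all - sum(p_hat * delta_k)
}

# Worked example: two well-separated groups on the real line.
x <- c(0, 1, 10, 11)
y <- c("a", "a", "b", "b")
gcov_sketch(x, y, alpha = 1)
```

In this example the six pairwise distances are 1, 10, 11, 9, 10 and 1, so the overall average is 7; each group has a within-group average of 1 and a proportion of 1/2, giving 7 - 1 = 6.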

`gCov` returns the sample Gini distance covariance.

Dang, X., Nguyen, D., Chen, Y. and Zhang, J. (2019). A new Gini correlation between quantitative and qualitative variables. *Journal of the American Statistical Association* (submitted). https://arxiv.org/pdf/1809.09793.pdf
