# gCor: Gini Distance Covariance and Correlation Statistics

In GiniDistance: A New Gini Correlation Between Quantitative and Qualitative Variables

## Description

Computes the Gini distance covariance and correlation statistics, which measure the dependence between quantitative variables x and a categorical variable y. The parameter alpha is the exponent on the Euclidean distance.

## Usage

    gCor(x, y, alpha)

## Arguments

| Argument | Description |
| --- | --- |
| `x` | data |
| `y` | label of data or univariate response variable |
| `alpha` | exponent on Euclidean distance, in (0,2) |

## Details

gCor computes the Gini distance correlation statistic. It is a self-contained R function that returns a measure of dependence between x and y.

The sample size (number of rows of x) must equal the length of the label vector y, and neither may contain missing values. The arguments x and y are treated as data and labels, respectively. If alpha is missing, it defaults to 1; otherwise it is used as the exponent on the Euclidean distance.
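The input requirements above can be checked before calling gCor. The following is a minimal sketch, not part of the GiniDistance package: the function name `check_gcor_inputs` and its error messages are hypothetical, and the range check on alpha assumes the open interval (0, 2) stated under Arguments.

```r
# Hypothetical pre-flight check; NOT part of the GiniDistance package.
check_gcor_inputs <- function(x, y, alpha = 1) {
  x <- as.matrix(x)
  # The number of rows of the data must match the number of labels.
  if (nrow(x) != length(y)) {
    stop("number of rows of x must equal the length of y")
  }
  # Missing values are not allowed in either argument.
  if (anyNA(x) || anyNA(y)) {
    stop("x and y must not contain missing values")
  }
  # alpha is assumed to be a single number in the open interval (0, 2).
  if (!is.numeric(alpha) || length(alpha) != 1 || alpha <= 0 || alpha >= 2) {
    stop("alpha must be a single number in (0, 2)")
  }
  invisible(TRUE)
}

# Example: the iris measurements and species labels pass all checks.
check_gcor_inputs(iris[, 1:4], iris[, 5], alpha = 1)
```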

Suppose a sample $\mathcal{D} = \{(\mathbf{x}_i, y_i)\}$, $i = 1, \ldots, n$, is available, where each $y_i$ takes one of the category values $L_1, \ldots, L_K$. Let $\mathcal{I}_k$ be the index set of sample points with $y_i = L_k$. The probability $p_k$ of category $L_k$ is estimated by the sample proportion of that category, $\hat{p}_k = n_k / n$, where $n_k$ is the number of elements in $\mathcal{I}_k$. For a given $\alpha \in (0, 2)$, a point estimator of $\rho_g(\alpha)$ is given as follows.

$$\hat{\Delta}_k(\alpha) = \binom{n_k}{2}^{-1} \sum_{i<j \in \mathcal{I}_k} \|\mathbf{x}_i - \mathbf{x}_j\|^{\alpha},$$

$$\hat{\Delta}(\alpha) = \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} \|\mathbf{x}_i - \mathbf{x}_j\|^{\alpha},$$

$$\mathrm{gCor} = \hat{\rho}_g(\alpha) = 1 - \frac{\sum_{k=1}^{K} \hat{p}_k \hat{\Delta}_k(\alpha)}{\hat{\Delta}(\alpha)}.$$
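To make the estimator concrete, the formulas above can be transcribed directly into base R. This is only an illustrative sketch and not the package's implementation: the names `gini_cor_sketch` and `mean_pair_dist` are hypothetical, and the sketch assumes every category contains at least two observations so that each within-category average is defined.

```r
# Hypothetical helper; NOT the GiniDistance implementation of gCor.
gini_cor_sketch <- function(x, y, alpha = 1) {
  x <- as.matrix(x)
  y <- as.factor(y)
  stopifnot(nrow(x) == length(y), !anyNA(x), !anyNA(y))

  # Average of ||x_i - x_j||^alpha over all pairs i < j of the rows of m.
  # stats::dist() returns exactly the pairwise Euclidean distances for i < j.
  mean_pair_dist <- function(m) mean(as.numeric(dist(m))^alpha)

  # Index sets I_k, one per observed category.
  groups <- split(seq_len(nrow(x)), y, drop = TRUE)
  stopifnot(all(lengths(groups) >= 2))  # assumption: n_k >= 2 for every k

  p_hat <- lengths(groups) / nrow(x)    # sample proportions n_k / n
  delta_k <- vapply(
    groups,
    function(idx) mean_pair_dist(x[idx, , drop = FALSE]),
    numeric(1)
  )                                     # within-category averages
  delta_all <- mean_pair_dist(x)        # average over the whole sample

  1 - sum(p_hat * delta_k) / delta_all
}

# Two tight, well-separated groups on the real line.
gini_cor_sketch(matrix(c(0, 1, 10, 11), ncol = 1), c(1, 1, 2, 2))
```

In the toy call, the overall average pairwise distance is 7 and each within-group average is 1, so the sketch evaluates 1 - 1/7 = 6/7. On real data, the same function could be called as `gini_cor_sketch(iris[, 1:4], iris[, 5], alpha = 1)` and compared against the output of gCor.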

## Value

gCor returns the sample Gini distance covariance and correlation between x and y.

## References

Dang, X., Nguyen, D., Chen, Y. and Zhang, J. (2019). A new Gini correlation between quantitative and qualitative variables. Submitted to Journal of the American Statistical Association.

## See Also

gmd, gCov, KgCov, KgCor

## Examples

    x <- iris[,1:4]
    y <- unclass(iris[,5])
    gCor(x, y, alpha = 1)