gCor | R Documentation |
Computes the Gini distance covariance and correlation statistics, where x is quantitative, y is categorical, and alpha is the exponent on the Euclidean distance, and returns the measures of dependence.
gCor(x, y, alpha)
x |
data |
y |
label of data or univariate response variable |
alpha |
exponent on Euclidean distance, in (0,2) |
gCor computes the Gini distance correlation statistic. It is a self-contained R function that returns a measure of dependence.
The sample size (number of rows) of the data must agree with the length of the label vector, and samples must not contain missing values. Arguments x and y are treated as data and labels. If alpha is missing, it defaults to 1; otherwise it is the exponent on the Euclidean distance.
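The input requirements above can be checked before calling gCor. The following is a minimal sketch in base R; the helper name prepare_gini_input is hypothetical and not part of the package. Dropping incomplete observations is an assumption about how one might handle missing values, not something gCor does itself.

```r
# Hypothetical helper (not part of the package): align x and y and
# drop observations with missing values before calling gCor.
prepare_gini_input <- function(x, y) {
  x <- as.matrix(x)
  # The number of rows of the data must agree with the number of labels.
  stopifnot(nrow(x) == length(y))
  # Keep only rows where every column of x and the label are observed.
  keep <- complete.cases(x) & !is.na(y)
  list(x = x[keep, , drop = FALSE], y = y[keep])
}

# Usage: clean the inputs, then pass them on.
inputs <- prepare_gini_input(iris[, 1:4], iris[, 5])
# gCor(inputs$x, inputs$y, alpha = 1)
```

The call to gCor is left commented out so that the sketch runs without the package attached.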
Suppose sample data {\mathcal{D}} =\{(\mathbf{x}_i,y_i)\}, i = 1,...,n, are available, where y takes values in the categories L_1,...,L_K. The sample counterparts are straightforward to compute. Let {\mathcal{I}}_k be the index set of sample points with y_i =L_k. The category probability p_k is estimated by the sample proportion of that category, that is, \hat{p}_k= \frac{n_k}{n}, where n_k is the number of elements in {\mathcal{I}}_k. For a given α \in (0,2), a point estimator of ρ_g(α) is given as follows.
\hat{Δ}_k(α)= {n_k \choose 2}^{-1} ∑_{i<j \in {\mathcal{I}}_k} \|\mathbf{x}_i -\mathbf{x}_j\| ^{α},
\hat{Δ}(α)={n \choose 2}^{-1} ∑_{1 \le i<j \le n} \|\mathbf{x}_i -\mathbf{x}_j\| ^{α},
gCor=\hat{ρ}_g (α)= 1-\frac{∑_{k=1}^K \hat p_k \hat{Δ}_k(α)}{\hat{Δ}(α)}.
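To make the estimator concrete, the three formulas above can be transcribed almost literally into base R. This is only an illustrative sketch: the function name gini_cor_sketch is hypothetical, it is not the package's implementation, and it assumes every category contains at least two observations (otherwise the within-category mean is undefined).

```r
# Hypothetical transcription of the estimator formulas (base R only).
gini_cor_sketch <- function(x, y, alpha = 1) {
  x <- as.matrix(x)
  stopifnot(nrow(x) == length(y), !anyNA(x), !anyNA(y),
            alpha > 0, alpha < 2)
  n <- nrow(x)

  # Mean of ||x_i - x_j||^alpha over all pairs i < j among the given rows.
  # dist() returns exactly the choose(m, 2) pairwise Euclidean distances.
  mean_pair_dist <- function(rows) {
    mean(as.numeric(dist(x[rows, , drop = FALSE]))^alpha)
  }

  delta_all <- mean_pair_dist(seq_len(n))                # overall mean distance
  groups    <- split(seq_len(n), y)                      # index sets I_k
  p_hat     <- lengths(groups) / n                       # proportions n_k / n
  delta_k   <- vapply(groups, mean_pair_dist, numeric(1))  # within-category means

  # One minus the weighted within-category mean distance over the overall one.
  1 - sum(p_hat * delta_k) / delta_all
}

# Usage on the iris data, mirroring the package example.
rho_hat <- gini_cor_sketch(iris[, 1:4], iris[, 5], alpha = 1)
```

A value near 1 indicates that within-category distances are small relative to the overall distances; a value near 0 indicates that the categories are not separated in x.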
gCor returns the sample Gini distance covariance and correlation between x and y.
Dang, X., Nguyen, D., Chen, Y. and Zhang, J. (2019). A new Gini correlation between quantitative and qualitative variables. Submitted to Journal of the American Statistical Association.
gmd
gCov
KgCov
KgCor
x <- iris[,1:4]
y <- unclass(iris[,5])
gCor(x, y, alpha = 1)