A new Gini correlation to measure dependence between categorical and numerical variables are implemented. Analogous to Pearson in ANOVA model, the Gini correlation is interpreted as the ratio of the between-group variation and the total variation, but it characterizes independence (zero Gini correlation mutually implies independence). Closely related to the distance correlation, the Gini correlation is of the simple formulation by considering the nature of the categorical variable. As a result, the Gini correlation has a lower computational cost than the distance correlation and is more straightforward to perform inference. The dependence test and confidence interval are implemented. Also, the corresponding kernelized dependence measures are also implemented.
The details are described in the following papers "A new Gini correlation between quantitative and qualitative variables" and "Estimating Feature-Label Dependence Using Gini Distance Statistics"
Dang, X., Nguyen, D., Chen, Y. and Zhang, J., (2019). A new Gini correlation between quantitative and qualitative variables, Journal of the American Statistical Association (submitted), https://arxiv.org/pdf/1809.09793.pdf
Zhang, S., Dang, X., Nguyen, D. and Chen, Y. (2019). Estimating feature - label dependence using Gini distance statistics. IEEE Transactions on Pattern Analysis and Machine Intelligence (submitted), https://arXiv.org/pdf/1906.02171.pdf
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.