gmd: Gini Mean Difference

Description

Computes and returns the sample Gini mean difference of x, where alpha is the exponent applied to the Euclidean distance. The default value of alpha is 1.

Usage

  gmd(x, alpha)

Arguments

x

data

alpha

exponent on Euclidean distance, in (0,2)

Details

gmd computes the Gini mean difference of the data. It is a self-contained R function that handles both univariate and multivariate data.

The samples must not contain missing values. If alpha is missing, it defaults to 1; otherwise it is the exponent on the Euclidean distance.

The Gini mean difference (GMD) was originally introduced as an alternative to the standard deviation as a measure of variability (Gini, 1914; Yitzhaki and Schechtman, 2013). Let X and X^\prime be independent random variables from a univariate distribution F on \mathbf{R} with finite first moment. The GMD of F is

Δ=Δ(X)=Δ(F)=E|X-X^{\prime}|,

the expected distance between two independent random variables. Given a sample \mathbf x=\{x_1,x_2,\ldots,x_n\}, the sample Gini mean difference is calculated by

\hat{Δ} = {n \choose 2}^{-1} ∑_{1 ≤ i<j ≤ n} | x_i - x_j| = {n \choose 2}^{-1} ∑_{i=1}^n (2i-n-1) x_{(i)},

where x_{(1)} ≤ x_{(2)} ≤ \cdots ≤ x_{(n)} are the order statistics of \mathbf x (Schechtman and Yitzhaki, 1987). The computational complexity of the univariate Gini mean difference is O(n \log n).
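
The equivalence of the pairwise form and the order-statistic form can be checked numerically. The following is a minimal sketch in base R, not the package's implementation; the helper names gmd_pairwise and gmd_sorted are hypothetical and introduced only for illustration.

```r
# Hypothetical helpers for illustration; not part of the GiniDistance package.

# Pairwise form, O(n^2): average of |x_i - x_j| over all pairs i < j.
gmd_pairwise <- function(x) {
  n <- length(x)
  # outer() counts every pair twice (i,j and j,i), hence the division by 2.
  sum(abs(outer(x, x, "-"))) / 2 / choose(n, 2)
}

# Order-statistic form, O(n log n): weighted sum of the sorted sample.
gmd_sorted <- function(x) {
  n <- length(x)
  xs <- sort(x)
  sum((2 * seq_len(n) - n - 1) * xs) / choose(n, 2)
}

set.seed(1)
x <- runif(100)
all.equal(gmd_pairwise(x), gmd_sorted(x))
```

The sorted form avoids building the n-by-n difference matrix, which is why it scales better for large univariate samples.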

The Gini mean difference has been generalized to multivariate distributions (Koshevoy and Mosler, 1997). That is, the Gini mean difference of a distribution F in \mathbf{R}^d is Δ = E \|\mathbf X -\mathbf X^\prime\|, or even more generally, for some α \in (0,2),

Δ(α) = E \|\mathbf X-\mathbf X^\prime\|^{α},

where \| \mathbf x \| is the Euclidean norm. The sample Gini mean difference is computed by

\hat{Δ}(α) = {n \choose 2}^{-1} ∑_{1 ≤ i<j ≤ n} \| \mathbf x_i - \mathbf x_j\|^{α}.

Its computational complexity is O(n^2).
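
The multivariate sample formula can be sketched directly with base R's dist(). This is an illustrative sketch under stated assumptions, not the package's code: the helper name gmd_multivariate is hypothetical, and it assumes that rows of a matrix are observations and columns are coordinates.

```r
# Hypothetical helper for illustration; not part of the GiniDistance package.
gmd_multivariate <- function(x, alpha = 1) {
  x <- as.matrix(x)                   # assumption: one observation per row
  stopifnot(!anyNA(x), alpha > 0, alpha < 2)
  d <- as.vector(dist(x))             # Euclidean distances for all pairs i < j
  mean(d^alpha)                       # average over the choose(n, 2) pairs
}

set.seed(1)
x <- matrix(runif(100), 50, 2)        # 50 observations in 2 dimensions
gmd_multivariate(x, alpha = 1)
gmd_multivariate(x, alpha = 0.5)
```

With a single column and alpha = 1, this reduces to the univariate pairwise formula, at O(n^2) cost.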

Value

gmd returns the sample Gini mean difference.

References

Gini, C. (1914). Sulla misura della concentrazione e della variabilità dei caratteri. Atti del Reale Istituto Veneto di Scienze, Lettere ed Arti, 62, 1203-1248. English translation: On the measurement of concentration and variability of characters (2005). Metron, LXIII(1), 3-38.

Koshevoy, G. and Mosler, K. (1997). Multivariate Gini indices. Journal of Multivariate Analysis, 60, 252-276.

Schechtman, E. and Yitzhaki, S. (1987). A measure of association based on Gini's mean difference. Communications in Statistics - Theory and Methods, 16(1), 207-231.

Yitzhaki, S. and Schechtman, E. (2013). The Gini Methodology, Springer, New York.

See Also

RcppGmd, gCov, gCor

Examples

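  library(GiniDistance)

  n <- 100
  x <- runif(n)

  # Univariate data with alpha = 1; time the call
  t0 <- proc.time()
  gmd(x, alpha = 1)
  proc.time() - t0

  # Univariate data with alpha = 0.5; time the call
  t1 <- proc.time()
  gmd(x, alpha = 0.5)
  proc.time() - t1

  # Bivariate data: n/2 observations in 2 dimensions
  x <- matrix(runif(n), n/2, 2)
  gmd(x, alpha = 1)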

GiniDistance documentation built on June 28, 2019, 5:03 p.m.