igini: Gini index, variances and confidence intervals in infinite...


igini R Documentation

Gini index, variances and confidence intervals in infinite populations

Description

Estimation of the Gini index and computation of variances and confidence intervals for infinite populations.

Usage

igini(
  y,
  bias.correction = TRUE,
  interval = NULL,
  B = 1000L,
  alpha = 0.05,
  cum.sums = NULL,
  na.rm = TRUE,
  precisionEL = 1e-04,
  maxiterEL = 100L,
  large.sample = FALSE
)

Arguments

y

A vector with the non-negative real numbers to be used for estimating the Gini index. This argument can be missing if argument cum.sums is provided.

bias.correction

A 'TRUE/FALSE' logical value indicating whether the bias correction should be applied to the estimation of the Gini index. The default value is bias.correction = TRUE.

interval

A character string specifying the type of variance estimation and confidence interval to be used, or NULL (the default value) to omit the computation of both variance and confidence interval. Possible values are "zjackknife", "tjackknife", "zalinearization", "zblinearization", "talinearization", "tblinearization", "pbootstrap", "BCa", "ELchisq" and "ELboot". The default value is interval = NULL.

B

A single integer specifying the number of bootstrap replicates. The default value is B = 1000L.

alpha

A single numeric value between 0 and 1. If interval is not NULL, the confidence interval for the Gini index is computed at the confidence level 1-alpha. Some authors call alpha the significance level. The default value is alpha = 0.05.

cum.sums

A vector with the non-negative real numbers specifying the cumulative sums of the variable used to estimate the Gini index. This argument can be NULL if argument y is provided. The default value is cum.sums = NULL.

na.rm

A 'TRUE/FALSE' logical value indicating whether NA's should be removed before the computation proceeds. The default value is na.rm = TRUE.

precisionEL

A single numeric value specifying the precision of the confidence interval based on the empirical likelihood method. The default value is precisionEL = 1e-4, i.e., the limits of the confidence interval are computed to 4 decimal places.

maxiterEL

A single integer specifying the maximum number of iterations allowed for the convergence of the empirical likelihood method. The default value is maxiterEL = 100L.

large.sample

A 'TRUE/FALSE' logical value indicating whether the sample is large, in which case a faster algorithm is used to sort the sample values. The default value is large.sample = FALSE.

Details

For a sample S, with size n, derived from an infinite population, the Gini index is estimated by

\widehat{G} = \displaystyle \frac{2}{\overline{y}n^{2}}\sum_{i \in S}iy_{(i)} - \frac{n+1}{n}

when bias.correction = FALSE, and by

\widehat{G}^{bc} = \displaystyle \frac{2}{\overline{y}n(n-1)}\sum_{i \in S}iy_{(i)} - \frac{n+1}{n-1}

when bias.correction = TRUE. For more details, see Muñoz et al. (2023). The table below summarises the types of variance estimators and confidence intervals that this function computes. Methods based on the jackknife technique use the fast algorithm suggested by Ogwang (2000). The linearization technique for variance estimation (Deville, 1999) has been applied to the following estimators of the Gini index (Berger, 2008; Langel and Tille, 2013):

\widehat{G}^{a} = \displaystyle \frac{1}{2\overline{y}n^{2}}\sum_{i \in S}\sum_{j\in S} |y_i-y_j|

and

\widehat{G}^{b} = \displaystyle \frac{2}{\overline{y}n}\sum_{i \in S}y_{i}\widehat{F}_{n}(y_{i}) - 1,

where

\widehat{F}_{n}(y_i)=\frac{1}{n}\sum_{j \in S}\delta(y_j \leq y_i).
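The estimators defined above can be checked numerically with a few lines of base R. The following is a minimal sketch, not the package's implementation: the helper names gini_sorted, gini_a and gini_b are hypothetical and do not exist in giniVarCI. On this toy vector without ties, \widehat{G}^{a} coincides with \widehat{G}, and \widehat{G}^{b} exceeds it by 1/n, which is what the algebra gives when all values are distinct.

```r
# Hypothetical helpers (not part of giniVarCI) that transcribe the formulas above.

# G-hat and its bias-corrected version, based on the sorted sample y_(i).
gini_sorted <- function(y, bias.correction = FALSE) {
  n <- length(y)
  s <- sum(seq_len(n) * sort(y))            # sum of i * y_(i)
  if (bias.correction) {
    2 * s / (mean(y) * n * (n - 1)) - (n + 1) / (n - 1)
  } else {
    2 * s / (mean(y) * n^2) - (n + 1) / n
  }
}

# G-hat^a: mean absolute difference form (O(n^2) memory, fine for small n).
gini_a <- function(y) {
  n <- length(y)
  sum(abs(outer(y, y, "-"))) / (2 * mean(y) * n^2)
}

# G-hat^b: form based on the empirical distribution function F-hat_n.
gini_b <- function(y) {
  n <- length(y)
  Fn <- ecdf(y)(y)                          # proportion of y_j <= y_i
  2 * sum(y * Fn) / (mean(y) * n) - 1
}

y <- c(1, 2, 3, 4)
gini_sorted(y)                              # 0.25
gini_sorted(y, bias.correction = TRUE)      # 0.25 * n / (n - 1) = 1/3
gini_a(y)                                   # 0.25, same as gini_sorted(y)
gini_b(y)                                   # 0.25 + 1/n = 0.5
```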

The methods zalinearization and zblinearization linearize the estimators \widehat{G}^{a} and \widehat{G}^{b}, respectively. The percentile bootstrap confidence interval (see Qin et al., 2010) is computed using pbootstrap. BCa is the bias-corrected and accelerated bootstrap confidence interval (Efron and Tibshirani, 1993). ELchisq and ELboot are confidence intervals based on the empirical likelihood method. See vignette("GiniVarInterval") for a detailed description of the various methods for variance estimation and confidence intervals for the Gini index.
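To illustrate how the arguments B and alpha enter a percentile bootstrap interval, here is a minimal base-R sketch. It is an assumption-laden illustration rather than the code behind interval = "pbootstrap": the helpers gini_bc and percentile_ci are hypothetical, and the package may resample a different statistic or use a different quantile definition.

```r
# Hypothetical helper: bias-corrected Gini estimator from the Details section.
gini_bc <- function(y) {
  n <- length(y)
  2 * sum(seq_len(n) * sort(y)) / (mean(y) * n * (n - 1)) - (n + 1) / (n - 1)
}

# Hypothetical helper: percentile bootstrap interval at level 1 - alpha.
percentile_ci <- function(y, B = 1000L, alpha = 0.05) {
  # Re-estimate the Gini index on B resamples drawn with replacement.
  boot <- replicate(B, gini_bc(sample(y, replace = TRUE)))
  # The interval limits are the alpha/2 and 1 - alpha/2 empirical quantiles.
  quantile(boot, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

set.seed(123)
y <- rlnorm(50)                 # positive values, so mean(y) > 0 in every resample
ci <- percentile_ci(y, B = 500L, alpha = 0.05)
ci
```

A larger B reduces the Monte Carlo noise in the limits at the cost of computing time; a smaller alpha widens the interval.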

Interval         Variance       Critical values        References
_______________  _____________  _____________________  ____________________________
zjackknife       Jackknife      Normal                 Berger (2008)
tjackknife       Jackknife      Studentized bootstrap  Biewen (2002); Berger (2008)
zalinearization  Linearization  Normal                 Langel and Tille (2013)
zblinearization  Linearization  Normal                 Berger (2008)
talinearization  Linearization  Studentized bootstrap  Langel and Tille (2013)
tblinearization  Linearization  Studentized bootstrap  Biewen (2002); Berger (2008)
pbootstrap       Bootstrap      Percentile bootstrap   Qin et al. (2010)
BCa              Bootstrap      BCa bootstrap          Davison and Hinkley (1997)
ELchisq          Linearization  Chi-squared            Qin et al. (2010)
ELboot           Bootstrap      Percentile bootstrap   Qin et al. (2010)
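As an illustration of the first row of the table (jackknife variance with normal critical values), the following base-R sketch uses a naive leave-one-out loop. It is an assumption, not the package's code: it does not use the fast algorithm of Ogwang (2000), the helpers gini_hat and jackknife_ci are hypothetical, and the returned component names are chosen here only to mirror the three components described under Value.

```r
# Hypothetical helper: Gini estimator without bias correction.
gini_hat <- function(y) {
  n <- length(y)
  2 * sum(seq_len(n) * sort(y)) / (mean(y) * n^2) - (n + 1) / n
}

# Hypothetical helper: naive O(n^2 log n) jackknife with a normal-based interval.
jackknife_ci <- function(y, alpha = 0.05) {
  n <- length(y)
  # Leave-one-out estimates: drop observation i and re-estimate.
  loo <- vapply(seq_len(n), function(i) gini_hat(y[-i]), numeric(1))
  # Standard jackknife variance estimator.
  v <- (n - 1) / n * sum((loo - mean(loo))^2)
  g <- gini_hat(y)
  z <- qnorm(1 - alpha / 2)     # normal critical value
  list(gini = g,
       variance = v,
       interval = matrix(g + c(-1, 1) * z * sqrt(v), nrow = 1))
}

set.seed(123)
y <- rlnorm(50)
res <- jackknife_ci(y)
res$gini
res$variance
res$interval
```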

Value

When interval = NULL, a single numeric value between 0 and 1 containing the estimate of the Gini index based on the vector y or the vector cum.sums. When interval is not NULL, a list of 3 components: a single numeric value with the estimate of the Gini index; a single numeric value with the variance estimate of the Gini index; and a numeric matrix with 1 row and 2 columns containing the lower and upper limits of the confidence interval for the Gini index.

Author(s)

Juan F Munoz jfmunoz@ugr.es

Jose M Pavia pavia@uv.es

Encarnacion Alvarez encarniav@ugr.es

References

Berger, Y. G. (2008). A note on the asymptotic equivalence of jackknife and linearization variance estimation for the Gini Coefficient. Journal of Official Statistics, 24(4), 541-555.

Biewen, M. (2002). Bootstrap inference for inequality, mobility and poverty measurement. Journal of Econometrics, 108(2), 317-342.

Davison, A. C., and Hinkley, D. V. (1997). Bootstrap Methods and Their Application. Cambridge Series in Statistical and Probabilistic Mathematics, No. 1. Cambridge University Press.

Deville, J. C. (1999). Variance Estimation for Complex Statistics and Estimators: Linearization and Residual Techniques. Survey Methodology, 25, 193-203.

Efron, B. and Tibshirani, R. (1993). An Introduction to the Bootstrap. Chapman and Hall, New York, London.

Langel, M., and Tille, Y. (2013). Variance estimation of the Gini index: revisiting a result several times published. Journal of the Royal Statistical Society: Series A (Statistics in Society), 176(2), 521-540.

Muñoz, J. F., Moya-Fernández, P. J., and Álvarez-Verdejo, E. (2023). Exploring and Correcting the Bias in the Estimation of the Gini Measure of Inequality. Sociological Methods & Research. https://doi.org/10.1177/00491241231176847

Ogwang, T. (2000). A convenient method of computing the Gini index and its standard error. Oxford Bulletin of Economics and Statistics, 62(1), 123-129.

Qin, Y., Rao, J. N. K., and Wu, C. (2010). Empirical likelihood confidence intervals for the Gini measure of income inequality. Economic Modelling, 27(6), 1429-1435.

See Also

icompareCI, iginindex

Examples

# Sample, with size 50, from a Lognormal distribution. The true Gini index is 0.5.
set.seed(123)
y <- gsample(n = 50, gini = 0.5, distribution = "lognormal")

# Bias corrected estimation of the Gini index.
igini(y)

# Estimation of the Gini index and confidence interval based on jackknife and studentized bootstrap.
igini(y, interval = "tjackknife")



giniVarCI documentation built on May 29, 2024, 3:36 a.m.