In giniVarCI: Gini Indices, Variances and Confidence Intervals for Finite and Infinite Populations

igini

R Documentation

Gini index, variances and confidence intervals in infinite populations

Description

Estimation of the Gini index and computation of variances and confidence intervals for infinite populations.

Usage

igini(
  y,
  bias.correction = TRUE,
  interval = NULL,
  B = 1000L,
  alpha = 0.05,
  cum.sums = NULL,
  na.rm = TRUE,
  precisionEL = 1e-04,
  maxiterEL = 100L,
  large.sample = FALSE
)

Arguments

`y`	A vector with the non-negative real numbers to be used for estimating the Gini index. This argument can be missing if argument `cum.sums` is provided.
`bias.correction`	A 'TRUE/FALSE' logical value indicating whether the bias correction should be applied to the estimation of the Gini index. The default value is `bias.correction = TRUE`.
`interval`	A character string specifying the type of variance estimation and confidence interval to be used, or `NULL` (the default value) to omit the computation of both variance and confidence interval. Possible values are `"zjackknife"`, `"tjackknife"`, `"zalinearization"`, `"zblinearization"`, `"talinearization"`, `"tblinearization"`, `"pbootstrap"`, `"BCa"`, `"ELchisq"` and `"ELboot"`. The default value is `interval = NULL`.
`B`	A single integer specifying the number of bootstrap replicates. The default value is `B = 1000L`.
`alpha`	A single numeric value between 0 and 1. If `interval` is not `NULL`, the confidence level used for computing the confidence interval for the Gini index is `1-alpha`. Some authors call `alpha` the significance level. The default value is `alpha = 0.05`.
`cum.sums`	A vector with the non-negative real numbers specifying the cumulative sums of the variable used to estimate the Gini index. This argument can be `NULL` if argument `y` is provided. The default value is `cum.sums = NULL`.
`na.rm`	A 'TRUE/FALSE' logical value indicating whether `NA`'s should be removed before the computation proceeds. The default value is `na.rm = TRUE`.
`precisionEL`	A single numeric value specifying the precision for the confidence interval based on the empirical likelihood method. The default value is `precisionEL = 1e-4`, i.e., the limits of the confidence interval are computed to 4 decimal places.
`maxiterEL`	A single integer specifying the maximum number of iterations allowed for the convergence of the empirical likelihood method. The default value is `maxiterEL = 100L`.
`large.sample`	A 'TRUE/FALSE' logical value indicating whether the sample is large, in which case a faster algorithm is used to sort the sample values. The default value is `large.sample = FALSE`.
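
How `y` and `cum.sums` can lead to the same estimate may be easier to see in a small base R sketch. This is an illustration only, not the package's code: it assumes that `cum.sums` holds the cumulative sums of the values sorted in increasing order (the argument description does not state the ordering), it uses the estimator without bias correction given in Details, and the helper names `gini_from_y` and `gini_from_cumsums` are hypothetical.

```r
# Hypothetical helpers, not part of giniVarCI.
# Estimator without bias correction:
#   2 / (ybar * n^2) * sum(i * y_(i)) - (n + 1) / n
gini_from_y <- function(y) {
  y <- sort(y)
  n <- length(y)
  2 * sum(seq_len(n) * y) / (mean(y) * n^2) - (n + 1) / n
}

# Same estimator from cumulative sums C_i = y_(1) + ... + y_(i).
# ASSUMPTION: the sums are taken over the values sorted increasingly.
# Uses the identity sum(i * y_(i)) = (n + 1) * C_n - sum(C_i).
gini_from_cumsums <- function(cum.sums) {
  n <- length(cum.sums)
  total <- cum.sums[n]                       # C_n, the sample total
  weighted <- (n + 1) * total - sum(cum.sums)
  2 * weighted / ((total / n) * n^2) - (n + 1) / n
}

y <- c(4, 1, 3, 2)
gini_from_y(y)
gini_from_cumsums(cumsum(sort(y)))
```

Both calls return 0.25 for this toy sample, since the cumulative sums carry all the information the rank-weighted sum needs.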

Details

For a sample S, with size n, derived from an infinite population, the Gini index is estimated by

\widehat{G} = \displaystyle \frac{2}{\overline{y}n^{2}}\sum_{i \in S}iy_{(i)} - \frac{n+1}{n}

when bias.correction = FALSE, and by

\widehat{G}^{bc} = \displaystyle \frac{2}{\overline{y}n(n-1)}\sum_{i \in S}iy_{(i)} - \frac{n+1}{n-1}

when bias.correction = TRUE. For more details, see Muñoz et al. (2023). The table below summarises the types of variances and confidence intervals that this function computes. Methods based on the jackknife technique use the fast algorithm suggested by Ogwang (2000). The linearization technique for variance estimation (Deville, 1999) has been applied to the following estimators of the Gini index (Berger, 2008; Langel and Tille, 2013):

\widehat{G}^{a} = \displaystyle \frac{1}{2\overline{y}n^{2}}\sum_{i \in S}\sum_{j\in S} |y_i-y_j|

and

\widehat{G}^{b} = \displaystyle \frac{2}{\overline{y}n}\sum_{i \in S}y_{i}\widehat{F}_{n}(y_{i}) - 1,

where

\widehat{F}_{n}(y_i)=\frac{1}{n}\sum_{j \in S}\delta(y_j \leq y_i).
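
The four formulas above can be transcribed directly into base R, which may help when checking them by hand. The following is a minimal sketch, not the package's implementation; the function name `gini_estimators` and the names of the returned elements are hypothetical, and the pairwise-difference step uses O(n^2) memory, so it is only suitable for small samples.

```r
# Hypothetical helper, not part of giniVarCI: direct transcriptions of the
# four estimators defined in this section.
gini_estimators <- function(y) {
  n <- length(y)
  ybar <- mean(y)
  rank_sum <- sum(seq_len(n) * sort(y))      # sum of i * y_(i)

  # Estimator without and with bias correction
  g    <- 2 * rank_sum / (ybar * n^2) - (n + 1) / n
  g_bc <- 2 * rank_sum / (ybar * n * (n - 1)) - (n + 1) / (n - 1)

  # G^a: absolute differences over all ordered pairs (i, j)
  g_a <- sum(abs(outer(y, y, "-"))) / (2 * ybar * n^2)

  # G^b: empirical distribution function evaluated at each observation
  Fn <- ecdf(y)(y)
  g_b <- 2 * sum(y * Fn) / (ybar * n) - 1

  c(G = g, G_bc = g_bc, G_a = g_a, G_b = g_b)
}

gini_estimators(c(1, 2, 3, 4))
```

For the toy sample 1, 2, 3, 4 this gives G = G_a = 0.25, G_bc = 1/3 and G_b = 0.5, which shows that the estimators can differ noticeably when n is small.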

zalinearization and zblinearization linearize the estimators \widehat{G}^{a} and \widehat{G}^{b}, respectively. The percentile bootstrap (see Qin et al., 2010) is computed using pbootstrap. BCa is the bias-corrected and accelerated bootstrap confidence interval (Efron and Tibshirani, 1993). ELchisq and ELboot are the confidence intervals based on the empirical likelihood method. The vignette vignette("GiniVarInterval") contains a detailed description of the various methods for variance estimation and confidence intervals for the Gini index.
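
The idea behind the percentile bootstrap can be sketched in a few lines of base R: resample with replacement, re-estimate the Gini index on each resample, and take empirical quantiles. The code below is a generic illustration under stated assumptions, not the package's implementation; `gini_bc` and `percentile_boot_ci` are hypothetical names, and `rlnorm()` is used only to produce an arbitrary positive sample.

```r
# Hypothetical helpers, not part of giniVarCI.
# Bias-corrected estimator from the Details section.
gini_bc <- function(y) {
  y <- sort(y)
  n <- length(y)
  2 * sum(seq_len(n) * y) / (mean(y) * n * (n - 1)) - (n + 1) / (n - 1)
}

# Percentile bootstrap: B resamples with replacement, then the
# alpha/2 and 1 - alpha/2 quantiles of the replicated estimates.
percentile_boot_ci <- function(y, B = 1000L, alpha = 0.05) {
  reps <- replicate(B, gini_bc(sample(y, replace = TRUE)))
  quantile(reps, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

set.seed(123)
y <- rlnorm(50)            # arbitrary positive sample, for illustration only
percentile_boot_ci(y)
```

Increasing `B` reduces the Monte Carlo noise in the limits at the cost of run time.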

Interval	Variance	Critical values	References
_______________	____________	__________________	__________________________
`zjackknife`	Jackknife	Normal	Berger (2008)
`tjackknife`	Jackknife	Studentized bootstrap	Biewen (2002); Berger (2008)
`zalinearization`	Linearization	Normal	Langel and Tille (2013)
`zblinearization`	Linearization	Normal	Berger (2008)
`talinearization`	Linearization	Studentized bootstrap	Langel and Tille (2013)
`tblinearization`	Linearization	Studentized bootstrap	Biewen (2002); Berger (2008)
`pbootstrap`	Bootstrap	Percentile bootstrap	Qin et al. (2010)
`BCa`	Bootstrap	BCa bootstrap	Davison and Hinkley (1997)
`ELchisq`	Linearization	Chi-Squared	Qin et al. (2010)
`ELboot`	Bootstrap	Percentile bootstrap	Qin et al. (2010)
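
To make the combination of a jackknife variance with normal critical values concrete, here is a naive leave-one-out sketch in base R. It is written under explicit assumptions: it recomputes the estimator n times rather than using Ogwang's fast algorithm, it uses the textbook jackknife variance formula, which may differ in detail from what the package computes, and the names `gini_bc`, `jackknife_gini` and the list element names are hypothetical.

```r
# Hypothetical helpers, not part of giniVarCI.
# Bias-corrected estimator from the Details section.
gini_bc <- function(y) {
  y <- sort(y)
  n <- length(y)
  2 * sum(seq_len(n) * y) / (mean(y) * n * (n - 1)) - (n + 1) / (n - 1)
}

# Naive leave-one-out jackknife (NOT Ogwang's fast algorithm).
jackknife_gini <- function(y, alpha = 0.05) {
  n <- length(y)
  est <- gini_bc(y)
  # Estimate recomputed with each observation removed in turn
  loo <- vapply(seq_len(n), function(i) gini_bc(y[-i]), numeric(1))
  # Textbook jackknife variance
  v <- (n - 1) / n * sum((loo - mean(loo))^2)
  # Normal critical value for a (1 - alpha) confidence interval
  z <- qnorm(1 - alpha / 2)
  list(gini = est, variance = v, interval = est + c(-1, 1) * z * sqrt(v))
}

set.seed(123)
y <- rlnorm(50)            # arbitrary positive sample, for illustration only
jackknife_gini(y)
```

Replacing `qnorm()` with critical values obtained from a studentized bootstrap would give an interval in the spirit of the rows labelled "Studentized bootstrap".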

Value

When interval = NULL, a single numeric value between 0 and 1 containing the estimate of the Gini index based on the vector y or the vector cum.sums. When interval is not NULL, a list of 3 components: a single numeric value with the estimate of the Gini index; a single numeric value with the variance estimate of the Gini index; and a numeric matrix with 1 row and 2 columns containing the lower and upper limits of the confidence interval for the Gini index.

Author(s)

Juan F Munoz jfmunoz@ugr.es

Jose M Pavia pavia@uv.es

Encarnacion Alvarez encarniav@ugr.es

References

Berger, Y. G. (2008). A note on the asymptotic equivalence of jackknife and linearization variance estimation for the Gini Coefficient. Journal of Official Statistics, 24(4), 541-555.

Biewen, M. (2002). Bootstrap inference for inequality, mobility and poverty measurement. Journal of Econometrics, 108(2), 317-342.

Davison, A. C., and Hinkley, D. V. (1997). Bootstrap Methods and Their Application. Cambridge Series in Statistical and Probabilistic Mathematics, No. 1. Cambridge University Press.

Deville, J. C. (1999). Variance Estimation for Complex Statistics and Estimators: Linearization and Residual Techniques. Survey Methodology, 25, 193-203.

Efron, B. and Tibshirani, R. (1993). An Introduction to the Bootstrap. Chapman and Hall, New York, London.

Langel, M., and Tille, Y. (2013). Variance estimation of the Gini index: revisiting a result several times published. Journal of the Royal Statistical Society: Series A (Statistics in Society), 176(2), 521-540.

Muñoz, J. F., Moya-Fernández, P. J., and Álvarez-Verdejo, E. (2023). Exploring and Correcting the Bias in the Estimation of the Gini Measure of Inequality. Sociological Methods & Research. https://doi.org/10.1177/00491241231176847

Ogwang, T. (2000). A convenient method of computing the Gini index and its standard error. Oxford Bulletin of Economics and Statistics, 62(1), 123-129.

Qin, Y., Rao, J. N. K., and Wu, C. (2010). Empirical likelihood confidence intervals for the Gini measure of income inequality. Economic Modelling, 27(6), 1429-1435.

Examples

# Sample, with size 50, from a Lognormal distribution. The true Gini index is 0.5.
set.seed(123)
y <- gsample(n = 50, gini = 0.5, distribution = "lognormal")

# Bias corrected estimation of the Gini index.
igini(y)

# Estimation of the Gini index and confidence interval based on jackknife and studentized bootstrap.
igini(y, interval = "tjackknife")

giniVarCI documentation built on May 29, 2024, 3:36 a.m.