# icompareCI: Comparisons of variance estimators and confidence intervals... In giniVarCI: Gini Indices, Variances and Confidence Intervals for Finite and Infinite Populations

 icompareCI R Documentation

## Comparisons of variance estimators and confidence intervals for the Gini index in infinite populations

### Description

Compares variance estimates and confidence intervals for the Gini index in infinite populations.

### Usage

icompareCI(
  y,
  B = 1000L,
  alpha = 0.05,
  plotCI = TRUE,
  digitsgini = 2L,
  digitsvar = 4L,
  cum.sums = NULL,
  na.rm = TRUE,
  precisionEL = 1e-4,
  maxiterEL = 100L,
  line.types = c(1L, 2L),
  colors = c("red", "green"),
  save.plot = FALSE
)


### Arguments

| Argument | Description |
|---|---|
| y | A vector of non-negative real numbers used to estimate the Gini index. This argument can be missing if argument cum.sums is provided. |
| B | A single integer specifying the number of bootstrap replicates. The default value is B = 1000L. |
| alpha | A single numeric value between 0 and 1 specifying the confidence level 1-alpha used to compute the confidence interval for the Gini index. Some authors call alpha the significance level. The default value is alpha = 0.05. |
| plotCI | A 'TRUE/FALSE' logical value indicating whether confidence intervals are compared using a plot. The default value is plotCI = TRUE. |
| digitsgini | A single integer specifying the number of decimals used in the estimation of the Gini index and confidence intervals. The default value is digitsgini = 2L. |
| digitsvar | A single integer specifying the number of decimals used in the variance estimation of the Gini index. The default value is digitsvar = 4L. |
| cum.sums | A numeric vector of non-negative real numbers specifying the cumulative sums of the variable used to estimate the Gini index. This argument can be NULL if argument y is provided. The default value is cum.sums = NULL. |
| na.rm | A 'TRUE/FALSE' logical value indicating whether NA values should be removed before the computation proceeds. The default value is na.rm = TRUE. |
| precisionEL | A single numeric value specifying the precision for the confidence interval based on the empirical likelihood method. The default value is precisionEL = 1e-4, i.e., the limits of the confidence interval have a total of 4 decimal places. |
| maxiterEL | A single integer specifying the maximum number of iterations allowed for convergence in the empirical likelihood method. The default value is maxiterEL = 100L. |
| line.types | A numeric vector of length 2 specifying the line types. See the function plot for the different line types. The default value is line.types = c(1L, 2L). |
| colors | A vector of length 2 specifying the colors for the lines of the plot. The default value is colors = c("red", "green"). |
| save.plot | A 'TRUE/FALSE' logical value indicating whether the ggplot object of the plot comparing the confidence intervals should be saved in the output. The default value is save.plot = FALSE. |

### Details

For a sample S of size n drawn from an infinite population, the Gini index is estimated using two versions of the estimator (see Muñoz et al., 2023, for more details):

\widehat{G} = \displaystyle \frac{2}{\overline{y}n^{2}}\sum_{i \in S}iy_{(i)} - \frac{n+1}{n};

\widehat{G}^{bc} = \displaystyle \frac{2}{\overline{y}n(n-1)}\sum_{i \in S}iy_{(i)} - \frac{n+1}{n-1},

where the label bc indicates that the bias correction is applied. The table below summarises the types of variances and confidence intervals that this function computes. Methods based on the jackknife technique use the fast algorithm suggested by Ogwang (2000). The linearization technique for variance estimation (Deville, 1999) has been applied to the following estimators of the Gini index (Berger, 2008; Langel and Tille, 2013):

\widehat{G}^{a} = \displaystyle \frac{1}{2\overline{y}n^{2}}\sum_{i \in S}\sum_{j\in S} |y_i-y_j|

and

\widehat{G}^{b} = \displaystyle \frac{2}{\overline{y}n}\sum_{i \in S}y_{i}\widehat{F}_{n}(y_{i}) - 1,

where

\widehat{F}_{n}(y_i)=\frac{1}{n}\sum_{j \in S}\delta(y_j \leq y_i).
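The four estimators above can be evaluated directly in base R. The following is a minimal sketch for illustration only: the helper name gini_estimators is hypothetical and is not part of giniVarCI, and the sketch assumes a complete sample (no NA values) of non-negative numbers with a positive mean.

```r
# Illustrative helper (hypothetical name, not a giniVarCI function).
# Evaluates the four estimators from the Details section on a sample y.
gini_estimators <- function(y) {
  y <- sort(y)                       # order statistics y_(1) <= ... <= y_(n)
  n <- length(y)
  ybar <- mean(y)
  rank_sum <- sum(seq_len(n) * y)    # sum of i * y_(i)

  g    <- 2 * rank_sum / (ybar * n^2) - (n + 1) / n
  g_bc <- 2 * rank_sum / (ybar * n * (n - 1)) - (n + 1) / (n - 1)

  # G^a: mean absolute difference over all pairs (i, j)
  g_a <- sum(abs(outer(y, y, "-"))) / (2 * ybar * n^2)

  # G^b: based on the empirical distribution function F_n
  Fn  <- ecdf(y)
  g_b <- 2 * sum(y * Fn(y)) / (ybar * n) - 1

  c(G = g, G_bc = g_bc, G_a = g_a, G_b = g_b)
}

est <- gini_estimators(c(1, 2, 3, 4))
print(round(est, 4))
# G = 0.25, G_bc = 0.3333, G_a = 0.25, G_b = 0.5
```

On any sample, \widehat{G}^{a} coincides algebraically with \widehat{G}, and \widehat{G}^{bc} equals \widehat{G} multiplied by n/(n-1). Without ties, \widehat{G}^{b} equals \widehat{G} + 1/n, which is what the toy sample shows.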

zalinearization and zblinearization linearize the estimators \widehat{G}^{a} and \widehat{G}^{b}, respectively. The percentile bootstrap interval (see Qin et al., 2010) is computed using pbootstrap. BCa is the bias-corrected and accelerated bootstrap confidence interval (Efron and Tibshirani, 1993). ELchisq and ELboot are the confidence intervals based on the empirical likelihood method. vignette("GiniVarInterval") contains a detailed description of the various methods for variance estimation and confidence intervals for the Gini index.
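As a rough illustration of the percentile bootstrap idea, the sketch below resamples the data B times, recomputes \widehat{G} on each resample, and takes the alpha/2 and 1-alpha/2 quantiles as interval limits. The helper names gini_hat and percentile_boot_ci are hypothetical, the sample comes from rlnorm rather than from the package, and this textbook version is not claimed to match the package's implementation.

```r
# Plain estimator \widehat{G} from the Details section (illustrative helper).
gini_hat <- function(y) {
  y <- sort(y)
  n <- length(y)
  2 * sum(seq_len(n) * y) / (mean(y) * n^2) - (n + 1) / n
}

# Textbook percentile bootstrap interval (hypothetical helper,
# not the giniVarCI implementation).
percentile_boot_ci <- function(y, B = 1000L, alpha = 0.05) {
  # Recompute the estimator on B resamples drawn with replacement
  boot <- replicate(B, gini_hat(sample(y, replace = TRUE)))
  # Interval limits are the empirical alpha/2 and 1 - alpha/2 quantiles
  unname(quantile(boot, probs = c(alpha / 2, 1 - alpha / 2)))
}

set.seed(123)
y <- rlnorm(50)                      # assumed toy sample
ci <- percentile_boot_ci(y, B = 500L, alpha = 0.05)
print(ci)
```

The arguments B and alpha play the same roles as in the Usage section: B controls the number of replicates and alpha the nominal level of the interval.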

| Interval | Variance | Critical values | References |
|---|---|---|---|
| zjackknife | Jackknife | Normal | Berger (2008) |
| tjackknife | Jackknife | Studentized bootstrap | Biewen (2002); Berger (2008) |
| zalinearization | Linearization | Normal | Langel and Tille (2013) |
| zblinearization | Linearization | Normal | Berger (2008) |
| talinearization | Linearization | Studentized bootstrap | Langel and Tille (2013) |
| tblinearization | Linearization | Studentized bootstrap | Biewen (2002); Berger (2008) |
| pBootstrap | Bootstrap | Percentile bootstrap | Qin et al. (2010) |
| BCa | Bootstrap | BCa bootstrap | Davison and Hinkley (1997) |
| ELchisq | Linearization | Chi-Squared | Qin et al. (2010) |
| ELboot | Bootstrap | Percentile bootstrap | Qin et al. (2010) |
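To make the first row of the table concrete, here is a minimal sketch of a jackknife variance combined with normal critical values. It uses the textbook delete-one jackknife with a naive loop, which is an assumption made for illustration: it is not Ogwang's fast algorithm, and the exact centering used by the package may differ. The helper names gini_hat and zjackknife_sketch are hypothetical.

```r
# Plain estimator \widehat{G} from the Details section (illustrative helper).
gini_hat <- function(y) {
  y <- sort(y)
  n <- length(y)
  2 * sum(seq_len(n) * y) / (mean(y) * n^2) - (n + 1) / n
}

# Naive delete-one jackknife with a normal-theory interval
# (hypothetical helper; assumed textbook formula, not the package code).
zjackknife_sketch <- function(y, alpha = 0.05) {
  n <- length(y)
  g <- gini_hat(y)
  # Leave-one-out estimates, recomputed from scratch for each i
  loo <- vapply(seq_len(n), function(i) gini_hat(y[-i]), numeric(1))
  # Standard jackknife variance estimate
  v <- (n - 1) / n * sum((loo - mean(loo))^2)
  z <- qnorm(1 - alpha / 2)          # normal critical value
  list(gini = g,
       var.gini = v,
       lowerlimit = g - z * sqrt(v),
       upperlimit = g + z * sqrt(v))
}

res_jk <- zjackknife_sketch(c(1, 2, 3, 4, 10), alpha = 0.05)
str(res_jk)
```

The element names of the returned list mirror the column names described in the Value section, purely for readability.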

### Value

If save.plot = FALSE, a data frame with columns:

1. interval. The method used to construct the confidence interval.

2. bc. A 'TRUE/FALSE' logical value indicating whether the bias correction is applied.

3. gini. The estimate of the Gini index.

4. lowerlimit. The lower limit of the confidence interval.

5. upperlimit. The upper limit of the confidence interval.

6. var.gini. The variance estimate for the estimator of the Gini index.

If save.plot = TRUE, a list with two components: (i) 'base.CI', a data frame with the six columns just described, and (ii) 'plot', a ggplot object describing the plot, i.e. a list whose components contain the plot itself, the data, information about the scales, panels, etc. As a side effect, a plot that compares the various methods for constructing confidence intervals for the Gini index is displayed. The **ggplot2** package must be installed for this option to work.

If plotCI = TRUE, a plot that compares the various methods for constructing confidence intervals for the Gini index is displayed as a side effect. The **ggplot2** package must be installed for this option to work.
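The two return shapes described above can be handled as in the following sketch. It only uses the arguments, column names and component names listed in this help page, and it is guarded so that it does nothing when giniVarCI or ggplot2 is not installed. Treat it as an illustration of how the output might be inspected, not as prescribed usage.

```r
# Runs only when both packages are available.
if (requireNamespace("giniVarCI", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  library(giniVarCI)
  set.seed(123)
  y <- gsample(n = 50, gini = 0.5, distribution = "lognormal")

  # save.plot = FALSE (default): the result is a data frame.
  res <- icompareCI(y, plotCI = FALSE)
  print(res[, c("interval", "bc", "lowerlimit", "upperlimit")])

  # save.plot = TRUE: the result is a list with 'base.CI' and 'plot'.
  out <- icompareCI(y, plotCI = FALSE, save.plot = TRUE)
  print(names(out))
  print(head(out$base.CI))
}
```

Keeping the ggplot object in out$plot allows the figure to be printed or customised later.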

### Author(s)

Juan F Munoz jfmunoz@ugr.es

Jose M Pavia pavia@uv.es

Encarnacion Alvarez encarniav@ugr.es

### References

Berger, Y. G. (2008). A note on the asymptotic equivalence of jackknife and linearization variance estimation for the Gini Coefficient. Journal of Official Statistics, 24(4), 541-555.

Biewen, M. (2002). Bootstrap inference for inequality, mobility and poverty measurement. Journal of Econometrics, 108(2), 317-342.

Davison, A. C., and Hinkley, D. V. (1997). Bootstrap Methods and Their Application (Cambridge Series in Statistical and Probabilistic Mathematics, No 1). Cambridge University Press.

Deville, J.C. (1999). Variance Estimation for Complex Statistics and Estimators: Linearization and Residual Techniques. Survey Methodology, 25, 193–203.

Efron, B. and Tibshirani, R. (1993). An Introduction to the Bootstrap. Chapman and Hall, New York, London.

Langel, M., and Tille, Y. (2013). Variance estimation of the Gini index: revisiting a result several times published. Journal of the Royal Statistical Society: Series A (Statistics in Society), 176(2), 521-540.

Muñoz, J. F., Moya-Fernández, P. J., and Álvarez-Verdejo, E. (2023). Exploring and Correcting the Bias in the Estimation of the Gini Measure of Inequality. Sociological Methods & Research. https://doi.org/10.1177/00491241231176847

Ogwang, T. (2000). A convenient method of computing the Gini index and its standard error. Oxford Bulletin of Economics and Statistics, 62(1), 123-129.

Qin, Y., Rao, J. N. K., and Wu, C. (2010). Empirical likelihood confidence intervals for the Gini measure of income inequality. Economic Modelling, 27(6), 1429-1435.

### See Also

igini, iginindex

### Examples

library(giniVarCI)

# Sample of size 50 from a lognormal distribution. The true Gini index is 0.5.
set.seed(123)
y <- gsample(n = 50, gini = 0.5, distribution = "lognormal")

# Estimation of the Gini index and confidence intervals using different methods.
icompareCI(y)


giniVarCI documentation built on May 29, 2024, 3:36 a.m.