fitgroup.f: Estimation of the Fisk distribution from group data

View source: R/fitgroup_f.R

Description

The function fitgroup.f estimates the Fisk distribution from grouped data in the form of income shares, using the equally weighted minimum distance (EWMD) and the optimally weighted minimum distance (OMD) estimators.

Usage

fitgroup.f(
  y,
  x = rep(1/length(y), length(y)),
  gini.e,
  pc.inc = NULL,
  se.omd = FALSE,
  se.ewmd = FALSE,
  se.scale = FALSE,
  N = NULL,
  nrep = 10^3,
  grid = 1:20,
  rescale = 1000,
  gini = FALSE
)

Arguments

y

Vector of (non-cumulative) income shares expressed as decimals or percentages. At least four data points are required to estimate the parameters of the income distribution.

x

Vector of population shares associated with the income shares provided by y. The default is a vector of equally sized population shares of the same length as y.

gini.e

Specifies the survey Gini index expressed as a decimal.

pc.inc

Specifies an estimate of per capita income. If not provided, the weighting matrix cannot be computed, so OMD estimates will not be reported.

se.omd

If TRUE and the argument N is not NULL, the standard error of the shape parameter of the OMD estimates is computed using results from Beach and Davidson (1983) and Hajargasht and Griffiths (2016). See Jorda et al. (2018) for details. By default, this argument is FALSE.

se.ewmd

If TRUE and the argument N is not NULL, the standard errors of the EWMD estimates are obtained using Monte Carlo simulation of random samples of size N. By default, this argument is FALSE.

se.scale

If TRUE and the argument N is not NULL, the standard error of the scale parameter of the OMD estimation is obtained by Monte Carlo simulation of random samples of size N. By default, this argument is FALSE.

N

Specifies the size of the sample from which the grouped data was generated. This information is required to compute the standard errors.

nrep

Number of samples to be drawn in the Monte Carlo simulation of the standard error of the EWMD estimates and the scale parameter of the OMD estimation.

grid

A sequence of positive real numbers to be used as initial values in the algorithm developed by Jorda et al. (2018).

rescale

Rescaling factor for per capita income. Rescaling might help to invert the weight matrix when the scale is too large or too small. The argument rescale should be a positive real number which, by default, is set to 1000. The magnitude of this factor is taken into account in the estimation of the scale parameter, so the reported estimate and its standard error are equivalent to those obtained with rescale = 1.

gini

If TRUE, reports an estimate of the Gini index using the EWMD estimator and, if possible, the OMD estimator.

Details

The Generalised Beta of the Second Kind (GB2) is a general class of distributions that is acknowledged to provide an accurate fit to income data (McDonald, 1984; McDonald and Mantrala, 1995). The Fisk distribution is a particular case of this model with p = q = 1, defined in terms of the cumulative distribution function as follows:

F(x; a, b) = 1 - \bigg(1+\bigg(\frac{x}{b}\bigg)^{a}\bigg)^{-1}

where b is the scale parameter and a is the shape parameter.
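As a quick check on the parametrisation, the following base R sketch evaluates the Fisk (log-logistic) distribution function in its standard form, 1 - (1 + (x/b)^a)^(-1), together with its inverse. The names pfisk and qfisk and the parameter values are hypothetical helpers chosen for illustration; they are not functions of the package.

```r
# Fisk (log-logistic) CDF with shape a > 0 and scale b > 0.
# pfisk and qfisk are illustrative names, not part of GB2group.
pfisk <- function(x, a, b) 1 - (1 + (x / b)^a)^(-1)

# Quantile function, obtained by solving F(x) = u for x.
qfisk <- function(u, a, b) b * (u / (1 - u))^(1 / a)

a <- 3.5    # assumed shape value
b <- 1000   # assumed scale value

# The scale parameter is the median of the distribution.
median_prob <- pfisk(b, a, b)

# Round trip: applying the CDF to the quantiles recovers the probabilities.
u <- c(0.1, 0.5, 0.9)
round_trip <- pfisk(qfisk(u, a, b), a, b)
```

Because F(b) = 1/2 for any shape a, the scale parameter b can be read directly as the median income.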

The function fitgroup.f estimates the parameters of the Fisk distribution using grouped data in the form of income shares. These data must have been generated by setting the proportion of observations in each group before sampling, so that the population proportions are fixed, whereas income shares are random variables. Examples of this type of data can be found in the largest datasets of grouped data, including the World Income Inequality Database (UNU-WIDER, 2017), PovcalNet (World Bank, 2018) and the World Wealth and Income Database (Alvaredo et al., 2018).

For EWMD estimators, numerical optimisation is carried out with the Levenberg-Marquardt algorithm via nlsLM. The initial value is the moment estimate of the a parameter, obtained by equating the sample Gini index specified by gini.e to the population Gini index. The EWMD method, however, does not provide an estimate of the scale parameter because the Lorenz curve is independent of scale. The scale parameter is instead estimated by equating the sample mean, specified by pc.inc, to the population mean of the Fisk distribution. Because EWMD does not use the optimal covariance matrix of the moment conditions, the standard errors of the parameters are obtained by Monte Carlo simulation. Please be aware that the estimation of the standard errors might take a long time, especially if the sample size is large.
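The EWMD steps described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: it uses optimize in place of nlsLM, takes the distance to be the unweighted sum of squared differences between observed and fitted income shares, and relies on the standard Fisk results that the Gini index equals 1/a and the mean equals b(pi/a)/sin(pi/a) for a > 1. The helper names, the per capita income of 1000 and the small values of N and nrep are hypothetical.

```r
# Lorenz curve of the Fisk distribution (regularised incomplete beta function).
lorenz_fisk <- function(u, a) pbeta(u, 1 + 1 / a, 1 - 1 / a)

# Unweighted minimum distance fit of the shape parameter (stand-in for nlsLM).
fit_ewmd <- function(y, x) {
  p <- cumsum(x)
  ssq <- function(a) sum((diff(c(0, lorenz_fisk(p, a))) - y)^2)
  optimize(ssq, interval = c(1.01, 20))$minimum
}

# Income shares of one simulated sample of size N with fixed population shares x.
sim_shares <- function(N, a, x) {
  u <- runif(N)
  inc <- sort((u / (1 - u))^(1 / a))   # Fisk draws; scale does not affect shares
  grp <- findInterval((seq_len(N) - 0.5) / N, cumsum(x)[-length(x)]) + 1
  as.numeric(tapply(inc, grp, sum)) / sum(inc)
}

y <- c(9, 13, 17, 22, 39) / 100        # income shares as decimals
x <- rep(1 / length(y), length(y))     # equally sized population shares

a0 <- 1 / 0.29                         # moment estimate from gini.e = 0.29
a_hat <- fit_ewmd(y, x)                # EWMD-style estimate of the shape

pc_inc <- 1000                         # hypothetical per capita income
b_hat <- pc_inc * sin(pi / a_hat) / (pi / a_hat)   # match the Fisk mean

# Monte Carlo standard error of the shape estimate (small N and nrep for speed).
set.seed(1)
N <- 500
nrep <- 100
a_sim <- replicate(nrep, fit_ewmd(sim_shares(N, a_hat, x), x))
se_a <- sd(a_sim)
```

The simulation loop refits the model once per replication, which is why the running time grows with both nrep and N.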

fitgroup.f also implements a two-stage OMD estimator. In the first stage, EWMD estimates are obtained as described above and used to compute a first-stage estimate of the weighting matrix. The weighting matrix is used in the second stage to obtain optimally weighted estimates of the parameters. The numerical optimisation is performed using optim with the BFGS method. If optim reports an error, the L-BFGS-B method is used instead. EWMD estimates are used as initial values for the optimisation algorithm. Because the OMD estimation incorporates the optimal weight matrix, the asymptotic standard errors of the parameters can be derived using results from Beach and Davidson (1983) and Hajargasht and Griffiths (2016). As in the EWMD estimation, the scale parameter is obtained by matching the population mean of the Fisk distribution to the sample mean. Hence, the standard error of the scale parameter is estimated by Monte Carlo simulation.
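The second-stage optimisation can be illustrated with a short base R sketch of a weighted quadratic-form objective minimised with optim, falling back to a second method if the first call fails. The weight matrix W below is a placeholder identity matrix, not the matrix derived from Beach and Davidson (1983), and the objective, starting value and bounds are assumptions made for the example.

```r
# Lorenz curve of the Fisk distribution (regularised incomplete beta function).
lorenz_fisk <- function(u, a) pbeta(u, 1 + 1 / a, 1 - 1 / a)

y <- c(9, 13, 17, 22, 39) / 100          # observed income shares
p <- cumsum(rep(1 / length(y), length(y)))
k <- length(y) - 1                       # shares sum to one, so the last is redundant

# Placeholder weight matrix: an assumption standing in for the first-stage
# estimate of the optimal weighting matrix.
W <- diag(k)

# Weighted distance m(a)' W m(a) between fitted and observed shares.
omd_obj <- function(a) {
  if (a <= 1) return(1e10)               # keep the objective finite outside a > 1
  m <- (diff(c(0, lorenz_fisk(p, a))) - y)[seq_len(k)]
  drop(t(m) %*% W %*% m)
}

a_start <- 1 / 0.29                      # stand-in for the first-stage estimate

# Try BFGS first and fall back to L-BFGS-B if optim signals an error.
fit <- tryCatch(
  optim(a_start, omd_obj, method = "BFGS"),
  error = function(e) {
    optim(a_start, omd_obj, method = "L-BFGS-B", lower = 1.01, upper = 20)
  }
)
a_omd <- fit$par
```

With an identity weight matrix the result is simply an unweighted fit; the efficiency gain of OMD comes entirely from replacing W with an estimate of the optimal matrix.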

The Gini index of the Fisk distribution is computed using the function simgini.f. If this function reports a value greater than 1, the Gini index is estimated by Monte Carlo simulation of 10^6 samples of size N = 10^6.
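To show what a simulation-based Gini estimate looks like, the sketch below compares the closed-form Gini index of the Fisk distribution, 1/a for a > 1, with the sample Gini index of one simulated data set. It is independent of simgini.f, whose internals are not described here, and it uses an assumed shape value and a single sample of 10^5 draws to keep the run short.

```r
set.seed(123)

a <- 3.5                      # assumed shape value
gini_closed <- 1 / a          # closed-form Gini index of the Fisk, valid for a > 1

# One simulated sample, drawn by inverting the CDF (scale set to 1, since the
# Gini index does not depend on scale).
n <- 1e5
u <- runif(n)
inc <- sort((u / (1 - u))^(1 / a))

# Sample Gini index computed from the ordered incomes.
gini_mc <- 2 * sum(seq_len(n) * inc) / (n * sum(inc)) - (n + 1) / n
```

The two values should agree closely for moderate values of a; as a approaches 1 the tail becomes heavier and the simulated estimate converges more slowly.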

Value

The function fitgroup.f returns the following objects:

References

Alvaredo, F., A. Atkinson, T. Piketty, E. Saez, and G. Zucman. The World Wealth and Income Database.

Beach, C.M. and R. Davidson (1983): Distribution-free statistical inference with Lorenz curves and income shares, The Review of Economic Studies, 50, 723 - 735.

Hajargasht, G. and W.E. Griffiths (2016): Inference for Lorenz Curves, Tech. Rep., The University of Melbourne.

Jorda, V., J.M. Sarabia, and M. Jäntti (2018): Estimation of income inequality from grouped data, arXiv preprint arXiv:1808.09831.

McDonald, J.B. (1984): Some Generalized Functions for the Size Distribution of Income, Econometrica, 52, 647 - 665.

McDonald, J.B. and A. Mantrala (1995): The distribution of personal income: revisited, Journal of Applied Econometrics, 10, 201 - 204.

UNU-WIDER (2018). World Income Inequality Database (WIID3.4).

World Bank (2018). PovcalNet Data Base. Washington, DC: World Bank. http://iresearch.worldbank.org/PovcalNet/home.aspx.

Examples

fitgroup.f(y = c(9, 13, 17, 22, 39), gini.e = 0.29)

GB2group documentation built on Jan. 26, 2021, 5:06 p.m.