Pareto tail modeling for income distributions
Description
Fit a Pareto distribution to the upper tail of income data. Since a theoretical distribution is used for the upper tail, this is a semiparametric approach.
Usage
Arguments
x 
a numeric vector. 
k 
the number of observations in the upper tail to which the Pareto distribution is fitted. 
x0 
the threshold (scale parameter) above which the Pareto distribution is fitted. 
method 
either a function or a character string specifying the function to be used to estimate the shape parameter of the Pareto distribution, such as

groups 
an optional vector or factor specifying groups of elements of

w 
an optional numeric vector giving sample weights. 
alpha 
numeric; values above the theoretical 1  
... 
additional arguments to be passed to the specified method. 
Details
The arguments k and x0 correspond to each other. If k is supplied, the threshold x0 is estimated by the (n - k)-th smallest value in x, where n is the number of observations. Conversely, if the threshold x0 is supplied, k is given by the number of observations in x larger than x0. Therefore, either k or x0 needs to be supplied. If both are supplied, only k is used.
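The correspondence between k and x0 can be illustrated in base R. This is only a minimal sketch on simulated data, assuming the threshold is taken as the value at position n - k of the sorted sample; the variable names and the data are illustrative and not part of the package.

```r
# Simulated incomes (illustrative data, not from any survey)
set.seed(1)
x <- exp(rnorm(1000, mean = 10, sd = 0.8))
n <- length(x)

# Case 1: k is supplied, x0 is derived from it
k <- 50
x0 <- sort(x)[n - k]       # value at position n - k of the sorted sample

# Case 2: x0 is supplied, k is derived from it
k_from_x0 <- sum(x > x0)   # number of observations larger than x0

# Both routes agree as long as there are no ties at the threshold
k_from_x0 == k
```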
The function supplied to method should take a numeric vector (the observations) as its first argument. If k is supplied, it will be passed on (in this case, the function is required to have an argument called k). Similarly, if the threshold x0 is supplied, it will be passed on (in this case, the function is required to have an argument called x0). As above, only k is passed on if both are supplied. If the function specified by method can handle sample weights, the corresponding argument should be called w. Additional arguments are passed via the ... argument.
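To illustrate the interface a method function is expected to have, the following sketch defines a hypothetical user-written estimator, hillSketch, based on the classical unweighted Hill estimator. The function name and the simulated data are assumptions made for this example, and it is not the package's own thetaHill. It takes the observations as its first argument, has arguments named k and x0, gives precedence to k, and does not handle sample weights (so it has no argument w).

```r
# Hypothetical user-defined estimator: unweighted Hill estimator of the
# Pareto shape parameter. Not part of the package.
hillSketch <- function(x, k = NULL, x0 = NULL, ...) {
  x <- sort(x)
  n <- length(x)
  if (is.null(k)) {
    if (is.null(x0)) stop("either 'k' or 'x0' needs to be supplied")
    k <- sum(x > x0)           # derive k from the threshold
  } else {
    x0 <- x[n - k]             # k takes precedence over x0
  }
  upper <- x[(n - k + 1):n]    # the k largest observations
  k / sum(log(upper) - log(x0))
}

# Simulated Pareto sample with scale 1000 and shape 2 (inverse transform)
set.seed(42)
x <- 1000 * runif(5000)^(-1 / 2)

theta <- hillSketch(x, k = 500)
theta   # should be reasonably close to the true shape of 2
```

A function of this form could then, presumably, be supplied as the method argument together with either k or x0.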
Value
An object of class "paretoTail" with the following components:
x 
the supplied numeric vector. 
k 
the number of observations in the upper tail to which the Pareto distribution has been fitted. 
groups 
if supplied, the vector or factor specifying groups of elements. 
w 
if supplied, the numeric vector of sample weights. 
method 
the function used to estimate the shape parameter, or the name of the function. 
x0 
the scale parameter. 
theta 
the estimated shape parameter. 
tail 
if 
alpha 
the tuning parameter 
out 
if 
Author(s)
Andreas Alfons
References
A. Alfons and M. Templ (2013) Estimation of Social Exclusion Indicators from Complex Surveys: The R Package laeken. Journal of Statistical Software, 54(15), 1–25. URL http://www.jstatsoft.org/v54/i15/
A. Alfons, M. Templ, P. Filzmoser (2013) Robust estimation of economic indicators from survey samples based on Pareto tail modeling. Journal of the Royal Statistical Society, Series C, 62(2), 271–286.
See Also
reweightOut, shrinkOut, replaceOut, replaceTail, fitPareto
thetaPDC, thetaWML, thetaHill, thetaISE, thetaLS, thetaMoment, thetaQQ, thetaTM
Examples
library(laeken)
data(eusilc)

## gini coefficient without Pareto tail modeling
gini("eqIncome", weights = "rb050", data = eusilc)

## gini coefficient with Pareto tail modeling

# estimate threshold
ts <- paretoScale(eusilc$eqIncome, w = eusilc$db090,
    groups = eusilc$db030)

# estimate shape parameter
fit <- paretoTail(eusilc$eqIncome, k = ts$k,
    w = eusilc$db090, groups = eusilc$db030)

# calibration of outliers
w <- reweightOut(fit, calibVars(eusilc$db040))
gini(eusilc$eqIncome, w)

# winsorization of outliers
eqIncome <- shrinkOut(fit)
gini(eqIncome, weights = eusilc$rb050)

# replacement of outliers
eqIncome <- replaceOut(fit)
gini(eqIncome, weights = eusilc$rb050)

# replacement of whole tail
eqIncome <- replaceTail(fit)
gini(eqIncome, weights = eusilc$rb050)
