svylorenz: Lorenz curve

View source: R/svylorenz.R

svylorenzR Documentation

Lorenz curve

Description

Estimate the Lorenz curve, an inequality graph

Usage

svylorenz(formula, design, ...)

## S3 method for class 'survey.design'
svylorenz(
  formula,
  design,
  quantiles = seq(0, 1, 0.1),
  empirical = FALSE,
  plot = TRUE,
  add = FALSE,
  curve.col = "red",
  ci = TRUE,
  alpha = 0.05,
  na.rm = FALSE,
  deff = FALSE,
  linearized = FALSE,
  influence = FALSE,
  ...
)

## S3 method for class 'svyrep.design'
svylorenz(
  formula,
  design,
  quantiles = seq(0, 1, 0.1),
  empirical = FALSE,
  plot = TRUE,
  add = FALSE,
  curve.col = "red",
  ci = TRUE,
  alpha = 0.05,
  na.rm = FALSE,
  deff = FALSE,
  linearized = FALSE,
  return.replicates = FALSE,
  ...
)

## S3 method for class 'DBIsvydesign'
svylorenz(formula, design, ...)

Arguments

formula

a formula specifying the income variable

design

a design object of class survey.design or class svyrep.design from the survey library.

...

additional arguments passed to plot methods

quantiles

a sequence of probabilities that defines the quantiles sum to be calculated

empirical

Should an empirical Lorenz curve be estimated as well? Defaults to FALSE.

plot

Should the Lorenz curve be plotted? Defaults to TRUE.

add

Should a new curve be plotted on the current graph?

curve.col

a string defining the color of the curve.

ci

Should the confidence interval be plotted? Defaults to TRUE.

alpha

a number that especifies de confidence level for the graph.

na.rm

Should cases with missing values be dropped? Defaults to FALSE.

deff

Return the design effect (see survey::svymean)

linearized

Should a matrix of linearized variables be returned

influence

Should a matrix of (weighted) influence functions be returned? (for compatibility with svyby)

return.replicates

Return the replicate estimates?

Details

you must run the convey_prep function on your survey design object immediately after creating it with the svydesign or svrepdesign function.

Notice that the 'empirical' curve is observation-based and is the one actually used to calculate the Gini index. On the other hand, the quantile-based curve is used to estimate the shares, SEs and confidence intervals.

This way, as the number of quantiles of the quantile-based function increases, the quantile-based curve approacches the observation-based curve.

Value

Object of class "survey::oldsvyquantile", which are vectors with a "quantiles" attribute giving the proportion of income below that quantile, and a "SE" attribute giving the standard errors of the estimates.

Author(s)

Guilherme Jacob, Djalma Pessoa and Anthony Damico

References

Milorad Kovacevic and David Binder (1997). Variance Estimation for Measures of Income Inequality and Polarization - The Estimating Equations Approach. Journal of Official Statistics, Vol.13, No.1, 1997. pp. 41 58. URL https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/variance-estimation-for-measures-of-income-inequality-and-polarization—the-estimating-equations-approach.pdf.

Shlomo Yitzhaki and Robert Lerman (1989). Improving the accuracy of estimates of Gini coefficients. Journal of Econometrics, Vol.42(1), pp. 43-47, September.

Matti Langel (2012). Measuring inequality in finite population sampling. PhD thesis. URL http://doc.rero.ch/record/29204.

See Also

oldsvyquantile

Examples


library(survey)
library(laeken)
data(eusilc) ; names( eusilc ) <- tolower( names( eusilc ) )

# linearized design
des_eusilc <- svydesign( ids = ~rb030 , strata = ~db040 ,  weights = ~rb050 , data = eusilc )
des_eusilc <- convey_prep( des_eusilc )
svylorenz( ~eqincome , des_eusilc, seq(0,1,.05), alpha = .01 )

# replicate-weighted design
des_eusilc_rep <- as.svrepdesign( des_eusilc , type = "bootstrap" )
des_eusilc_rep <- convey_prep( des_eusilc_rep )

svylorenz( ~eqincome , des_eusilc_rep, seq(0,1,.05), alpha = .01 )

## Not run: 

# linearized design using a variable with missings
svylorenz( ~py010n , des_eusilc, seq(0,1,.05), alpha = .01 )
svylorenz( ~py010n , des_eusilc, seq(0,1,.05), alpha = .01, na.rm = TRUE )
# demonstration of `curve.col=` and `add=` parameters
svylorenz( ~eqincome , des_eusilc, seq(0,1,.05), alpha = .05 , add = TRUE , curve.col = 'green' )
# replicate-weighted design using a variable with missings
svylorenz( ~py010n , des_eusilc_rep, seq(0,1,.05), alpha = .01 )
svylorenz( ~py010n , des_eusilc_rep, seq(0,1,.05), alpha = .01, na.rm = TRUE )



# database-backed design
library(RSQLite)
library(DBI)
dbfile <- tempfile()
conn <- dbConnect( RSQLite::SQLite() , dbfile )
dbWriteTable( conn , 'eusilc' , eusilc )

dbd_eusilc <-
	svydesign(
		ids = ~rb030 ,
		strata = ~db040 ,
		weights = ~rb050 ,
		data="eusilc",
		dbname=dbfile,
		dbtype="SQLite"
	)

dbd_eusilc <- convey_prep( dbd_eusilc )

svylorenz( ~eqincome , dbd_eusilc, seq(0,1,.05), alpha = .01 )

# highlithing the difference between the quantile-based curve and the empirical version:
svylorenz( ~eqincome , dbd_eusilc, seq(0,1,.5), empirical = TRUE, ci = FALSE, curve.col = "green" )
svylorenz( ~eqincome , dbd_eusilc, seq(0,1,.5), alpha = .01, add = TRUE )
legend( "topleft", c("Quantile-based", "Empirical"), lwd = c(1,1), col = c("red", "green"))
# as the number of quantiles increases, the difference between the curves gets smaller
svylorenz( ~eqincome , dbd_eusilc, seq(0,1,.01), empirical = TRUE, ci = FALSE, curve.col = "green" )
svylorenz( ~eqincome , dbd_eusilc, seq(0,1,.01), alpha = .01, add = TRUE )
legend( "topleft", c("Quantile-based", "Empirical"), lwd = c(1,1), col = c("red", "green"))

dbRemoveTable( conn , 'eusilc' )

dbDisconnect( conn , shutdown = TRUE )


## End(Not run)


convey documentation built on Oct. 16, 2024, 5:10 p.m.