lorenz: Lorenz Curve Based Thresholds and Partitions

lorenz    R Documentation

Lorenz Curve Based Thresholds and Partitions

Description

Lorenz curve based thresholds and partitions.

Usage

lorenz(x, n = rep(1, length(x)), na.last = TRUE)

## S3 method for class 'lorenz'
quantile(x, probs = seq(0, 1, 0.25),
    type = c("L", "p"), ...)
iquantile(x, ...)
## S3 method for class 'lorenz'
iquantile(x, values,
    type = c("L", "p"), ...)

## S3 method for class 'lorenz'
plot(x, type = c("L", "x"),
    tangent = NA, h = NA, v = NA, ...)

## S3 method for class 'summary.lorenz'
print(x, digits, ...)
## S3 method for class 'lorenz'
summary(object, ...)

Arguments

x

a vector of nonnegative numbers for lorenz, or an object to be plotted or summarized.

n

a vector of frequencies; it must have the same length as x.

na.last

logical, for controlling the treatment of NAs. If TRUE, missing values in the data are put last; if FALSE, they are put first; if NA, they are removed (see order).

probs

numeric vector of probabilities with values in [0,1], as in quantile.

values

numeric vector of values for which the corresponding population quantiles are to be returned.

type

character. For the plot method, it indicates whether to plot the cumulative distribution quantiles ("L") or the ordered but non-cumulated values ("x"). For the quantile and iquantile methods, it indicates which of the quantiles ("L" or "p") to use.

tangent

color value for the Lorenz-curve tangent when plotted. The default NA value omits the tangent from the plot.

h

color value for the horizontal line for the Lorenz-curve tangent when plotted. The default NA value omits the horizontal line from the plot.

v

color value for the vertical line for the Lorenz-curve tangent when plotted. The default NA value omits the vertical line from the plot.

digits

numeric, number of significant digits in output.

object

object to summarize.

...

other arguments passed to the underlying functions.

Details

The Lorenz curve is a continuous piecewise linear function representing the distribution of abundance (income or wealth). It plots the cumulative proportion of the population, p_i = i / m (i = 1, ..., m), against the cumulative proportion of abundance, L_i = \sum_{j=1}^{i} x_j n_j / \sum_{j=1}^{m} x_j n_j, where the x_i are indexed in non-decreasing order (x_i <= x_{i+1}). By convention, p_0 = L_0 = 0. The argument n can be used to supply unequal frequencies.
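The definition above can be sketched in a few lines of base R. This is only an illustration under the assumption of equal frequencies (n = 1 for every value); it does not call lorenz, and the object name curve_pts is hypothetical.

```r
## Toy abundances; equal frequencies (n = 1) are assumed throughout
x <- c(3, 1, 4, 2)
x <- sort(x)                     # index values in non-decreasing order
m <- length(x)

p <- c(0, seq_len(m) / m)        # cumulative population share, with p_0 = 0
L <- c(0, cumsum(x) / sum(x))    # cumulative abundance share, with L_0 = 0

## m + 1 points of the piecewise linear curve (hypothetical object name)
curve_pts <- cbind(p = p, L = L)
curve_pts
```

The curve starts at (0, 0) and ends at (1, 1), and it lies on or below the diagonal because the values are accumulated from smallest to largest.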

The following characteristics of the Lorenz curve are calculated:

"t": index where the tangent (slope 1) touches the curve; "x[t]", "p[t]", and "L[t]" are the values corresponding to index t, where x_t is the unmodified input value.

"S": Lorenz asymmetry coefficient (S = p_t + L_t); S = 1 indicates symmetry.

"G": Gini coefficient; 0 is perfect equality, values close to 1 indicate high inequality.

"J": Youden index, the largest vertical distance between the diagonal (line of equality) and the curve, which is attained at the tangent point (J = max(p - L) = p_t - L_t).
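As a rough illustration of these characteristics, the following base R sketch computes J, S, and G for a toy vector with equal frequencies. It is not the package's implementation: the Gini coefficient is taken here as one minus twice the area under the curve (trapezoid rule), which is an assumption, and t_row is a hypothetical name for the row position that also counts the initial (0, 0) row.

```r
## Toy curve with equal frequencies (assumption: n = 1)
x <- sort(c(3, 1, 4, 2))
m <- length(x)
p <- c(0, seq_len(m) / m)
L <- c(0, cumsum(x) / sum(x))

d <- p - L                  # vertical gap between the diagonal and the curve
t_row <- which.max(d)       # first row with the largest gap (counts the 0 row)
J <- d[t_row]               # Youden index, J = p_t - L_t
S <- p[t_row] + L[t_row]    # Lorenz asymmetry coefficient

## Assumption: Gini = 1 - 2 * (area under the curve), via the trapezoid rule
area <- sum(diff(p) * (head(L, -1) + tail(L, -1)) / 2)
G <- 1 - 2 * area

c(J = J, S = S, G = G)      # J = 0.2, S = 0.8, G = 0.25
```

For this toy vector the tangent point lies below the anti-diagonal (S < 1), so the curve is asymmetric.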

Value

lorenz returns an object of class lorenz. It is a matrix with m+1 rows (m = length(x)) and 3 columns (p, L, x).

The quantile method finds values of x_i corresponding to quantiles L_i or p_i (depending on the type argument). The iquantile (inverse quantile) method finds quantiles of L_i or p_i corresponding to values of x_i.
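The relationship between the two methods can be pictured as a pair of table lookups. The sketch below is a simplified, assumed step-function version for type = "p" with equal frequencies; q_lookup and iq_lookup are hypothetical helpers, and the package methods may resolve ties or interpolate differently.

```r
## Sorted toy values and their cumulative population shares (n = 1 assumed)
x <- c(1, 2, 3, 4)
p <- seq_along(x) / length(x)

## Hypothetical helper: value of x at the first p_i reaching prob
q_lookup <- function(prob, p, x) x[which(p >= prob)[1]]

## Hypothetical helper: p_i of the largest x_i not exceeding value
## (returns 0 when value is below all x_i)
iq_lookup <- function(value, p, x) c(0, p)[findInterval(value, x) + 1]

q_lookup(0.5, p, x)     # 2
iq_lookup(2, p, x)      # 0.5
```

In this simplified setting, applying iq_lookup to the result of q_lookup recovers the original probability whenever that probability is one of the p_i.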

The plot method draws a Lorenz curve. Because the object is a matrix, lines and points can be used to add further curves to an existing plot.

The summary method returns characteristics of the Lorenz curve.

Author(s)

Peter Solymos <solymos@ualberta.ca>

References

Damgaard, C., & Weiner, J. (2000): Describing inequality in plant size or fecundity. Ecology 81:1139–1142. <doi:10.2307/177185>

Schisterman, E. F., Perkins, N. J., Liu, A., & Bondell, H. (2005): Optimal cut-point and its corresponding Youden index to discriminate individuals using pooled blood samples. Epidemiology 16:73–81. <doi:10.1097/01.ede.0000147512.81966.ba>

Youden, W. J. (1950): Index for rating diagnostic tests. Cancer 3:32–5. <doi:10.1002/1097-0142(1950)3:1<32::AID-CNCR2820030106>3.0.CO;2-3>

See Also

quantile, order.

Examples

library(opticut)
set.seed(1)
x <- c(rexp(100, 10), rexp(200, 1))

l <- lorenz(x)
head(l)
tail(l)
summary(l)
## column-wise summary of the underlying matrix
summary(unclass(l))

## quantiles and inverse quantiles, by population ("p") and abundance ("L")
(q <- c(0.05, 0.5, 0.95))
(p_i <- quantile(l, probs=q, type="p"))
iquantile(l, values=p_i, type="p")
(p_i <- quantile(l, probs=q, type="L"))
iquantile(l, values=p_i, type="L")

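## Lorenz curve with tangent, then ordered non-cumulated values
op <- par(mfrow=c(2,1))
plot(l, lwd=2, tangent=2, h=3, v=4)
abline(0, 1, lty=2, col="grey")   # diagonal (line of equality)
abline(1, -1, lty=2, col="grey")  # anti-diagonal
plot(l, type="x", lwd=2, h=3, v=4)
par(op)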

## Lorenz-tangent approach to binarize a multi-level problem
n <- 100
g <- as.factor(sort(sample(LETTERS[1:4], n, replace=TRUE, prob=4:1)))
x <- rpois(n, exp(as.integer(g)))
mu <- aggregate(x, list(g), mean)
(l <- lorenz(mu$x, table(g)))
(s <- summary(l))

plot(l)
abline(0, 1, lty=2)
lines(rep(s["p[t]"], 2), c(s["p[t]"], s["L[t]"]), col=2)

psolymos/opticut documentation built on Nov. 27, 2022, 11:29 a.m.