Description Usage Format Details Author(s) Source Examples
Data on quantiles of the distributions of family incomes in the United States. This combines three data sources:
(1) US Census Table F-1 for the central quantiles
(2) Piketty and Saez for the 95th and higher quantiles
(3) Gross Domestic Product and implicit price deflators from MeasuringWorth.com
1 |
A data.frame
containing:
numeric year 1947:2012
number of families in the US
quintile1, quintile2, quintile3, quintile4, and p95 are the indicated quantiles of the distribution of family income from US Census Table F-1. The media is computed as the geometric mean of quintile2 and quintile3. This is accurate to the extent that the lognormal distribution adequately approximates the central 20 percent of the income distribution, which it should for most practical purposes.
The indicated quantiles of family income per Piketty and Saez
real GDP in millions, GDP implicit price deflators, US population in thousands, and real GDP per capita, according to MeasuringWorth.com.
ratio of the estimates of the 95th percentile of distributions of family income from the Piketty and Saez analysis of data from the Internal Revenue Service (IRS) and from the US Census Bureau.
The IRS has ranged between 72 and 98 percent of the Census Bureau figures for the 95th percentile of the distribution, with this ratio averaging around 75 percent since the late 1980s. However, this systematic bias is modest relative to the differences between the different quantiles of interest in this combined dataset.
average number of persons per family using the number of families from US Census Table F-1 and the population from MeasuringWorth.com.
personsPerFamily * realGDPperCap
ratio of realGDPperFamily to the median. This is a measure of skewness and income inequality.
For details on how this data.frame
was created, see
"F1.PikettySaez.R" in system.file('scripts', package='fda')
.
This provides links for files to download and R commands to read those
files and convert them into an updated version of
incomeInequality
. This is a reasonable thing to do if it is
more than 2 years since max(incomeInequality$year)
. All data
are in constant 2012 dollars.
Spencer Graves
United States Census Bureau, Table F-1. Income Limits for Each Fifth and Top 5 Percent of Families, All Races, http://www.census.gov/data/tables/time-series/demo/income-poverty/historical-income-inequality.html, accessed 2016-12-09.
Thomas Piketty and Emmanuel Saez (2003) "Income Inequality in the United States, 1913-1998", Quarterly Journal of Economics, 118(1) 1-39, http://elsa.berkeley.edu/~saez, update accessed February 28, 2014.
Louis Johnston and Samuel H. Williamson (2011) "What Was the U.S. GDP Then?" MeasuringWorth, http://www.measuringworth.org/usgdp, accessed February 28, 2014.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 | ##
## Rato of IRS to census estimates for the 95th percentile
##
data(incomeInequality)
plot(P95IRSvsCensus~Year, incomeInequality, type='b')
# starts ~0.74, trends rapidly up to ~0.97,
# then drifts back to ~0.75
abline(h=0.75)
abline(v=1989)
# check
sum(is.na(incomeInequality$P95IRSvsCensus))
# The Census data runs to 2011; Pikety and Saez runs to 2010.
quantile(incomeInequality$P95IRSvsCensus, na.rm=TRUE)
# 0.72 ... 0.98
##
## Persons per Family
##
plot(personsPerFamily~Year, incomeInequality, type='b')
quantile(incomeInequality$personsPerFamily)
# ranges from 3.72 to 4.01 with median 3.84
# -- almost 4
##
## GDP per family
##
plot(realGDPperFamily~Year, incomeInequality, type='b', log='y')
##
## Plot the mean then the first quintile, then the median,
## 99th, 99.9th and 99.99th percentiles
##
plotCols <- c(21, 3, 5, 11, 13:14)
kcols <- length(plotCols)
plotColors <- c(1:6, 8:13)[1:kcols] # omit 7=yellow
plotLty <- 1:kcols
matplot(incomeInequality$Year, incomeInequality[plotCols]/1000,
log='y', type='l', col=plotColors, lty=plotLty)
#*** Growth broadly shared 1947 - 1970, then began diverging
#*** The divergence has been most pronounced among the top 1%
#*** and especially the top 0.01%
##
## Growth rate by quantile 1947-1970 and 1970 - present
##
keyYears <- c(1947, 1970, 2010)
(iYears <- which(is.element(incomeInequality$Year, keyYears)))
(dYears <- diff(keyYears))
kk <- length(keyYears)
(lblYrs <- paste(keyYears[-kk], keyYears[-1], sep='-'))
(growth <- sapply(incomeInequality[iYears,], function(x, labels=lblYrs){
dxi <- exp(diff(log(x)))
names(dxi) <- labels
dxi
} ))
# as percent
(gr <- round(100*(growth-1), 1))
# The average annual income (realGDPperFamily) doubled between
# 1970 and 2010 (increased by 101 percent), while the median household
# income increased only 23 percent.
##
## Income lost by each quantile 1970-2010
## relative to the broadly shared growth 1947-1970
##
(lostGrowth <- (growth[, 'realGDPperFamily']-growth[, plotCols]))
# 1947-1970: The median gained 20% relative to the mean,
# while the top 1% lost ground
# 1970-2010: The median lost 79%, the 99th percentile lost 29%,
# while the top 0.1% gained
(lostIncome <- (lostGrowth[2, ] *
incomeInequality[iYears[2], plotCols]))
# The median family lost $39,000 per year in income
# relative to what they would have with the same economic growth
# broadly shared as during 1947-1970.
# That's slightly over $36,500 per year = $100 per day
(grYr <- growth^(1/dYears))
(grYr. <- round(100*(grYr-1), 1))
##
## Regression line: linear spline
##
(varyg <- c(3:14, 21))
Varyg <- names(incomeInequality)[varyg]
str(F01ps <- reshape(incomeInequality[c(1, varyg)], idvar='Year',
ids=F1.PikettySeaz$Year,
times=Varyg, timevar='pctile',
varying=list(Varyg), direction='long'))
names(F01ps)[2:3] <- c('variable', 'value')
F01ps$variable <- factor(F01ps$variable)
# linear spline basis function with knot at 1970
F01ps$t1970p <- pmax(0, F01ps$Year-1970)
table(nas <- is.na(F01ps$value))
# 6 NAs, one each of the Piketty-Saez variables in 2011
F01i <- F01ps[!nas, ]
# formula:
# log(value/1000) ~ b*Year + (for each variable:
# different intercept + (different slope after 1970))
Fit <- lm(log(value/1000)~Year+variable*t1970p, F01i)
anova(Fit)
# all highly significant
# The residuals may show problems with the model,
# but we will ignore those for now.
# Model predictions
str(Pred <- predict(Fit))
##
## Combined plot
##
# Plot to a file? Wikimedia Commons prefers svg format.
svg('incomeInequality8.svg')
# If you want software to convert svg to another format such as png,
# consider GIMP (www.gimp.org).
# Base plot
# Leave extra space on the right to label with growth since 1970
op <- par(mar=c(5, 4, 4, 5)+0.1)
matplot(incomeInequality$Year, incomeInequality[plotCols]/1000,
log='y', type='l', col=plotColors, lty=plotLty,
xlab='', ylab='', las=1, axes=FALSE, lwd=3)
axis(1, at=seq(1950, 2010, 10),
labels=c(1950, NA, 1970, NA, 1990, NA, 2010), cex.axis=1.5)
yat <- c(10, 50, 100, 500, 1000, 5000, 10000)
axis(2, yat, labels=c('$10K', '$50K', '$100K', '$500K',
'$1M', '$5M', '$10M'), las=1, cex.axis=1.2)
# Label the lines
pctls <- paste(c(20, 40, 50, 60, 80, 90, 95, 99, 99.5, 99.9, 99.99),
'%', sep='')
lineLbl0 <- c('Year', 'families K', pctls,
'realGDP.M', 'GDP deflator', 'pop-K', 'realGDPperFamily',
'95 pct(IRS / Census)', 'size of household',
'average family income', 'mean/median')
(lineLbls <- lineLbl0[plotCols])
sel75 <- (incomeInequality$Year==1975)
laby <- incomeInequality[sel75, plotCols]/1000
text(1973.5, c(1.2, 1.2, 1.3, 1.5, 1.9)*laby[-1], lineLbls[-1], cex=1.2)
text(1973.5, 1.2*laby[1], lineLbls[1], cex=1.2, srt=10)
##
## Add lines + points for the knots in 1970
##
End <- numeric(kcols)
F01names <- names(incomeInequality)
for(i in seq(length=kcols)){
seli <- (as.character(F01i$variable) == F01names[plotCols[i]])
# with(F01i[seli, ], lines(Year, exp(Pred[seli]), col=plotColors[i]))
yri <- F01i$Year[seli]
predi <- exp(Pred[seli])
lines(yri, predi, col=plotColors[i])
End[i] <- predi[length(predi)]
sel70i <- (yri==1970)
points(yri[sel70i], predi[sel70i], col=plotColors[i])
}
##
## label growth rates
##
table(sel70. <- (incomeInequality$Year>1969))
(lastYrs <- incomeInequality[sel70., 'Year'])
(lastYr. <- max(lastYrs)+4)
#text(lastYr., End, gR., xpd=NA)
text(lastYr., End, paste(gr[2, plotCols], '%', sep=''), xpd=NA)
text(lastYr.+7, End, paste(grYr.[2, plotCols], '%', sep=''), xpd=NA)
##
## Label the presidents
##
abline(v=c(1953, 1961, 1969, 1977, 1981, 1989, 1993, 2001, 2009))
(m99.95 <- with(incomeInequality, sqrt(P99.9*P99.99))/1000)
text(1949, 5000, 'Truman')
text(1956.8, 5000, 'Eisenhower', srt=90)
text(1963, 5000, 'Kennedy', srt=90)
text(1966.8, 5000, 'Johnson', srt=90)
text(1971, 5*m99.95[24], 'Nixon', srt=90)
text(1975, 5*m99.95[28], 'Ford', srt=90)
text(1978.5, 5*m99.95[32], 'Carter', srt=90)
text(1985.1, m99.95[38], 'Reagan' )
text(1991, 0.94*m99.95[44], 'GHW Bush', srt=90)
text(1997, m99.95[50], 'Clinton')
text(2005, 1.1*m99.95[58], 'GW Bush', srt=90)
text(2010, 1.2*m99.95[62], 'Obama', srt=90)
##
## Done
##
par(op) # reset margins
dev.off() # for plot to a file
|
Loading required package: Ecfun
Attaching package: 'Ecfun'
The following object is masked from 'package:base':
sign
Attaching package: 'Ecdat'
The following object is masked from 'package:datasets':
Orange
[1] 0
0% 25% 50% 75% 100%
0.7319601 0.7637343 0.8178705 0.8905550 0.9892483
0% 25% 50% 75% 100%
3.722238 3.798983 3.844766 3.890913 4.006411
[1] 1 24 64
[1] 23 40
[1] "1947-1970" "1970-2010"
Year Number.thousands quintile1 quintile2 median quintile3
1947-1970 1.011813 1.402557 1.889560 1.910242 1.911692 1.913143
1970-2010 1.020305 1.523331 1.037714 1.151304 1.226677 1.306985
quintile4 p95 P90 P95 P99 P99.5 P99.9
1947-1970 1.853266 1.763044 2.132507 2.070092 1.597485 1.474057 1.305936
1970-2010 1.457630 1.647526 1.284014 1.410992 1.699962 1.822060 2.494717
P99.99 realGDP.M GDP.Deflator PopulationK realGDPperCap
1947-1970 1.333545 2.434816 1.770543 1.422984 1.711064
1970-2010 3.914579 3.132755 4.431699 1.510447 2.074059
P95IRSvsCensus personsPerFamily realGDPperFamily mean.median
1947-1970 1.1741578 1.0145644 1.735984 0.908088
1970-2010 0.8564303 0.9915421 2.056517 1.676494
Year Number.thousands quintile1 quintile2 median quintile3 quintile4
1947-1970 1.2 40.3 89.0 91.0 91.2 91.3 85.3
1970-2010 2.0 52.3 3.8 15.1 22.7 30.7 45.8
p95 P90 P95 P99 P99.5 P99.9 P99.99 realGDP.M GDP.Deflator
1947-1970 76.3 113.3 107.0 59.7 47.4 30.6 33.4 143.5 77.1
1970-2010 64.8 28.4 41.1 70.0 82.2 149.5 291.5 213.3 343.2
PopulationK realGDPperCap P95IRSvsCensus personsPerFamily
1947-1970 42.3 71.1 17.4 1.5
1970-2010 51.0 107.4 -14.4 -0.8
realGDPperFamily mean.median
1947-1970 73.6 -9.2
1970-2010 105.7 67.6
realGDPperFamily quintile1 median P99 P99.9
1947-1970 0 -0.1535755 -0.1757074 0.1384989 0.4300485
1970-2010 0 1.0188031 0.8298398 0.3565554 -0.4381997
P99.99
1947-1970 0.4024393
1970-2010 -1.8580616
realGDPperFamily quintile1 median P99 P99.9 P99.99
58 0 27419.05 42458.58 76561.71 -274125.4 -3926103
Year Number.thousands quintile1 quintile2 median quintile3
1947-1970 1.000511 1.014817 1.028053 1.028540 1.028574 1.028608
1970-2010 1.000503 1.010578 1.000926 1.003529 1.005121 1.006716
quintile4 p95 P90 P95 P99 P99.5 P99.9
1947-1970 1.027187 1.02496 1.033474 1.032140 1.020575 1.017013 1.011673
1970-2010 1.009465 1.01256 1.006269 1.008644 1.013354 1.015112 1.023118
P99.99 realGDP.M GDP.Deflator PopulationK realGDPperCap
1947-1970 1.012593 1.039448 1.025150 1.015455 1.023628
1970-2010 1.034706 1.028959 1.037921 1.010363 1.018405
P95IRSvsCensus personsPerFamily realGDPperFamily mean.median
1947-1970 1.0070049 1.0006289 1.024271 0.9958169
1970-2010 0.9961329 0.9997877 1.018189 1.0130014
Year Number.thousands quintile1 quintile2 median quintile3 quintile4
1947-1970 0.1 1.5 2.8 2.9 2.9 2.9 2.7
1970-2010 0.1 1.1 0.1 0.4 0.5 0.7 0.9
p95 P90 P95 P99 P99.5 P99.9 P99.99 realGDP.M GDP.Deflator PopulationK
1947-1970 2.5 3.3 3.2 2.1 1.7 1.2 1.3 3.9 2.5 1.5
1970-2010 1.3 0.6 0.9 1.3 1.5 2.3 3.5 2.9 3.8 1.0
realGDPperCap P95IRSvsCensus personsPerFamily realGDPperFamily
1947-1970 2.4 0.7 0.1 2.4
1970-2010 1.8 -0.4 0.0 1.8
mean.median
1947-1970 -0.4
1970-2010 1.3
[1] 3 4 5 6 7 8 9 10 11 12 13 14 21
'data.frame': 858 obs. of 3 variables:
$ Year : num 1947 1948 1949 1950 1951 ...
$ pctile : chr "quintile1" "quintile1" "quintile1" "quintile1" ...
$ quintile1: num 14243 13779 13007 13829 15070 ...
- attr(*, "reshapeLong")=List of 4
..$ varying:List of 1
.. ..$ : chr "quintile1" "quintile2" "median" "quintile3" ...
..$ v.names: NULL
..$ idvar : chr "Year"
..$ timevar: chr "pctile"
FALSE
858
Analysis of Variance Table
Response: log(value/1000)
Df Sum Sq Mean Sq F value Pr(>F)
Year 1 95.64 95.644 10577.68 < 2.2e-16 ***
variable 12 1456.23 121.353 13420.87 < 2.2e-16 ***
t1970p 1 2.09 2.090 231.10 < 2.2e-16 ***
variable:t1970p 12 11.73 0.978 108.15 < 2.2e-16 ***
Residuals 831 7.51 0.009
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Named num [1:858] 2.64 2.67 2.7 2.72 2.75 ...
- attr(*, "names")= chr [1:858] "1947.quintile1" "1948.quintile1" "1949.quintile1" "1950.quintile1" ...
[1] "mean/median" "20%" "50%" "99.5%" "99.99%"
[6] "realGDP.M"
FALSE TRUE
23 43
[1] 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984
[16] 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999
[31] 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
[1] 2016
[1] 871.2137 922.2424 848.5531 945.8757 971.7336 880.5765 807.2338
[8] 926.6332 1040.6987 1033.6739 956.3776 948.8742 1083.6977 1032.1129
[15] 1177.9390 1063.9234 1071.9227 1180.6443 1290.6369 1283.6370 1449.8320
[22] 1634.0489 1479.0661 1149.7130 1234.6856 1347.6211 1249.7151 1205.5715
[29] 1046.2963 1085.2276 1138.7454 1141.4491 1455.8189 1424.5070 1455.4575
[36] 1635.7588 1777.3875 1952.8700 2163.5463 3007.4595 1945.9916 2691.9634
[43] 2404.1387 2311.9702 2042.5027 2348.6435 2212.3710 2261.3654 2518.4364
[50] 2932.0779 3443.3017 3949.2213 4393.0757 4906.7193 3653.5077 3130.6405
[57] 3240.6360 4003.5007 4779.4529 5095.1143 5375.0219 4197.7265 3158.4421
[64] 3592.8788 3584.4502 4421.4100
png
2
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.