The purpose of cepumd is to make it easier to work with Consumer Expenditure Surveys (CE) Public-Use Microdata (PUMD) when calculating mean, weighted, annual expenditures (henceforth “mean expenditures”). The challenges cepumd seeks to address primarily concern pulling together the data needed for that calculation. Some of the overarching ideas underlying the package are as follows:
Use a Tidyverse framework for most operations and be (hopefully) generally Tidyverse friendly
Balance making the end user’s experience with CE PUMD easier against staying flexible enough for that user to perform any analysis of the data they wish
Help users calculate only mean expenditures on and of the consumer unit (CU), i.e., not income, not assets, not liabilities, not gifts
cepumd seeks to address challenges in three categories: data gathering/organization; managing data inconsistencies; and calculating weighted, annual metrics.
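The main functions involved are ce_hg() and ce_uccs() for working with hierarchical grouping files, ce_prepdata() for preparing the data, and ce_mean() or ce_quantiles() for calculating a mean expenditure or an expenditure quantile.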
Install the production version with install.packages("cepumd")
You can install the development version of cepumd from GitHub, but you’ll first need the devtools package:
# Install devtools first if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools", dependencies = TRUE)
}
devtools::install_github("arcenis-r/cepumd")
The workhorse of cepumd is ce_prepdata(). It merges the household characteristics file (FMLI/FMLD) with the corresponding expenditure tabulation file (MTBI/EXPD) for a specified year, adjusts weights for months-in-scope and the number of collection quarters, and adjusts some cost values by their periodicity factor (some cost categories are represented as annual figures and others as quarterly). With the recent update it requires only the first 3 arguments: the year, the survey type, and one or more valid UCCs. If the other necessary objects are not provided, ce_prepdata() now creates them within the function.
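The merge and weight adjustment described above can be illustrated with a small base R sketch. It is not the package's implementation: the data are made up, the UCC codes are placeholders, the `MO_SCOPE` and `POPWT` columns and the `MO_SCOPE / 12` scaling are assumptions for illustration, and the periodicity-factor step is left out.

```r
# Toy stand-ins for a household characteristics file and an expenditure file.
# All values are made up for illustration.
fmli <- data.frame(
  NEWID    = c(1, 2, 3),
  FINLWT21 = c(1000, 2000, 1500),  # sampling weight
  MO_SCOPE = c(3, 3, 1)            # assumed: months in scope for the year
)

mtbi <- data.frame(
  NEWID = c(1, 1, 2),
  UCC   = c("100210", "100410", "100210"),  # placeholder codes
  COST  = c(40, 10, 30),
  stringsAsFactors = FALSE
)

uccs <- c("100210", "100410")  # hypothetical expenditure category

# Sum costs per consumer unit for the requested UCCs.
expend <- aggregate(COST ~ NEWID, data = mtbi[mtbi$UCC %in% uccs, ], FUN = sum)

# Left join so consumer units with no matching expenditure are kept,
# then record their cost as zero.
prepped <- merge(fmli, expend, by = "NEWID", all.x = TRUE)
prepped$COST[is.na(prepped$COST)] <- 0

# Assumed adjustment: scale each weight by the share of the year in scope.
prepped$POPWT <- prepped$FINLWT21 * prepped$MO_SCOPE / 12

print(prepped)
```

Keeping the zero-expenditure consumer units matters because a mean over all consumer units would otherwise be overstated.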
There are two functions for wrangling hierarchical grouping data into more usable formats:
ce_hg() pulls the requested type of HG file (Interview, Diary, or Integrated) for a specified year.
ce_uccs() filters the HG file for the specified expenditure category and returns either a data frame with only that section of the HG file or the Universal Classification Codes (UCCs) that make up that expenditure category.
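To make the idea of filtering a hierarchical grouping concrete, here is a minimal base R sketch on a toy table. It does not call cepumd; the column names (`level`, `title`, `ucc`), the titles, the codes, and the helper functions `category_rows()` and `category_uccs()` are all hypothetical, and the assumption is that a category's section runs until the next row at the same or a higher level of the hierarchy.

```r
# Toy hierarchical grouping table; every value here is made up.
hg <- data.frame(
  level = c(1, 2, 3, 3, 2, 1),
  title = c("Food", "Food at home", "Cereals", "Bakery products",
            "Food away from home", "Housing"),
  ucc   = c("FOODTO", "FOODHO", "010110", "020110", "190901", "HOUSIN"),
  stringsAsFactors = FALSE
)

# Return the rows belonging to one category (hypothetical helper).
category_rows <- function(hg, category) {
  start <- match(category, hg$title)
  if (is.na(start)) stop("Category not found: ", category)
  # The section ends just before the next row whose level number is
  # less than or equal to the category's own level.
  later <- seq_len(nrow(hg)) > start
  stops <- which(later & hg$level <= hg$level[start])
  end <- if (length(stops) > 0) stops[1] - 1 else nrow(hg)
  hg[start:end, ]
}

# Return only the all-digit codes within a category (hypothetical helper).
category_uccs <- function(hg, category) {
  rows <- category_rows(hg, category)
  rows$ucc[grepl("^[0-9]+$", rows$ucc)]
}

category_uccs(hg, "Food at home")
category_uccs(hg, "Food")
```

In this toy table the aggregate rows carry non-numeric codes, so dropping them leaves only the item-level codes that would be summed.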
There are two functions that the user can use to calculate CE summary statistics:
ce_mean() calculates a mean expenditure, standard error of the mean, coefficient of variation, and an aggregate expenditure.
ce_quantiles() calculates weighted expenditure quantiles. It is important to note that calculating medians for integrated expenditures is not recommended because the calculation involves using weights from both the Diary and Interview survey instruments.
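For readers unfamiliar with weighted statistics, the following base R sketch shows a generic weighted mean and a simple weighted quantile on made-up data. It is not how ce_mean() or ce_quantiles() are implemented: the helper names, the data, and the quantile rule (smallest value whose cumulative weight share reaches the requested probability) are assumptions, and the standard error and coefficient of variation are not covered.

```r
# Made-up per-consumer-unit costs and weights.
cu <- data.frame(
  cost   = c(0, 50, 30, 200),
  weight = c(125, 250, 500, 125)
)

# Generic weighted mean (hypothetical helper).
weighted_mean_expenditure <- function(cost, wt) {
  sum(cost * wt) / sum(wt)
}

# Simple weighted quantile (hypothetical helper): sort by value and return
# the first value whose cumulative weight share is at least p.
weighted_quantile <- function(x, wt, probs) {
  ord <- order(x)
  x <- x[ord]
  wt <- wt[ord]
  cum_share <- cumsum(wt) / sum(wt)
  vapply(probs, function(p) x[which(cum_share >= p)[1]], numeric(1))
}

weighted_mean_expenditure(cu$cost, cu$weight)        # 52.5
weighted_quantile(cu$cost, cu$weight, c(0.5, 0.9))   # 30 200
```

The gap between the weighted mean (52.5) and the weighted median (30) in this toy data shows why the two statistics answer different questions about skewed expenditures.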