knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

library(leafpeepr)
acs_nh <- dplyr::as_tibble(leafpeepr::acs_nh)

options(tibble.print_min = 5, tibble.print_max = 5)

leafpeepr

Lifecycle: experimental · License: MIT

"I'm sorry, leaf peeping? Is that something we do now?" -- President Jed Bartlet, The West Wing

leafpeepr prepares data for weighting with raking packages like autumn. It creates weighting targets from census microdata. It can also recode values and collapse values into an "other" category, in both census data and survey data.

Installation

# install.packages("remotes")
remotes::install_github("rossellhayes/leafpeepr")

Usage

Let's get an example dataset of census microdata ready to be used as targets for raking.

library(leafpeepr)
acs_nh

Recoding

leaf_recode() recodes columns using a data frame as a map.

acs_sex_codes

leaf_recode(acs_nh, acs_sex_codes)
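To make the idea of "a data frame as a map" concrete, here is a minimal base R sketch of lookup-table recoding. It is not leafpeepr's implementation. The `survey` and `sex_map` data frames and their column names are hypothetical and do not reflect the contents of `acs_sex_codes`.

```r
# Hypothetical raw data with numeric codes
survey <- data.frame(id = 1:4, sex = c(1, 2, 2, 1))

# Hypothetical recoding map: one row per raw code
sex_map <- data.frame(
  code = c(1, 2),
  label = c("Male", "Female"),
  stringsAsFactors = FALSE
)

# match() finds the map row for each raw value;
# values missing from the map would become NA
survey$sex <- sex_map$label[match(survey$sex, sex_map$code)]

survey
```

Keeping the map in a data frame means the codebook can be stored, inspected, and reused separately from the recoding step.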

The recoding map can also use formulas.

acs_bpl_codes

leaf_recode(acs_nh, acs_bpl_codes)

Or a combination of values and formulas.

acs_educ_codes

leaf_recode(acs_nh, acs_educ_codes)

You can recode multiple columns at once using wide or long data frames.

acs_codes
leaf_recode(acs_nh, acs_codes)

acs_codes_long
leaf_recode(acs_nh, acs_codes_long)

Creating interaction variables

leaf_interact() creates an interaction between two variables.

acs_nh_recoded <- leaf_recode(acs_nh, acs_codes) %>% 
  janitor::clean_names() # Make column names nicer to look at

leaf_interact(acs_nh_recoded, race, hispan)
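Conceptually, an interaction variable combines the levels of two columns into a single column. The base R sketch below illustrates that idea only. The `people` data, the `_` separator, and the `race_hispan` column name are assumptions and may differ from what leaf_interact() produces.

```r
# Hypothetical recoded data
people <- data.frame(
  race = c("White", "Black", "White"),
  hispan = c("Not Hispanic", "Not Hispanic", "Hispanic"),
  stringsAsFactors = FALSE
)

# Combine the two columns row by row into one interaction column
# (separator and column name are illustrative choices)
people$race_hispan <- paste(people$race, people$hispan, sep = "_")

people
```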

leaf_interactions() creates multiple interactions at once using a list.

leaf_interactions(acs_nh_recoded, c("race", "educ"), c("sex", "age"))

leaf_interact_all() creates interactions between one variable and all other variables.

leaf_interact_all(acs_nh_recoded, sex, except = perwt)
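The "one variable against all others, except some" pattern can be sketched in base R as a loop over the remaining columns. This is an illustration under assumed names (`people`, `focal`, `skip`) and an assumed `_` separator, not a description of how leaf_interact_all() works internally.

```r
# Hypothetical recoded data with a weight column
people <- data.frame(
  sex = c("Male", "Female"),
  age = c("18-34", "35+"),
  educ = c("College", "No college"),
  perwt = c(120, 80),
  stringsAsFactors = FALSE
)

focal <- "sex"   # variable to interact with every other column
skip <- "perwt"  # columns to leave out, such as the weights

# Fix the list of partner columns before adding any new ones
others <- setdiff(names(people), c(focal, skip))

for (other in others) {
  new_name <- paste(focal, other, sep = "_")
  people[[new_name]] <- paste(people[[focal]], people[[other]], sep = "_")
}

names(people)
```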

Generating a target data frame

Once our data is recoded, leaf_peep() prepares it to be used as weighting targets in autumn::harvest().

acs_nh_interacted <- leaf_interactions(
  acs_nh_recoded, c("race", "educ"), c("sex", "age")
)

leaf_peep(acs_nh_interacted, weight_col = perwt)
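A weighting target is, in essence, the weighted share of the population in each level of a variable. The base R sketch below computes such shares for one variable. The `people` data and the `variable` and `level` column names are assumptions. Only the `proportion` column name is taken from the examples in this README, and the real output of leaf_peep() may be structured differently.

```r
# Hypothetical microdata with person weights
people <- data.frame(
  sex = c("Male", "Female", "Female", "Male"),
  perwt = c(10, 30, 30, 30),
  stringsAsFactors = FALSE
)

# Sum the weights within each level of the variable
totals <- tapply(people$perwt, people$sex, sum)

# Divide by the total weight to get each level's target proportion
targets <- data.frame(
  variable = "sex",
  level = names(totals),
  proportion = as.numeric(totals) / sum(people$perwt),
  stringsAsFactors = FALSE
)

targets
```

Using the weight column rather than raw row counts matters because each microdata record stands for a different number of people.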

Collapsing categories

options(tibble.print_min = 10, tibble.print_max = 10)

leaf_other() recategorizes levels into an "other" category if their proportion is below a given cutoff.

acs_nh_targets <- leaf_peep(acs_nh_interacted, weight_col = perwt)

dplyr::arrange(acs_nh_targets, proportion)

leaf_other(acs_nh_targets, 0.01) %>% 
  dplyr::arrange(proportion)

If the "other" category would itself fall below the cutoff proportion, the next-smallest level is added to it. To avoid this, set inclusive = FALSE.

leaf_other(acs_nh_targets, 0.01, inclusive = FALSE) %>% 
  dplyr::arrange(proportion)
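The cutoff and inclusive behaviour described above can be sketched in base R for a single variable. `collapse_small()` is a hypothetical helper written for this illustration, not part of leafpeepr. It assumes a data frame with `level` and `proportion` columns and adds the next-smallest level only once, following the wording above.

```r
collapse_small <- function(targets, cutoff, inclusive = TRUE) {
  targets <- targets[order(targets$proportion), ]
  small <- targets$proportion < cutoff

  # Inclusive mode: if the pooled share is still under the cutoff,
  # pull in the next-smallest level as well
  if (inclusive && any(small) && !all(small) &&
      sum(targets$proportion[small]) < cutoff) {
    small[which(!small)[1]] <- TRUE
  }

  # Nothing to collapse
  if (!any(small)) return(targets)

  other <- data.frame(
    level = "Other",
    proportion = sum(targets$proportion[small]),
    stringsAsFactors = FALSE
  )
  out <- rbind(targets[!small, ], other)
  out[order(out$proportion), ]
}

# Hypothetical targets for one variable
targets <- data.frame(
  level = c("A", "B", "C", "D"),
  proportion = c(0.004, 0.003, 0.093, 0.9),
  stringsAsFactors = FALSE
)

# A and B pool to 0.007, still under 0.01, so C joins them
inclusive_result <- collapse_small(targets, 0.01)

# With inclusive = FALSE, only A and B are pooled
exclusive_result <- collapse_small(targets, 0.01, inclusive = FALSE)
```

In this toy data the inclusive result has two rows ("Other" and "D"), while the exclusive result keeps "C" as its own level.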

Credits

Hex sticker font is Source Code Pro by Adobe.

Image adapted from publicdomainvectors.org and Twemoji by Twitter.


Please note that leafpeepr is released with a Contributor Code of Conduct.



rossellhayes/leafpeepr documentation built on Feb. 29, 2020, 12:48 a.m.