In some situations, you may want to use `encodefrom()` to collapse values, that is, group unique raw values into a smaller set of clean values / labels. For example, say you have the following data set, which gives each state's census division number and name:

|id|state|cendiv|cendiv_name|
|:-|:---:|:----:|:----------|
|1|AL|6|East South Central|
|2|AK|9|Pacific|
|3|AZ|8|Mountain|
|4|AR|7|West South Central|
|5|CA|9|Pacific|
|6|CO|8|Mountain|
|7|CT|1|New England|
|8|DE|5|South Atlantic|
|10|FL|5|South Atlantic|
|12|HI|9|Pacific|
|14|IL|3|East North Central|
|15|IN|3|East North Central|
|16|IA|4|West North Central|
|31|NJ|2|Middle Atlantic|
|33|NY|2|Middle Atlantic|

Rather than using the nine census divisions, you want to group states by census region. You have the following crosswalk:

|cendiv|cenreg|cenregnm|
|:----:|:----:|:-------|
|1|1|Northeast|
|2|1|Northeast|
|3|2|Midwest|
|4|2|Midwest|
|5|3|South|
|6|3|South|
|7|3|South|
|8|4|West|
|9|4|West|

As long as (1) `raw` values are unique in the crosswalk and (2) `clean` and `label` columns have a 1:1 match, you can use `encodefrom()` to collapse categories as you move from raw to clean values.
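
Both conditions can be checked before encoding. The following is a minimal base R sketch and is not part of crosswalkr: the helper name `check_crosswalk()` is hypothetical, and the crosswalk is rebuilt here as a plain data frame (`cw_check`, also a made-up name) so the snippet runs on its own.

```r
## rebuild the crosswalk from the table above as a plain data frame
cw_check <- data.frame(cendiv = 1:9,
                       cenreg = c(1, 1, 2, 2, 3, 3, 3, 4, 4),
                       cenregnm = c('Northeast', 'Northeast', 'Midwest',
                                    'Midwest', 'South', 'South', 'South',
                                    'West', 'West'))

## hypothetical helper: TRUE only if the crosswalk meets both conditions
check_crosswalk <- function(cw, raw, clean, label) {
    ## condition 1: each raw value appears only once
    raw_unique <- anyDuplicated(cw[[raw]]) == 0
    ## condition 2: each clean value pairs with exactly one label,
    ## and each label with exactly one clean value
    pairs <- unique(cw[, c(clean, label)])
    one_to_one <- anyDuplicated(pairs[[clean]]) == 0 &&
        anyDuplicated(pairs[[label]]) == 0
    raw_unique && one_to_one
}

check_crosswalk(cw_check, 'cendiv', 'cenreg', 'cenregnm')
```

For the crosswalk above, the check passes because each of the nine division codes appears once and each region code pairs with a single region name.
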

    library(crosswalkr)
    library(dplyr)
    library(haven)

    ## data
    df <- tibble(id = c(1:8,10,12,14:16,31,33),
                 state = c('AL','AK','AZ','AR','CA','CO','CT','DE','FL','HI',
                           'IL','IN','IA','NJ','NY'),
                 cendiv = c(6,9,8,7,9,8,1,5,5,9,3,3,4,2,2),
                 cendiv_name = c('East South Central','Pacific','Mountain',
                                 'West South Central','Pacific','Mountain',
                                 'New England','South Atlantic','South Atlantic',
                                 'Pacific','East North Central',
                                 'East North Central','West North Central',
                                 'Middle Atlantic','Middle Atlantic'))

    ## crosswalk
    cw <- tibble(cendiv = 1:9,
                 cenreg = c(1,1,2,2,3,3,3,4,4),
                 cenregnm = c('Northeast','Northeast','Midwest','Midwest',
                              'South','South','South','West','West'))

    ## encode new column
    df <- df %>%
        mutate(cenreg = encodefrom(., var = cendiv, cw_file = cw, raw = cendiv,
                                   clean = cenreg, label = cenregnm))

    df
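
To make the collapse concrete, the same many-to-one mapping can be sketched in base R with `match()`. This only illustrates the idea. It is not how `encodefrom()` is implemented, it returns a plain factor rather than a labelled vector, and the object names (`cw_demo`, `cendiv_raw`, and so on) are made up for the example.

```r
## crosswalk from the table above, as a plain data frame
cw_demo <- data.frame(cendiv = 1:9,
                      cenreg = c(1, 1, 2, 2, 3, 3, 3, 4, 4),
                      cenregnm = c('Northeast', 'Northeast', 'Midwest',
                                   'Midwest', 'South', 'South', 'South',
                                   'West', 'West'))

## a few raw division codes (those of AL, AK, CT, and IL in the data above)
cendiv_raw <- c(6, 9, 1, 3)

## find each raw code's row in the crosswalk, then pull the clean value
idx <- match(cendiv_raw, cw_demo$cendiv)
cenreg_clean <- cw_demo$cenreg[idx]

## attach labels using the unique clean/label pairs
lab <- unique(cw_demo[, c('cenreg', 'cenregnm')])
cenreg_fct <- factor(cenreg_clean, levels = lab$cenreg, labels = lab$cenregnm)

cenreg_fct
```

Here divisions 6, 9, 1, and 3 map to regions 3, 4, 1, and 2, so nine possible raw categories collapse into four clean ones.
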