View source: R/cces_std-for-acs.R
ccc_std_demographics    R Documentation
Recode CCES variables so that they merge to ACS variables
ccc_std_demographics(
  tbl,
  only_demog = FALSE,
  age_key = deframe(ccesMRPprep::age5_key),
  wh_as_hisp = TRUE,
  bh_as_hisp = TRUE
)
tbl
The cumulative common content. It can be any subset but must include variables
only_demog
Drop variables besides demographics? Defaults to FALSE.
age_key
The vector key to use to bin age. Can be
wh_as_hisp
Should people who identify as both White and Hispanic be coded as "Hispanic", thereby leaving all remaining "Whites" as Non-Hispanic Whites by definition? Could be
bh_as_hisp
Same as
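The effect of the two Hispanic flags can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper recode_hisp, the toy column names race and hispanic, and the string values are hypothetical, and it assumes bh_as_hisp is the analogue of wh_as_hisp for respondents who identify as both Black and Hispanic.

```r
# Toy data; the columns `race` and `hispanic` and their values are
# assumptions for illustration, not the actual CCES coding.
toy <- data.frame(
  race     = c("White", "White", "Black", "Black", "Asian"),
  hispanic = c("Yes", "No", "Yes", "No", "No"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of ccesMRPprep): recode respondents who
# report a race plus Hispanic ethnicity as "Hispanic", depending on the flags.
recode_hisp <- function(race, hispanic, wh_as_hisp = TRUE, bh_as_hisp = TRUE) {
  is_hisp <- hispanic == "Yes"
  out <- race
  if (wh_as_hisp) out[race == "White" & is_hisp] <- "Hispanic"
  if (bh_as_hisp) out[race == "Black" & is_hisp] <- "Hispanic"
  out
}

# Both flags on: remaining "White" and "Black" are non-Hispanic by construction
recode_hisp(toy$race, toy$hispanic)

# White Hispanics stay "White"; Black Hispanics are still recoded
recode_hisp(toy$race, toy$hispanic, wh_as_hisp = FALSE)
```

With both flags set to FALSE the race vector is returned unchanged in this sketch.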
The output is of the same dimensions as the input (unless only_demog = TRUE), but with the following exceptions:
age is coded to match up with the ACS bins; the recoding occurs in a separate function, ccc_bin_age. The unbinned age is kept in age_orig.
educ is coarsened and relabelled with 4 categories to match up with the ACS (the original version is kept as educ_cces_chr). Recoding is governed by the key-value pairs educ_key.
educ_3 is further coarsened to 3 categories, grouping a BA and a higher degree into one category. This is necessary for some ACS tables that do not make the distinction. Decide which education variable to use beforehand, after looking at the ACS codes.
The same goes for race. These recodings are governed by the key-value pairs race_key.
cd is standardized so that at-large districts are given "01" and single-digit districts are padded with 0s, e.g. "WY-01" and "CA-02".
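The age and cd recodings described above can be sketched in base R. This is only an illustration under stated assumptions: bin_age and format_cd are hypothetical helpers, not ccc_bin_age or any other package function; the bin edges and labels are made up for the example and are not the contents of age5_key; and treating a district number of 0 or NA as at-large is an assumption.

```r
# Hypothetical helper: bin ages into illustrative brackets.
# The breaks and labels are assumptions, NOT the contents of age5_key.
bin_age <- function(age) {
  cut(
    age,
    breaks = c(18, 25, 35, 45, 65, Inf),
    labels = c("18 to 24", "25 to 34", "35 to 44", "45 to 64", "65 and over"),
    right = FALSE  # intervals are [18, 25), [25, 35), ...
  )
}

# Hypothetical helper: build a "ST-DD" district string from a state
# abbreviation and a numeric district number.
format_cd <- function(st, dist) {
  dist <- as.integer(dist)
  # Assumption: at-large districts arrive as 0 or NA; both become 1
  dist[is.na(dist) | dist == 0L] <- 1L
  sprintf("%s-%02d", st, dist)  # zero-pad single-digit districts
}

as.character(bin_age(c(18, 24, 25, 44, 65, 90)))
format_cd(c("WY", "CA", "TX"), c(0, 2, 12))
# "WY-01" "CA-02" "TX-12"
```

Keeping the original value alongside the binned one, as the function does with age_orig, makes it possible to re-bin later with a different key.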
This function requires data to have the following columns:
A string column called st that is a two-letter abbreviation of the state, or a labelled variable coercible to a string.
A string column called cd that has the congressional district in the form "WY-01", OR a numeric column called dist that has the numeric district number. cd_up can also be used for the district in the upcoming election.
<numeric+labelled> columns called educ for education, race for race, age for age, and gender for gender, with values following the cumulative content.
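Before calling the function, it may help to confirm that the required columns are present. The check below is a minimal sketch, not something the package provides: missing_required_cols is a hypothetical helper, it only looks at column names (not at types or labels), and it does not consider cd_up.

```r
# Hypothetical helper (not part of ccesMRPprep): report which of the
# required columns are absent from a data frame.
missing_required_cols <- function(tbl) {
  needed <- c("st", "educ", "race", "age", "gender")
  missing <- setdiff(needed, names(tbl))
  # Either `cd` (e.g. "WY-01") or a numeric `dist` satisfies the district requirement
  if (!any(c("cd", "dist") %in% names(tbl))) {
    missing <- c(missing, "cd or dist")
  }
  missing
}

# Toy input lacking `gender`; the values are placeholders
toy <- data.frame(st = "WY", dist = 1, educ = 2, race = 1, age = 40)
missing_required_cols(toy)
# "gender"
```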
library(dplyr)
ccc_std_demographics(ccc_samp)
ccc_std_demographics(ccc_samp, wh_as_hisp = FALSE) %>% count(race)
ccc_std_demographics(ccc_samp, bh_as_hisp = FALSE, wh_as_hisp = FALSE) %>% count(race)
## Not run:
# For full data (takes a while)
library(dataverse)
cumulative_rds <- get_cces_dataverse("cumulative")
cumulative_std <- ccc_std_demographics(cumulative_rds)
## End(Not run)
## Not run:
# str_replace_all() comes from stringr, which is not attached above
wrong_cd_fmt <- mutate(ccc_samp, cd = stringr::str_replace_all(cd, "01", "1"))
wrong_cd_fmt %>% filter(st == "HI") %>% count(cd)
# throws error because CD is formatted the wrong way
ccc_std_demographics(wrong_cd_fmt)
## End(Not run)