cces_join_slim: Build CCES Data by Joining Component Pieces

View source: R/cces_join.R

cces_join_slim    R Documentation

Build CCES Data by Joining Component Pieces

Description

Produces a person-level dataset by joining the question-level outcome data to respondent covariates and district-level predictors. Only the variables needed for the model are kept, hence the name "slim". The output is therefore model dependent.
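As a rough illustration of what "slim" means here, the sketch below lists the variables a formula refers to and combines them with any extra variables to keep. It uses only base R (as.formula, all.vars, union) and is an assumption about the idea, not the package's internal code; the object names model_vars and cell_vars are hypothetical.

```r
# Illustration only: base R, not the internals of cces_join_slim
form <- as.formula("response ~ age + educ + (1|cd)")

# all.vars() returns every variable name used in the formula,
# including those inside random-effect terms such as (1|cd)
model_vars <- all.vars(form)

# Extra variables requested by the user, as in the keep_vars argument
keep_vars <- c("st", "cd")

# Columns a "slim" dataset would need under this assumption
cell_vars <- union(model_vars, keep_vars)
print(cell_vars)
```

Because the retained columns follow from the formula, a different formula would yield a different set of columns.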

Usage

cces_join_slim(
  ccq_df,
  ccc_df,
  cd_df,
  formula,
  coerce_to_char = TRUE,
  keep_vars = NULL,
  subset_dist = NA
)

Arguments

ccq_df

dataframe of outcomes, taken from get_cces_question. The outcome column is currently assumed to be named "response", although this can be modified with the y_named_as argument.

ccc_df

dataframe of covariates, currently taken only from the cumulative common content. It should have been passed through ccc_std_demographics to be compatible with ACS.

cd_df

dataframe of district-level predictors; see cd_info_2018 for a sample. Currently, this is joined to the rest of the data on the columns "cd" and "year".

formula

the model formula used to fit the multilevel regression model. It should be of the form y ~ x1 + x2 + (1|x3), where y is a binary variable. Only categorical variables should be used in the random effects notation.

coerce_to_char

Whether to coerce the case identifier to character class so that the join keys have matching types.

keep_vars

Variables that will be kept as cell variables, regardless of whether they are specified in the formula. Input as a character vector.

subset_dist

a character value giving the geography, as coded in cd, to subset to
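The joins described in the arguments above can be sketched with toy data and base R merge. Everything here is made up for illustration: the data frames ccq_toy, ccc_toy, and cd_toy, their values, and the district-level column dist_x are hypothetical, and the real function may perform the joins differently. The sketch assumes the outcome and covariate tables share a case_id column, and that district predictors are matched on "cd" and "year" as stated above.

```r
# Toy inputs (made-up values), standing in for ccq_df, ccc_df, and cd_df
ccq_toy <- data.frame(case_id = c(101, 102, 103),
                      response = c("Support", "Oppose", "Support"),
                      stringsAsFactors = FALSE)
ccc_toy <- data.frame(year = 2018,
                      case_id = c("101", "102", "103"),
                      age = c(36, 50, 62),
                      cd = c("VA-11", "IL-03", "OK-02"),
                      stringsAsFactors = FALSE)
cd_toy <- data.frame(year = 2018,
                     cd = c("VA-11", "IL-03", "OK-02"),
                     dist_x = c(0.1, 0.2, 0.3),  # hypothetical predictor
                     stringsAsFactors = FALSE)

# Mirror the idea of coerce_to_char: give both case_id columns the same type
ccq_toy$case_id <- as.character(ccq_toy$case_id)
ccc_toy$case_id <- as.character(ccc_toy$case_id)

# Outcomes + covariates by respondent, then district predictors by cd and year
person <- merge(ccq_toy, ccc_toy, by = "case_id")
joined <- merge(person, cd_toy, by = c("cd", "year"))

# Keep only the columns a formula such as response ~ age + (1|cd) would need
slim <- joined[, c("year", "case_id", "response", "age", "cd")]
print(slim)
```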

Examples

 ## Not run: 
  # need data/input/cces/cces_2018.rds to run this
  library(dplyr)  # for filter()
  ccq_tcja <- get_cces_question("CC18_326", "2018", "TCJA")

  cces_join_slim(ccq_df = ccq_tcja,
                 ccc_df = filter(ccc_samp, year == 2018),
                 cd_df = cd_info_2018,
                 formula = "response ~ age + educ + (1|cd)")

  # alternative: st and cd are not in the formula, but are kept as cell variables via keep_vars
  cces_join_slim(ccq_df = ccq_tcja,
                 ccc_df = filter(ccc_samp, year == 2018),
                 cd_df = cd_info_2018,
                 keep_vars = c("st", "cd"),
                 formula = "response ~ age + educ")

# A tibble: 133 x 7
#        year case_id   qID   response   age educ             cd
#       <dbl> <chr>     <chr> <fct>    <dbl> <dbl+lbl>        <chr>
#     1  2018 409942960 TCJA  Support     36 4 [2-Year]       VA-11
#     2  2018 410934028 TCJA  Support     34 5 [4-Year]       UT-04
#     3  2018 410946304 TCJA  Support     62 5 [4-Year]       OK-02
#     4  2018 411717742 TCJA  Oppose      50 1 [No HS]        IL-03
#     5  2018 412022838 TCJA  Oppose      66 5 [4-Year]       OH-08
#     6  2018 412123052 TCJA  Oppose      35 3 [Some College] WA-04
#     7  2018 412161131 TCJA  Support     32 3 [Some College] IN-03
#     8  2018 412260240 TCJA  Support     66 5 [4-Year]       VA-01
#     9  2018 412274191 TCJA  Support     75 6 [Post-Grad]    OR-02
 
## End(Not run)


kuriwaki/ccesMRPprep documentation built on Oct. 26, 2024, 10:22 p.m.