knitr::opts_chunk$set(echo = TRUE)
getcdw is a tool to make it easier to query the CADS
data warehouse from R.
You can install getcdw from github:
devtools::install_github("tarakc02/getcdw")
You can type a query:
# dplyr just prints the returned data.frames prettier library(dplyr) library(getcdw) get_cdw("select report_name from cdw.d_entity_mv where entity_id = 640993")
You can also pass a query from a file, for instance: get_cdw("sql/filename.sql")
.
Look for tables with find_tables()
:
# want the committees table, but don't remember the exact name: find_tables("committee") # or view all of the d_bio tables: find_tables("d_bio") # all of the giving summary tables are like sf_something_summary_mv: find_tables("^sf_.+_summary_mv$")
Or look for specific columns, either throughout the database or within a given table:
# what's the name of the household_id column in d_entity? find_columns("household", table_name = "d_entity_mv") # where are all of the places degree information appears? find_columns("degree")
You can search through TMS tables to find codes:
# where/how do we record peace_corps participation? find_codes("peace corps") # interest in neuroscience may present itself in terms of a major/minor, # an explicit interest code, attendance at an event, etc. let's look for # all possibilities: find_codes("neuroscience") # i just want to look for neuro-related event codes find_codes("neuro", "^activity$")
If you use dplyr, you can now query the data warehouse directly, rather than writing SQL and pulling data into a local data frame:
# you have to have dplyr loaded to use dplyr library(dplyr) # use src_oracle("DSNNAME") to connect to dsn DSNNAME # for example: src_oracle("CDW2") or src_oracle("URELUAT") cdw <- src_oracle() # each table you reference is described using the dplyr's tbl() function entity <- tbl(cdw, "cdw.d_entity_mv") transactions <- tbl(cdw, "cdw.f_transaction_detail_mv") # note that auto-complete works inside the dplyr verbs! entity <- entity %>% select(entity_id, person_or_org) entity # the top donors of 2001, along with whether they are people or organizations transactions %>% filter(between(giving_record_dt, to_date('01/01/2001', 'mm/dd/yyyy'), to_date('12/31/2001', 'mm/dd/yyyy'))) %>% filter(pledged_basis_flg == "Y") %>% group_by(donor_entity_id_nbr) %>% summarise(giving = sum(benefit_dept_credited_amt)) %>% inner_join(entity, by = c("donor_entity_id_nbr" = "entity_id")) %>% arrange(desc(giving)) # dplyr will not actually pull the data into memory, which makes things # pretty fast. when you're ready to actually download a data frame, run collect(): transactions %>% filter(between(giving_record_dt, to_date('01/01/2001', 'mm/dd/yyyy'), to_date('12/31/2001', 'mm/dd/yyyy'))) %>% filter(pledged_basis_flg == "Y") %>% group_by(donor_entity_id_nbr) %>% summarise(giving = sum(benefit_dept_credited_amt)) %>% filter(giving > 5000000) %>% collect
The parameterize_template
function allows you to turn templates that have ##highlighted## variables into functions:
f <- parameterize_template("my name is ##name##") f("tarak") # need named arguments when more than 1 parameter: g <-parameterize_template("hi i'm ##name##, i am ##age## years old") g(name = "tarak", age = 36)
parameterize_template
is useful in creating reports because it allows you to write SQL with parameters, with the added benefit of the convenient function interface.
template <- " select donor_entity_id_nbr, sum(benefit_dept_credited_amt) as giving from cdw.f_transaction_detail_mv where pledged_basis_flg = 'Y' and giving_record_dt between to_date('##from_date##', 'mm/dd/yyyy') and to_date('##to_date##', 'mm/dd/yyyy') group by donor_entity_id_nbr having sum(benefit_dept_credited_amt) > 5000000" top_donors_between <- parameterize_template(template) report <- top_donors_between(from_date = "01/01/2001", to_date = "12/31/2001") get_cdw(report)
Just like with get_cdw
, parameterize_template
works with files also, for example parameterize_template('sql/my-template.sql')
.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.