behindbarstools is an R package with the set of data tools used by the UCLA Law COVID-19 Behind Bars Project, a data project that collects and reports facility-level data on COVID-19 in jails, prisons, and other carceral facilities. behindbarstools includes a variety of functions to help pull, clean, wrangle, and visualize our data.
Warning: This package is actively under development.
# Install directly from GitHub
devtools::install_github("uclalawcovid19behindbars/behindbarstools")
The read_scrape_data() function can be used to load our data. behindbarstools also includes functions to more easily load related data from other organizations, including the Vera Institute's Jail Population Data through read_vera_pop() and the Department of Homeland Security's Homeland Infrastructure Foundation-Level Data through read_hifld_data().
library(behindbarstools)

# Pull latest data
latest_scraped <- read_scrape_data()

# Pull historical scraped data for California
scraped_CA <- read_scrape_data(all_dates = TRUE, state = "California")
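Once loaded, the result can be handled like any other data frame. The sketch below uses a small hand-made data frame in place of the real output of read_scrape_data(), so it runs without the package or network access. The column names State and Name are assumptions (only Residents.Confirmed appears in this README), and the values are invented purely for illustration.

```r
# Toy stand-in for the output of read_scrape_data(); values are invented.
# Assumed columns: State and Name (hypothetical), Residents.Confirmed.
toy_scraped <- data.frame(
  State = c("California", "California", "Texas"),
  Name = c("Facility A", "Facility B", "Facility C"),
  Residents.Confirmed = c(120, NA, 45),
  stringsAsFactors = FALSE
)

# Total confirmed resident cases per state.
# Rows with a missing count are dropped by the formula interface.
state_totals <- aggregate(
  Residents.Confirmed ~ State,
  data = toy_scraped,
  FUN = sum
)

print(state_totals)
```

With real data, the same pattern should apply after checking the actual column names with names(latest_scraped).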
The majority of the functions in behindbarstools help standardize our ETL and data cleaning process. This includes functions to help with the following:
clean_fac_col_txt(), clean_facility_name()
coalesce_with_warnings(), group_by_coalesce()
is_valid_state(), is_federal()
ExtractTable(), get_src_by_attr()
See our package documentation for more information and examples for each function.
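To give a feel for the kind of cleaning step these helpers cover, here is a minimal base R sketch of coalescing two vectors while warning about conflicts. It is not the package's implementation: coalesce_sketch() is a hypothetical helper, and the behavior shown (keep the first non-missing value, warn when both values are present and differ) is only an assumption based on the name coalesce_with_warnings(). The package documentation remains the authority on what the real function does.

```r
# Hypothetical helper, NOT behindbarstools::coalesce_with_warnings().
# Takes the first non-missing value element-wise and warns on conflicts.
coalesce_sketch <- function(x, y) {
  # A conflict is a position where both values exist but disagree
  conflict <- !is.na(x) & !is.na(y) & x != y
  if (any(conflict)) {
    warning(
      sum(conflict),
      " position(s) have conflicting non-missing values; keeping the first"
    )
  }
  ifelse(is.na(x), y, x)
}

a <- c(10, NA, 30)
b <- c(10, 20, 35)

# Emits a warning because the third position differs (30 vs 35)
merged <- coalesce_sketch(a, b)
```

Warning instead of failing keeps a cleaning pipeline running while still surfacing rows that need a manual look.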
behindbarstools also includes functions to create data visualizations. This includes a custom ggplot2 theme called theme_behindbars() that incorporates our team's style guide. All plotting functions return ggplot objects, making it easy to customize and add additional layers.
# Plot cumulative COVID-19 cases in the Los Angeles Jails over the past 30 days
plot_fac_trend(
    fac_name = "Los Angeles Jails",
    state = "California",
    metric = "Residents.Confirmed",
    plot_days = 30,
    auto_label = TRUE) +
    theme_behindbars(base_size = 14) +
    ggplot2::ylim(3500, 4000) +
    ggplot2::theme(legend.position = "none")

# Plot the 3 facilities with the largest recent spikes in active COVID-19 cases
plot_recent_fac_increases(
    metric = "Residents.Active",
    plot_days = 60,
    num_fac = 3,
    auto_label = TRUE) +
    theme_behindbars(base_size = 14) +
    ggplot2::theme(
        axis.text.x = ggplot2::element_text(angle = 45, hjust = 1),
        plot.tag.position = c(0.80, 0.05))