knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 7, warning = FALSE, message = FALSE )
The bivariateLeaflet package provides tools for creating interactive bivariate choropleth maps using Leaflet. Bivariate choropleth maps allow you to visualize the relationship between two variables simultaneously across geographic regions.
The bivariateLeaflet package addresses a need in geospatial data visualization by providing a tool to create interactive bivariate choropleth maps using Leaflet. These maps, which enable the simultaneous visualization of relationships between two variables across geographic regions, are powerful tools for analyzing complex datasets. Despite their potential, creating bivariate maps has historically been a challenging task, requiring both technical expertise and substantial time investment.
This package simplifies the process, making bivariate mapping more accessible to users of R. It is particularly valuable for researchers and analysts working in fields such as demography, justice, environmental studies, and public health. For instance, bivariateLeaflet allows users to explore the relationship between income and education, analyze correlations between temperature and precipitation, or visualize connections between healthcare access and health outcomes. By integrating seamlessly with the sf spatial data format and offering an intuitive interface, the package enables users to generate interactive maps that are both visually compelling and informative.
The package’s functionality includes the ability to create bivariate maps with customizable color schemes, handle data challenges such as missing values and outliers, and provide clear, interpretable legends for effective communication. Through its integration with Leaflet, it ensures that the resulting visualizations are interactive and web-ready, enabling users to share insights widely.
You can install the package from CRAN:
install.packages("bivariateLeaflet")
Or install the development version from GitHub:
# install.packages("devtools")
devtools::install_github("maduprey/bivariateLeaflet")
The package works with spatial data from various sources. We'll demonstrate using census data at different geographic levels.
library(tidycensus)
library(tidyr)
library(dplyr)
library(sf)

# Get census API key if you haven't already
# census_api_key("YOUR_KEY_HERE")

# Get ACS data for DC census tracts
tract_data <- get_acs(
  geography = "tract",
  variables = c(
    "B01003_001", # Total population
    "B19013_001"  # Median household income
  ),
  state = "DC",
  year = 2020,
  geometry = TRUE
)

# Pivot data to wide format
tract_data_wide <- tract_data %>%
  select(-moe) %>%
  pivot_wider(
    names_from = variable,
    values_from = estimate
  )

# Get county-level data for entire US
county_data <- get_acs(
  geography = "county",
  variables = c(
    "B01003_001", # Total population
    "B19013_001"  # Median household income
  ),
  year = 2020,
  geometry = TRUE
)

county_data_wide <- county_data %>%
  select(-moe) %>%
  pivot_wider(
    names_from = variable,
    values_from = estimate
  )

# Get block group data for DC
blockgroup_data <- get_acs(
  geography = "block group",
  variables = c(
    "B01003_001", # Total population
    "B19013_001"  # Median household income
  ),
  state = "DC",
  year = 2020,
  geometry = TRUE
)

blockgroup_data_wide <- blockgroup_data %>%
  select(-moe) %>%
  pivot_wider(
    names_from = variable,
    values_from = estimate
  )
Let's start with creating a basic bivariate choropleth map using the DC census tract data:
library(bivariateLeaflet)
# Create basic map
tract_map <- create_bivariate_map(
  data = tract_data_wide,
  var_1 = "B01003_001", # Total population
  var_2 = "B19013_001"  # Median household income
)

# Display the map
tract_map
The default color scheme uses a 3x3 matrix where:
# Display the default color matrix
create_default_color_matrix()

sequential_colors <- matrix(c(
  "#49006a", "#2d004b", "#1a0027",
  "#8c96c6", "#8856a7", "#810f7c",
  "#edf8fb", "#bfd3e6", "#9ebcda"
), nrow = 3, byrow = TRUE)

sequential_map <- create_bivariate_map(
  data = tract_data_wide,
  var_1 = "B01003_001",
  var_2 = "B19013_001",
  color_matrix = sequential_colors
)

sequential_map
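To make the role of the 3x3 matrix more concrete, here is a minimal base-R sketch of how a pair of tertile classes could be used to look up a color. It is an illustration only, not the package's internal code: the helper `tertile_class()` and the toy values are hypothetical, and the orientation (rows follow the second variable with high values at the top, columns follow the first) is an assumption.

```r
# Illustrative sketch; not the internals of bivariateLeaflet.
colors <- matrix(c(
  "#49006a", "#2d004b", "#1a0027",
  "#8c96c6", "#8856a7", "#810f7c",
  "#edf8fb", "#bfd3e6", "#9ebcda"
), nrow = 3, byrow = TRUE)

# Hypothetical helper: assign each value to tertile class 1, 2, or 3
tertile_class <- function(x) {
  breaks <- quantile(x, probs = c(0, 1/3, 2/3, 1), na.rm = TRUE)
  as.integer(cut(x, breaks = breaks, include.lowest = TRUE))
}

# Toy values standing in for population and median income
pop    <- c(1200, 3400, 5600, 2100, 4800, 900)
income <- c(45000, 82000, 120000, 61000, 99000, 38000)

class_1 <- tertile_class(pop)
class_2 <- tertile_class(income)

# Assumed orientation: row = 4 - class of var_2, column = class of var_1
fill <- colors[cbind(4 - class_2, class_1)]
fill
```

Under this assumed orientation, a region that is low on both variables picks up the bottom-left cell of the matrix and a region that is high on both picks up the top-right cell.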
Census data often includes missing values (NA) for various reasons. Here's how to handle them:
# Identify tracts with missing data
missing_data <- tract_data_wide %>%
  mutate(
    missing_pop = is.na(B01003_001),
    missing_income = is.na(B19013_001)
  )

# Create a map excluding missing data
clean_map <- create_bivariate_map(
  data = missing_data %>%
    filter(!missing_pop & !missing_income),
  var_1 = "B01003_001",
  var_2 = "B19013_001"
)

clean_map

# Create a map of US counties
county_map <- create_bivariate_map(
  data = county_data_wide,
  var_1 = "B01003_001",
  var_2 = "B19013_001"
)

county_map

# Create a detailed block group map
blockgroup_map <- create_bivariate_map(
  data = blockgroup_data_wide,
  var_1 = "B01003_001",
  var_2 = "B19013_001"
)

blockgroup_map
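Before dropping rows, it can be useful to see how many regions would be lost. The following base-R sketch uses a small made-up data frame (the values and the `GEOID` labels are invented for illustration) to count missing values per variable and keep only complete rows.

```r
# Toy data standing in for a wide ACS table; values are made up
toy <- data.frame(
  GEOID = c("A", "B", "C", "D"),
  B01003_001 = c(1200, NA, 3400, 0),
  B19013_001 = c(45000, 61000, NA, NA)
)

vars <- c("B01003_001", "B19013_001")

# Number of missing values in each mapped variable
na_counts <- colSums(is.na(toy[vars]))

# Keep only rows where both variables are present
keep <- complete.cases(toy[vars])
toy_clean <- toy[keep, ]

na_counts
nrow(toy_clean)
```

If a large share of regions disappears at this step, the resulting map may give a misleading picture, so it is worth reporting the counts alongside the map.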
You can create custom tooltips for your map by providing your own labels:
# Create custom labels with tract names and formatted values
custom_labels <- sprintf(
  "<strong>Census Tract:</strong> %s<br/>
  <strong>Population:</strong> %s<br/>
  <strong>Median Income:</strong> $%s",
  tract_data_wide$NAME,
  format(tract_data_wide$B01003_001, big.mark = ","),
  format(tract_data_wide$B19013_001, big.mark = ",")
)

# Create map with custom labels
map_custom_labels <- create_bivariate_map(
  data = tract_data_wide,
  var_1 = "B01003_001",
  var_2 = "B19013_001",
  custom_labels = custom_labels
)

map_custom_labels
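This will create tooltips that show the census tract name, the total population, and the median household income, with thousands separators in both numbers.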
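One detail to watch when building labels: `format()` applied to a whole vector pads values to a common width and renders missing values as the string "NA". A small sketch of one way to tidy this up is shown here; the helper `fmt_num()` and the sample values are hypothetical and not part of the package.

```r
# Hypothetical helper: thousands separators, no padding, readable NAs
fmt_num <- function(x, na_text = "No data") {
  out <- format(x, big.mark = ",", trim = TRUE, scientific = FALSE)
  ifelse(is.na(x), na_text, out)
}

# Made-up values for illustration
tract_name <- c("Tract 1", "Tract 2", "Tract 3")
population <- c(1200, NA, 34000)
income     <- c(45000, 61000, NA)

labels <- sprintf(
  "<strong>Census Tract:</strong> %s<br/><strong>Population:</strong> %s<br/><strong>Median Income:</strong> %s",
  tract_name,
  fmt_num(population),
  ifelse(is.na(income), "No data", paste0("$", fmt_num(income)))
)

labels
```

The resulting character vector could be supplied through the `custom_labels` argument in the same way as the labels built with `sprintf()` earlier.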
When creating bivariate choropleth maps:
Think about the story you want to tell
Data Distribution
Be aware of outliers that might affect the visualization
Color Schemes
Consider your audience when choosing colors
Scale and Geography
Be consistent with projections
Legend and Labels
The package includes various checks and warnings:
# Missing variable
try(calculate_tertiles(data.frame(x = 1:5), "nonexistent", "also_nonexistent"))

# Too few unique values
test_data <- data.frame(
  var1 = c(1, 1, 1, 2),
  var2 = c(1, 2, 3, 4)
)
calculate_tertiles(test_data, "var1", "var2")
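To see why too few unique values is a problem for tertiles in general, the base-R sketch below computes tertile breaks for the same `var1` values and shows that they collapse onto each other. This only illustrates the underlying issue; it is not the check that the package itself performs.

```r
x <- c(1, 1, 1, 2)

# Tertile break points: minimum, 1/3 quantile, 2/3 quantile, maximum
breaks <- quantile(x, probs = c(0, 1/3, 2/3, 1), names = FALSE)

# With only two distinct values, several breaks coincide
has_unique_breaks <- anyDuplicated(breaks) == 0

# cut() refuses to bin on duplicated breaks, so capture the error message
result <- tryCatch(
  cut(x, breaks = breaks, include.lowest = TRUE),
  error = function(e) conditionMessage(e)
)

breaks
has_unique_breaks
result
```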
# Identify outliers using IQR method
outlier_data <- tract_data_wide %>%
  mutate(
    pop_outlier = B01003_001 > quantile(B01003_001, 0.75, na.rm = TRUE) +
      1.5 * IQR(B01003_001, na.rm = TRUE),
    income_outlier = B19013_001 > quantile(B19013_001, 0.75, na.rm = TRUE) +
      1.5 * IQR(B19013_001, na.rm = TRUE)
  )

# Create map excluding outliers
no_outliers_map <- create_bivariate_map(
  data = outlier_data %>%
    filter(!pop_outlier & !income_outlier),
  var_1 = "B01003_001",
  var_2 = "B19013_001"
)
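The IQR rule as written above flags only values beyond the upper fence. If unusually low values matter as well, a two-sided variant can be sketched in base R as follows. The helper `is_iqr_outlier()` is hypothetical, and treating missing values as "not an outlier" is a design assumption rather than package behavior.

```r
# Hypothetical helper: TRUE for values outside [Q1 - k*IQR, Q3 + k*IQR]
is_iqr_outlier <- function(x, k = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  fence <- k * (q[2] - q[1])
  # Missing values are reported as FALSE so they can be handled separately
  !is.na(x) & (x < q[1] - fence | x > q[2] + fence)
}

# Made-up values with one clear high outlier and one missing value
x <- c(10, 12, 11, 13, 12, 95, NA)
flags <- is_iqr_outlier(x)
flags
```

Because missing values come back as `FALSE` here, filtering on `!flags` keeps them in the data, which separates the outlier decision from the missing-data decision.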
# Reproject data if needed
reprojected_data <- tract_data_wide %>%
  st_transform(4326) # WGS 84

# Create map with reprojected data
projected_map <- create_bivariate_map(
  data = reprojected_data,
  var_1 = "B01003_001",
  var_2 = "B19013_001"
)
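Leaflet expects longitude/latitude coordinates in WGS 84 (EPSG:4326). One way to reproject only when it is actually needed is sketched below with sf. The helper `ensure_wgs84()` is hypothetical and not part of bivariateLeaflet, and the sample point is made up for illustration.

```r
library(sf)

# Hypothetical helper: transform to EPSG:4326 unless the data is already there
ensure_wgs84 <- function(x) {
  crs <- st_crs(x)
  if (is.na(crs)) stop("Data has no CRS; set one with st_set_crs() first.")
  if (!isTRUE(crs$epsg == 4326)) x <- st_transform(x, 4326)
  x
}

# A single point near Washington, DC in Web Mercator (EPSG:3857)
pt <- st_sfc(st_point(c(-8575000, 4707000)), crs = 3857)

pt_wgs84 <- ensure_wgs84(pt)
st_crs(pt_wgs84)$epsg
```

The same helper could be applied to an sf data frame such as `tract_data_wide` before mapping.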
For more information about bivariate choropleth maps:
The development of this package was funded by Grant 2020-R2-CX-0027 from the National Institute of Justice, Office of Justice Programs, U.S. Department of Justice. The opinions, findings, and conclusions or recommendations expressed are those of the authors and do not necessarily reflect those of the U.S. Department of Justice.