knitr::opts_chunk$set( message=FALSE, warning=FALSE, collapse = TRUE, fig.height=6, fig.width=6, comment = "#>" )
The covid19sf_geo dataset provides information about the distribution of Covid19 cases in San Francisco by geospatial location. In addition, the testing locations across San Francisco are available in the covid19sf_test_loc dataset. The following vignette provides examples of geospatial visualization of those datasets. Both datasets are sf objects and contain geometric information (i.e., they are ready to plot on a map).
Note: This is a non-CRAN vignette, and the following libraries are required to build the plots in this document:
require(ggplot2)
require(mapview)
library(leafsync)
require(sf)
require(tmap)
require(RColorBrewer)
The covid19sf_geo dataset provides a snapshot of the distribution of Covid19 cases in San Francisco across different geographic splits of the city. The dataset contains the following fields:
area_type - the geographical split method:
    ZCTA for viewing the data by ZIP code
    Analysis Neighborhood for viewing the data by neighborhood
    Census Tract for viewing the data by census tract, and
    Citywide for total cases in the city
id - the area ID (e.g., ZIP code, neighborhood name, etc.)
count - total number of positive cases in the area
rate - case rate per 10,000 residents
deaths - total number of deaths in the area
acs_population - total number of residents in the area
last_updated - most recent update time of the dataset

While the first three geographical split methods contain geometry components that enable us to plot them as a map, the last is just an aggregated summary of the city's total cases.
library(covid19sf)
data(covid19sf_geo)
class(covid19sf_geo)
head(covid19sf_geo)
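To make the rate field described above concrete, here is a minimal sketch of how a per-10,000 rate relates to count and acs_population. The formula is an assumption based on the field descriptions, and the two areas and their numbers are invented for illustration, not taken from the dataset:

```r
# Toy values for two hypothetical areas (NOT from covid19sf_geo)
area <- data.frame(
  id = c("area_a", "area_b"),
  count = c(250, 1200),
  acs_population = c(20000, 48000)
)

# Assumed definition: positive cases per 10,000 residents
area$rate <- area$count / area$acs_population * 10000

print(area)
```

A rate normalizes for population, so a small area with few cases can still rank above a large area with many cases.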
The most intuitive method for plotting sf objects is with the mapview package, which is a wrapper for the leaflet JavaScript package. The main advantage of the mapview package is that it is interactive and works smoothly with sf objects. The following example demonstrates the use of the mapview function to plot the distribution of confirmed cases in San Francisco by ZIP code:
library(dplyr)

covid19sf_geo %>%
  filter(area_type == "ZCTA") %>%
  mapview(zcol = "count")
You can use the at and col.regions arguments to define the color buckets and the color range, respectively:
covid19sf_geo %>%
  filter(area_type == "ZCTA") %>%
  mapview(zcol = "count",
          at = c(0, 200, 400, 800, 1200, 1600, 2000),
          col.regions = c('#fef0d9', '#b30000'))
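As a rough base R illustration of what the break points and the two-color range imply, the sketch below assigns a few invented counts to the buckets defined by the same break vector and interpolates one shade per bucket. It only approximates the idea: mapview does its own binning and color interpolation, and its handling of interval edges may differ.

```r
# Same break points as passed to the at argument above
at <- c(0, 200, 400, 800, 1200, 1600, 2000)

# Toy case counts (hypothetical, not from the dataset)
counts <- c(150, 450, 900, 1999)

# Index of the bucket each count falls into; intervals are (lower, upper]
bucket <- cut(counts, breaks = at, labels = FALSE)

# One interpolated shade per bucket, from light to dark
shades <- colorRampPalette(c("#fef0d9", "#b30000"))(length(at) - 1)

data.frame(count = counts, bucket = bucket, color = shades[bucket])
```

Seven break points define six buckets, which is why six shades are generated.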
The tmap package provides functions and tools for creating thematic maps. The package supports sf objects and follows the ggplot2 framework. In the example below, we will plot the COVID-19 vaccine data by geography using the covid19sf_vaccine_geo dataset:
data(covid19sf_vaccine_geo)
head(covid19sf_vaccine_geo)
Additional setting: Following a change in the default options of the sf package from version 1.0-1, s2 spherical geometry is used by default when coordinates are ellipsoidal. This causes some issues with the tmap package, so we will set this functionality to FALSE:
sf_use_s2(FALSE)
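If you would rather not leave s2 switched off for the rest of the session, one possible pattern is to keep the previous setting and restore it once the tmap plots are done. This sketch relies on sf_use_s2 returning the previous value when it is called with an argument; the variable name old_s2 is only illustrative:

```r
library(sf)

# Switch s2 off and remember the setting that was active before
old_s2 <- sf_use_s2(FALSE)

# ... build the tmap plots here ...

# Restore whatever setting was in place earlier
sf_use_s2(old_s2)
```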
We will use the percent_pop_series_completed variable to plot the percentage of the population that completed the vaccination series. Let's first filter the data and transform percent_pop_series_completed from a decimal to a percentage:
df <- covid19sf_vaccine_geo %>%
  filter(area_type == "Analysis Neighborhood") %>%
  dplyr::mutate(perc_complated = percent_pop_series_completed * 100)
Now we can plot the new object:
tm_shape(df) + tm_polygons("perc_complated", title = "% Group")
By default, the tm_polygons function groups the numeric variable, in this case the percentage of the vaccinated population, into buckets. You can control the number of buckets using the n argument. Let's now start to customize the plot and modify the color palette:
tm_shape(df) + tm_polygons("perc_complated", title = "% Group", palette = "RdYlBu")
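To get a feel for how the n argument mentioned above changes the bucketing, the following sketch uses base R's pretty function on invented percentages. It assumes the default bucketing style is the "pretty" one, which depends on the installed tmap version, so treat it as an approximation rather than a description of tmap internals:

```r
# Toy percentages (hypothetical, not from covid19sf_vaccine_geo)
pct <- c(42, 55, 61, 78, 93)

# Round break points for roughly 5 buckets, then roughly 3 buckets
breaks_default <- pretty(pct, n = 5)
breaks_fewer <- pretty(pct, n = 3)

print(breaks_default)
print(breaks_fewer)
```

Note that n is a target rather than an exact count: round break points are preferred over hitting the requested number of buckets exactly.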
We can customize the plot background with the tm_style
function by using the style
argument:
tm_shape(df) + tm_polygons("perc_complated", title = "% Group", palette = "RdYlBu") + tm_style(style = "cobalt")
Last but not least, let's add a title and labels for the geographic locations with the tm_layout and tm_text functions, respectively:
tm_shape(df) +
  tm_polygons("perc_complated", title = "% Group", palette = "RdYlBu") +
  tm_style("cobalt") +
  tm_text("id", size = 0.5) +
  tm_layout(
    legend.position = c("right", "bottom"),
    legend.outside = FALSE,
    legend.width = 1,
    legend.title.size = 1.2,
    legend.text.size = 1,
    # legend.outside.size = 0.9,
    title = paste("COVID-19 Vaccines Given", "to San Franciscans by Geography", sep = " "),
    title.position = c("left", "top"),
    inner.margins = c(0.01, .01, .12, .25))
The sf package provides a plot method for sf objects (see ?sf:::plot.sf for more information). As in the previous examples, we will replot the confirmed cases by ZIP code, this time with the plot function:
covid19sf_geo %>%
  dplyr::filter(area_type == "ZCTA") %>%
  dplyr::select(count, geometry) %>%
  plot(main = "Covid19 Cases by ZIP Code")
You can define the color palette with the pal argument, choose how the breaks of the color scale are computed by setting the breaks argument to quantile, and set the number of breaks with the nbreaks argument (which should match the number of colors in the color palette):
library(RColorBrewer)

pal <- brewer.pal(9, "OrRd")

covid19sf_geo %>%
  filter(area_type == "ZCTA") %>%
  select(count, geometry) %>%
  plot(main = "Covid19 Cases by ZIP Code",
       breaks = "quantile",
       nbreaks = 9,
       pal = pal)
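The relationship between quantile breaks and the number of colors can be sketched in base R. The counts below are invented, and the sf plot method computes its class intervals internally, so this is only an approximation of the idea: nine classes need ten break points and nine colors.

```r
nbreaks <- 9

# Toy case counts (hypothetical, not from the dataset)
counts <- c(12, 40, 75, 130, 210, 340, 560, 800, 1100, 1500, 1900)

# Quantile-based break points: nbreaks classes need nbreaks + 1 boundaries
brks <- quantile(counts, probs = seq(0, 1, length.out = nbreaks + 1))

# One color per class (interpolated here; the vignette uses brewer.pal)
pal <- colorRampPalette(c("#fef0d9", "#b30000"))(nbreaks)

# Class index for each count, including the minimum in the first class
class_id <- cut(counts, breaks = brks, include.lowest = TRUE, labels = FALSE)

data.frame(count = counts, class = class_id, color = pal[class_id])
```

Quantile breaks put roughly the same number of areas into each class, which keeps a few very large counts from dominating the color scale.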
Plotting sf objects can be done natively with the ggplot2 package by using the geom_sf function:
library(ggplot2)

covid19sf_geo %>%
  filter(area_type == "ZCTA") %>%
  ggplot() +
  geom_sf(aes(fill = count)) +
  ggtitle("Covid19 Cases by ZIP Code")
You can customize the polygon color scale with the viridis scale functions, which let you select different viridis color palettes. In addition, the geom_sf_label function lets you add a label to each polygon. In the next example, we will replot the count of cases by ZIP code, this time using the scale_fill_viridis_b color scale and setting the id variable as the polygon label with geom_sf_label:
covid19sf_geo %>%
  filter(area_type == "ZCTA") %>%
  ggplot() +
  geom_sf(aes(fill = count)) +
  scale_fill_viridis_b() +
  geom_sf_label(aes(label = id)) +
  ggtitle("Covid19 Cases by ZIP Code")
Additional customization of the viridis color palettes can be done with the option argument, which selects the palette, while the begin and end arguments control the range of hues used from that palette:
covid19sf_geo %>%
  filter(area_type == "ZCTA") %>%
  ggplot() +
  geom_sf(aes(fill = count)) +
  scale_fill_viridis_b(option = "A", begin = 0.2, end = 0.7) +
  theme_void() +
  ggtitle("Covid19 Cases by ZIP Code")
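To see what begin and end do to a palette, you can inspect the colors directly. This sketch assumes the viridisLite package is installed and compares five colors from the full option "A" palette with five colors from the trimmed range used above:

```r
library(viridisLite)

# Five colors spanning the whole "A" (magma) palette
full <- viridis(5, option = "A")

# Five colors restricted to the 0.2 to 0.7 portion of the same palette
trimmed <- viridis(5, option = "A", begin = 0.2, end = 0.7)

print(full)
print(trimmed)
```

Trimming the range drops the darkest and lightest extremes, which can help polygon borders and labels stay readable.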
The covid19sf_test_loc dataset provides general metadata about the Covid19 testing locations in San Francisco:
data(covid19sf_test_loc)
head(covid19sf_test_loc)
Plotting the testing locations on a map is fairly similar to plotting the covid19sf_geo dataset, as both are sf objects. The main distinction between the two is that covid19sf_test_loc stores each location as a point geometry (i.e., latitude and longitude) as opposed to a polygon. Let's plot the locations with the mapview package, setting the location color by the type (private or public):
covid19sf_test_loc %>% mapview(zcol = "location_type")
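The difference between point and polygon geometries can be checked with st_geometry_type. The sketch below builds a small sf object from invented coordinates (the site names and locations are hypothetical) and confirms that it holds points; the same check can be applied to either of the package datasets:

```r
library(sf)

# Two made-up sites with longitude and latitude (NOT real testing locations)
sites <- data.frame(
  name = c("site_a", "site_b"),
  lon = c(-122.42, -122.40),
  lat = c(37.77, 37.79)
)

# Convert the coordinate columns into a point geometry column (WGS 84)
pts <- st_as_sf(sites, coords = c("lon", "lat"), crs = 4326)

# Geometry type of each feature
unique(st_geometry_type(pts))
```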
The sync function from the leafsync package enables you to combine multiple map plots. In the following example, we will place the cases by ZIP code and the testing locations side by side on the city map:
m1 <- covid19sf_geo %>%
  filter(area_type == "ZCTA") %>%
  mapview(zcol = "count",
          at = c(0, 200, 400, 800, 1200, 1600, 2000),
          col.regions = c('#fef0d9', '#b30000'))

m2 <- covid19sf_test_loc %>%
  mapview(zcol = "location_type")

sync(m1, m2)