knitr::opts_chunk$set(
  message=FALSE, 
  warning=FALSE,
  collapse = TRUE,
  fig.height=6, 
  fig.width=6, 
  comment = "#>"
)

The covid19sf_geo dataset provides information about the distribution of Covid19 cases in San Francisco by geospatial location. In addition, the testing locations across San Francisco are available in the covid19sf_test_loc dataset. The following vignette provides examples of geospatial visualization of those datasets. Both datasets are sf objects and contain geometric information (i.e., they are ready to plot on a map).

Note: This is a non-CRAN vignette, and the following libraries are required to build the plots in this document:

library(ggplot2)
library(mapview)
library(leafsync)
library(sf)
library(tmap)
library(RColorBrewer)

The covid19sf_geo dataset

The covid19sf_geo dataset provides a snapshot of the distribution of Covid19 cases in San Francisco across different geographic splits of the city. The dataset contains the following fields:

While the first three geographical split methods contain geometry components that enable us to plot them as a map, the last is just an aggregated summary of the city's total cases.

library(covid19sf)

data(covid19sf_geo)

class(covid19sf_geo)

head(covid19sf_geo)

Plotting cases with mapview

The most intuitive method for plotting sf objects is with the mapview package, which is a wrapper for the leaflet JavaScript library. The main advantage of the mapview package is that it is interactive and works smoothly with sf objects. The following example demonstrates the use of the mapview function to plot the distribution of confirmed cases in San Francisco by ZIP code:

library(dplyr)

covid19sf_geo %>% 
  filter(area_type == "ZCTA") %>% 
  mapview(zcol = "count")

You can use the at and col.regions arguments to define the color buckets and the color range, respectively:

covid19sf_geo %>% 
  filter(area_type == "ZCTA") %>% 
  mapview(zcol = "count", 
          at = c(0, 200, 400, 800, 1200, 1600, 2000),
          col.regions = c('#fef0d9', '#b30000'))
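To make the roles of the two arguments concrete, here is a minimal base R sketch of the same idea. The counts vector is hypothetical, and mapview's own handling of interval boundaries and color interpolation may differ in detail; this only illustrates how seven break points define six buckets and how two end colors can be stretched into one color per bucket:

```r
# Hypothetical case counts, for illustration only (not taken from covid19sf_geo)
counts <- c(35, 180, 410, 790, 1250, 1990)

# Same break points as the `at` argument above: 7 breaks define 6 buckets
at <- c(0, 200, 400, 800, 1200, 1600, 2000)
buckets <- cut(counts, breaks = at, include.lowest = TRUE)

# Interpolate the two end colors into one color per bucket
bucket_cols <- colorRampPalette(c("#fef0d9", "#b30000"))(length(at) - 1)

# Show which bucket and color each count would be assigned to
data.frame(count = counts,
           bucket = buckets,
           color = bucket_cols[as.integer(buckets)])
```

Note that a count above the last break point (2000 here) would fall outside every bucket, so the breaks should cover the full range of the plotted variable.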

Plotting vaccine data with tmap

The tmap package provides functions and tools for creating thematic maps. The package supports sf objects and follows the ggplot2 framework. In the example below, we will plot the COVID-19 vaccine data by geography using the covid19sf_vaccine_geo dataset:

data(covid19sf_vaccine_geo)

head(covid19sf_vaccine_geo)

Additional setting: Since version 1.0, the sf package uses s2 spherical geometry by default when coordinates are ellipsoidal. This causes some issues with the tmap package; therefore, we will turn this functionality off by setting it to FALSE:

sf_use_s2(FALSE)
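Because this setting is global for the R session, it can be useful to restore it after the maps are drawn. A minimal sketch, relying on sf_use_s2 returning the value that was in effect before the call; the old_s2 name is just an illustrative choice:

```r
library(sf)

# sf_use_s2() returns the value that was in effect before the call,
# so it can be stored and restored later
old_s2 <- sf_use_s2(FALSE)

# ... tmap plotting code goes here ...

# Restore the previous setting once the maps are drawn
sf_use_s2(old_s2)
```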

We will use the percent_pop_series_completed variable to plot the percentage of the population that completed their vaccination series. Let's first filter the data and convert percent_pop_series_completed from a decimal to a percentage:

df <- covid19sf_vaccine_geo %>% 
  dplyr::filter(area_type == "Analysis Neighborhood") %>%
  dplyr::mutate(perc_complated = percent_pop_series_completed * 100)

Now we can plot the new object:

tm_shape(df) + 
  tm_polygons("perc_complated", 
              title = "% Group")

By default, the tm_polygons function groups the numeric variable, in this case the percentage of the vaccinated population, into buckets. You can control the number of buckets using the n argument. Let's now start to customize the plot and modify the color palette:

tm_shape(df) + 
  tm_polygons("perc_complated", 
              title = "% Group",
              palette = "RdYlBu")
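The n argument mentioned above can be passed in the same call. A minimal sketch, assuming the tmap version 3 style arguments used throughout this vignette and the df object created earlier; the value 4 and the map_n name are arbitrary choices for illustration, and tmap treats n as a preferred rather than an exact number of buckets:

```r
library(tmap)

# Assumes df was created as shown earlier in this vignette
map_n <- tm_shape(df) +
  tm_polygons("perc_complated",
              title = "% Group",
              n = 4,  # preferred number of buckets (arbitrary illustrative value)
              palette = "RdYlBu")

# Print the map
map_n
```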

We can customize the plot background with the tm_style function by using the style argument:

tm_shape(df) + 
  tm_polygons("perc_complated", 
              title = "% Group",
              palette = "RdYlBu") +
  tm_style(style = "cobalt")

Last but not least, let's add labels for the geographic locations and a title with the tm_text and tm_layout functions, respectively:

tm_shape(df) + 
  tm_polygons("perc_complated", 
              title = "% Group",
              palette = "RdYlBu") +
  tm_style("cobalt") + 
  tm_text("id", size = 0.5) +
  tm_layout(
    legend.position = c("right", "bottom"),
    legend.outside = FALSE,
    legend.width = 1,
    legend.title.size = 1.2,
    legend.text.size = 1,
    # legend.outside.size = 0.9,
    title = paste("COVID-19 Vaccines Given",
                  "to San Franciscans by Geography",
                  sep = " "),
    title.position = c("left", "top"),
    inner.margins = c(0.01, 0.01, 0.12, 0.25))

Plotting cases with base plot

The sf package provides a plot method for sf objects (see ?sf:::plot.sf for more information). Similarly to the previous examples, we will replot the confirmed cases by ZIP code with the plot function:

covid19sf_geo %>% 
  dplyr::filter(area_type == "ZCTA") %>% 
  dplyr::select(count, geometry) %>%
  plot(main = "Covid19 Cases by ZIP Code")

You can define the color palette with the pal argument. To control the color scale, set the breaks argument to "quantile" and the number of breaks with the nbreaks argument, which should match the number of colors in the palette:

library(RColorBrewer)
pal <- brewer.pal(9, "OrRd")

covid19sf_geo %>% 
  filter(area_type == "ZCTA") %>% 
  select(count, geometry) %>%
  plot(main = "Covid19 Cases by ZIP Code",
       breaks = "quantile", nbreaks = 9,
       pal = pal)
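To see why the number of breaks should match the number of colors, here is a rough base R analogue of quantile-based breaks. The counts vector is hypothetical, and sf computes its breaks internally, so the exact boundaries it picks may differ; the point is only that nine bins require ten boundaries and nine colors:

```r
# Hypothetical case counts, for illustration only (not taken from covid19sf_geo)
counts <- seq(50, 1950, by = 100)

n_classes <- 9  # matches nbreaks = 9 and the 9 colors in the palette

# Quantile break points: n_classes + 1 boundaries delimit n_classes bins
brks <- quantile(counts, probs = seq(0, 1, length.out = n_classes + 1))

# Assign each count to its bin; quantile bins hold a similar number of values
bins <- cut(counts, breaks = brks, include.lowest = TRUE)
table(bins)
```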

Plotting cases with ggplot2

Plotting sf objects can be done natively with the ggplot2 package by using the geom_sf function:

library(ggplot2)

covid19sf_geo %>% 
  filter(area_type == "ZCTA") %>% 
  ggplot() + 
  geom_sf(aes(fill=count)) +
  ggtitle("Covid19 Cases by ZIP Code")

You can customize the polygon color scale with the scale_fill_viridis_* family of functions, which lets you select among the different viridis color palettes. In addition, the geom_sf_label function lets you add a label to each polygon. In the next example, we will replot the count of cases by ZIP code, this time using the scale_fill_viridis_b color scale and setting the id variable as the polygon label with geom_sf_label:

covid19sf_geo %>% 
  filter(area_type == "ZCTA") %>% 
  ggplot() + 
  geom_sf(aes(fill=count)) + 
  scale_fill_viridis_b() +
  geom_sf_label(aes(label = id)) + 
  ggtitle("Covid19 Cases by ZIP Code")

The viridis color scale can be customized further: the option argument selects the palette, and the begin and end arguments (both between 0 and 1) set the hue at which the color scale begins and ends:

covid19sf_geo %>% 
  filter(area_type == "ZCTA") %>% 
  ggplot() + 
  geom_sf(aes(fill = count)) + 
  scale_fill_viridis_b(option = "A",
                       begin = 0.2,
                       end = 0.7) + 
  theme_void() +
  ggtitle("Covid19 Cases by ZIP Code")
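The effect of begin and end is easier to see by printing the colors directly. A minimal sketch using the viridisLite package, on the assumption that the ggplot2 viridis scales draw from the same color maps; the full_range and trimmed names are illustrative:

```r
library(viridisLite)

# Five colors spanning the full "magma" (option = "A") color map
full_range <- viridis(5, option = "A")

# Five colors restricted to the 0.2-0.7 portion of the same color map,
# which drops the darkest and lightest ends
trimmed <- viridis(5, option = "A", begin = 0.2, end = 0.7)

full_range
trimmed
```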

The covid19sf_test_loc dataset

The covid19sf_test_loc dataset provides general metadata about the Covid19 testing locations in San Francisco:

data(covid19sf_test_loc)

head(covid19sf_test_loc)

Plotting the testing locations on a map is fairly similar to plotting the covid19sf_geo dataset, as both are sf objects. The main distinction between the two is that covid19sf_test_loc stores each location as a point (i.e., latitude and longitude) as opposed to a polygon. Let's plot the locations with the mapview package, setting the location color by the type (private or public):

covid19sf_test_loc %>% mapview(zcol = "location_type")
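The point versus polygon distinction can be checked with the st_geometry_type function from the sf package. The sketch below builds two toy geometries (the coordinates are made up for illustration and are not taken from either dataset) and inspects their types; the commented line shows how the same check might be applied to the package data:

```r
library(sf)

# A single point, given as longitude and latitude (illustrative coordinates)
pt <- st_sfc(st_point(c(-122.4194, 37.7749)), crs = 4326)

# A small square polygon; the ring must end where it starts
ring <- rbind(c(-122.45, 37.75), c(-122.40, 37.75),
              c(-122.40, 37.80), c(-122.45, 37.80),
              c(-122.45, 37.75))
sq <- st_sfc(st_polygon(list(ring)), crs = 4326)

as.character(st_geometry_type(pt))  # "POINT"
as.character(st_geometry_type(sq))  # "POLYGON"

# The same check on the package data:
# unique(st_geometry_type(covid19sf_test_loc))
```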

Combining the cases distribution and testing points

The sync function from the leafsync package lets you combine multiple map plots. In the following example, we will place the cases split by ZIP code and the testing points side by side on the city map:

m1 <- covid19sf_geo %>% 
  filter(area_type == "ZCTA") %>% 
  mapview(zcol = "count", 
          at = c(0, 200, 400, 800, 1200, 1600, 2000),
          col.regions = c('#fef0d9', '#b30000'))
m2 <- covid19sf_test_loc %>% mapview(zcol = "location_type")
sync(m1, m2)


RamiKrispin/covid19sf documentation built on April 3, 2023, 4:11 p.m.