knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
In preparation for craggy 2019, here is a very brief introduction to simple ways to visualize spatial data. This notebook will introduce some R packages and functions to visualize spatial data. The notebook will use 6 total packages, 3 of which are in the tidyverse. The three tidy functions will be used for routine data wrangling. The 3 other packages will be used to work with the spatial data. The first package is sf
and it is my personal preferred package to use when working with spatial data. The sf
package contains all of the spatial operation functions needed for routine spatial analysis, and the syntax meshes well with the tidyverse. The sf
package introduces the sf
type, and you can think of the sf
type as a spatial data frame. It is like a normal vanilla data frame but there is an additional "geometry" column that is of list column type, and that column contains all of the spatial information.
The next package we will be using is mapview
and it provides functionality to create interactive web maps. If you need to quickly visualize your spatial data mapview
is a great first tool to use.
The last package is the tigris
package and this allows us to download census spatial boundaries from the census. For this data set we will be looking at data provided at the census tract geography. The data is provided as a regular csv, so we will need a way to convert that data to a sf
object. To do this we will download the Washington census tract shapes and join them to our data. Let's begin
## the first three packages are all loaded in library(tidyverse) as well library(dplyr) library(ggplot2) library(stringr) # spatial packages library(sf) # working with spatial data library(mapview) # quick and easy geospatial visualizations library(tigris) # downloading geospatial data from the census # data package library(craggy2019)
To start I will read the evictions data into memory. The data is stored in a csv so I will use read.csv()
to read the data and assign the data frame to evictions. After reading the data into memory I will print the first five rows.
evictions <- read.csv( system.file("extdata", "evictions.csv", package = "craggy2019"), stringsAsFactors = FALSE ) head(evictions, n = 5)
The data doesn't have any spatial information other than a "GEOID" which is also referred to as a "fips code". Fips codes are a way of designating a spatial hierarchy to different geographies in the U.S. The number of digits in the fips code (most commonly seen with the column name "GEOID") corresponds to a specific census geographic specification.
The image above shows the digit designations for states, counties, tracts, and block groups. Since our data is at the census tract level we need to make sure that all of the GEOIDs have 11 digits. Since the GEOIDs are numeric, when loading data with fips codes it is common that the data will be read into memory as numeric. This is to be expected but can cause issues since the fips codes are categorical in nature. For example, the state fips code of California is "06" but the leading 0 is often dropped when loaded as a numeric. For this reason, it is usually good to convert the GEOID to character and see if all of the observations have the right amount of characters. We will do that below.
evictions <- dplyr::mutate(evictions, GEOID = as.character(GEOID)) dplyr::count(evictions, stringr::str_length(GEOID))
Now that we have confirmed that all of the GEOIDs are of the right length, it is time to get the shapes for each census tract. To do this we will use the aforementioned tigris
package. We will call the tracts()
function from the tigris package and pass in "WA" as the state. The output of this call will be a spatial data frame. This data type comes from the sp
package and I will transform this into the sf
type using the function st_as_sf()
. This is personal preference but the sf
package is newer and I find it much easier to use than sp
.
tracts <- tigris::tracts(state = "WA") tracts <- sf::st_as_sf(tracts)
Now that the census tract shapes have been assigned to the object "tracts", I will join the evictions data to the tracts data.
evictions <- dplyr::inner_join(tracts, evictions)
Since the data is a time series for each tract, we will filter the data on the most recent year in the sample. When visualizing spatial data is difficult to also include a temporal dimension. You can of course visualize change over time but for this example we will just focus on the 2016 data.
evictions_2016 <- dplyr::filter(evictions, year == "2016")
The evictions data for the entire state of Washington is included and we are mostly interested in King County. I will filter on King County for this purpose.
king_county_evictions_2016 <- dplyr::filter(evictions_2016, parent.location == "King County, Washington")
The first visualization technique we will use is ggplot. ggplot supports sf
objects and they can be plotted using the geom_sf()
function. The rest of the code below is from the standard ggplot library, I will relabel the legend and plot title. Additionally, I will use the void theme.
ggplot(data = king_county_evictions_2016) + geom_sf(aes(fill = eviction.rate)) + labs(title = "King County 2016", fill = "Eviction Rate (%)") + theme_void() + theme(plot.title = element_text(hjust = 0.5))
Next up is mapview. If you simply pass your sf
object into mapview it will create an interactive web map using openstreetmap. This will display a choropleth of every shape in the data. A choropleth is a map of different shapes, with the color representing a single feature of the data. Here, we didn't pass any columns to mapview so it just mapped all of the data with the same color. If you click on a shape, a popup will show the data associated with that shape.
mapview::mapview(king_county_evictions_2016)
If you want to map a specific column of the data you can pass in the column name quoted to the zcol
argument.
mapview(king_county_evictions_2016, zcol = "eviction.rate")
There are many different ways of visualizing spatial data but I hope this can get you started.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.