Introduction to `pointdexter`

Overview

The pointdexter package labels longitudinal and latitudinal coordinates located inside a polygon. This document introduces you to pointdexter's two functions:

  1. GetPolygonBoundaries(); and
  2. LabelPointsWithinPolygons().

Spatial Data Packages

pointdexter is compatible with the two packages most useful for working with spatial data: sf and sp. I'll only use the sp package for this vignette.

# load necessary packages -----
library(pointdexter) # label coordinate pairs in polygons
library(sp)          # classes and methods for spatial data
library(knitr)       # simple table generator

Built-in Data

pointdexter comes with built-in point and polygon data - entirely due to the awesome and accessible Chicago Data Portal - to help you label points in polygons.

Point Data

The coordinate pair data comes from the Chicago Public Schools (CPS) - School Profile Information, School Year (SY) 2018-2019 data set. Each coordinate pair represents a school.

# load necessary data ----
data("cps_sy1819")

# store relevant columns ----
relevant.columns <-
  c("school_id", "short_name"
    , "school_longitude", "school_latitude")

# print first few rows of data ----
kable(head(cps_sy1819[, relevant.columns])
      , caption = "Table 1. Examining CPS SY1819 school profile data")

Polygon Data

To show what pointdexter does, we'll be using two types of spatial data from the City of Chicago: the city boundary and community area polygons.

While the city boundary is helpful for generating city-wide statistics, researchers typically use the 77 Chicago community areas when creating local-level statistics.

pointdexter makes both polygons available as sf and SpatialPolygonsDataFrame objects.

# load city boundary data ----
data("city_boundary_spdf")

# load community area data ----
data("community_areas_spdf")

# visualize polygons -----
# note: clear plot space
par(mar = c(0, 0, 1, 0))

# plot city boundary
plot(city_boundary_spdf
     , main = "City of Chicago boundary"
     , col = "gray85"
     , border = "dodgerblue4")

# plot community areas
plot(community_areas_spdf
     , main = "Chicago's 77 community areas"
     , col = "gray85"
     , border = "dodgerblue4")

Step 1. Use GetPolygonBoundaries()

GetPolygonBoundaries() returns the longitudinal and latitudinal points that make up the boundary of the polygon(s). The first argument is the polygon stored in the sf or SpatialPolygonsDataFrame object.

One polygon

If my.polygon only contains one polygon, a matrix of coordinate pairs will be returned.

# create coordinate pair matrix for city of chicago boundary ----
boundary <- 
  GetPolygonBoundaries(my.polygon = city_boundary_spdf)

# print first few records ----
kable(head(boundary)
        , caption = "Table 2. boundary is a matrix of coordinate pairs"
        , col.names = c("long", "lat"))

Multiple polygons

Otherwise, a list of labeled matrices, with each matrix representing the coordinate pairs that make the boundary of each particular polygon in my.polygon.

# create list of coordinate pair matrices for each community area ----
community.area.boundaries <-
  GetPolygonBoundaries(my.polygon = community_areas_spdf
                       , labels = community_areas_spdf$community)

# print first few records for two communities ----
kable(lapply(community.area.boundaries[c("AUSTIN", "WEST ELSDON")]
             , FUN = head)
      , caption = "Table 3. Austin (left) and West Elsdon's (right) boundaries"
      , col.names = c("long", "lat"))

Step 2. Use LabelPointsWithinPolygons()

LabelPointsWithinPolygons() identifies which longitudinal and latitudinal points lie within the polygon boundaries created from GetPolygonBoundaries().

The first two arguments of LabelPointsWithinPolygons() are the longitude and latitude columns that create your coordinate pairs of interest. The final argument - polygon.boundaries is the object you created from GetPolygonBoundaries().

polygon.boundaries is a matrix

If polygon.boundaries is a coordinate pair matrix, a logical vector will be returned identifying those points which lie in the polygon.

# identify cps schools that lie in Chicago ----
cps_sy1819$in_chicago <-
  LabelPointsWithinPolygons(lng = cps_sy1819$school_longitude
                            , lat = cps_sy1819$school_latitude
                            , polygon.boundaries = boundary)

# show first few records ----
kable(head(cps_sy1819[, c(relevant.columns, "in_chicago")])
      , caption = "Table 4. A logical vector is returned when polygon.boundaries is a matrix")

polygon.boundaries is a list of matrices

Otherwise, a character vector will be returned identifying those points that lie in each polygon.

# identify the community that each cps school lies in ----
cps_sy1819$community <-
  LabelPointsWithinPolygons(lng = cps_sy1819$school_longitude
                            , lat = cps_sy1819$school_latitude
                            , polygon.boundaries = community.area.boundaries)

# show first few records ----
kable(head(cps_sy1819[, c(relevant.columns, "in_chicago", "community")])
           , caption = "Table 5. A character vector is returned when polygon.boundaries is a list of labeled matrices")

Conclusion

pointdexter finds the boundaries of whatever polygon you give so that you can identify coordinate pairs that lie within it. This is useful when wanting to generate local statistics for particular communities.

# identify the school ratings for high schools in Austin ---- 

# filter cps schools
austin.hs <-
  cps_sy1819[cps_sy1819$community == "AUSTIN" & cps_sy1819$is_high_school, ]

# arrange data by overall rating
austin.hs <- austin.hs[order(austin.hs$overall_rating), ]

# show results
kable(austin.hs[, c(relevant.columns , "overall_rating",
              "is_high_school", "community")]
      , caption = "Table 6. Austin's highest rank high school is YCCS - Scholastic Academy, SY1819"
        , row.names = FALSE)

Session Info

sessionInfo()


Try the pointdexter package in your browser

Any scripts or data that you put into this service are public.

pointdexter documentation built on May 1, 2019, 10:29 p.m.