knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "plotting-tracking-data-", out.width = "100%", dpi = 600 ) options(rmarkdown.html_vignette.check_title = FALSE)
In order to run the following vignettes, you'll need the sportyR
and ggplot2
libraries loaded into your workspace.
library(sportyR) library(ggplot2)
library(sportyR) suppressPackageStartupMessages(library(ggplot2))
sportyR
seeks to make plotting geospatial tracking data as straight-forward as possible, allowing you to focus more on the analysis than on the visuals. I'll demonstrate a few examples here on how to use the package to display "static" data, or data that shows a snapshot in time
For this example, I'll be using the data provided for the Big Data Cup, which is publicly available. Specifically, the data I'll use to demonstrate how to plot comes from the data provided for the Big Data Cup - 2021.
Start by reading in the data from the CSV file provided:
# Read data from the Big Data Cup bdc_data <- data.table::fread( glue::glue( "https://raw.githubusercontent.com/bigdatacup/Big-Data-Cup-2021", "/main/hackathon_nwhl.csv" ) ) # Convert to data frame bdc_data <- as.data.frame(bdc_data)
Let's explore the dataset a bit to see what (if any) changes would be helpful
names(bdc_data)
It'd be helpful to change the names of the columns to be easier to work with, so I'll change X Coordinate
and Y Coordinate
to be x
and y
, and X Coordinate 2
and Y Coordinate 2
to be x2
and y2
.
# Change names of X Coordinate and Y Coordinate to x and y respectively names(bdc_data)[13:14] <- c("x", "y") names(bdc_data)[20:21] <- c("x2", "y2") # Preview what the data looks like knitr::kable(head(bdc_data))
I'll use the first game in the data here, which is between the Minnesota Whitecaps and the Boston Pride. To keep things easy I'll focus first on single-point data: shots.
# Subset to only be shots from the game on 2021-01-23 between the Minnesota # White Caps and Boston Pride bdc_shots <- bdc_data[(bdc_data$Event == "Shot") & (bdc_data$`Home Team` == "Minnesota Whitecaps") & (bdc_data$game_date == "2021-01-23"), ] # Separate shots by team whitecaps_shots <- bdc_shots[bdc_shots$Team == "Minnesota Whitecaps", ] pride_shots <- bdc_shots[bdc_shots$Team == "Boston Pride", ]
To keep all shots for a team on the same side of the ice, we need to adjust the coordinates of the shot. We've got to keep the shooter's perspective towards the net constant as well, so the following is an appropriate transformation.
# Correct the shot location whitecaps_shots["x"] <- 200 - whitecaps_shots["x"] whitecaps_shots["y"] <- 85 - whitecaps_shots["y"]
This positions the data correctly, so let's move on to plotting
Since this data pertains to the Premier Hockey Federation (PHF), we'll start the plotting by drawing a PHF-sized rink. We'll use x_trans
and y_trans
to align the data and plot coordinates. I encourage you to experiment with the data to see how this works in practice
# Draw the rink phf_rink <- geom_hockey("phf", x_trans = 100, y_trans = 42.5) # Display the rink here phf_rink
Now all that's left to do is to add the data points to the plot! Because of how ggplot2
is designed, this is very straightforward.
# Add the shots to the plot phf_rink + geom_point(data = whitecaps_shots, aes(x, y), color = "#2251b8") + geom_point(data = pride_shots, aes(x, y), color = "#fec52e")
Pretend instead that we want to look at where a team's passes were executed during the game. This is also very easy to do. Let's take the same dataset we had before, bdc_data
, and this time subset it to look at Boston's passes in the game. We've already got our rink plot from before, so let's just subset to the passing data, and add it to the plot:
# Subset the data to be Boston's passes boston_passes <- bdc_data[(bdc_data$Event == "Play") & (bdc_data$Team == "Boston Pride") & (bdc_data$game_date == "2021-01-23"), ] # Plot passes with geom_segment() phf_rink + geom_segment( data = boston_passes, aes( x = x, y = y, xend = x2, yend = y2 ), lineend = "round", linejoin = "round", color = "#ffcb05" )
And there you have it! This works for any geospatial data, for any sport (supported by sportyR
), and for any league (supported by sportyR
). Give it a try, and please reach out if you have ideas for improvements!
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.