knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(carolinaKernels)
library(dplyr)
library(ggplot2)
library(kableExtra)

This vignette gives an overview of the {carolinaKernels} package and how to use the Shiny app it contains. It also explains the infrastructure used to build the project based on the {golem} package by @pkg:golem.

Building Robust Shiny Apps

The book Engineering Production-Grade Shiny Apps (@book:fay) introduces an opinionated framework to build robust Shiny Apps, allowing developers to better handle complexity and construct projects over solid foundations. The framework relies on the idea of creating a Shiny App as an R package. The package structure provides many advantages over quick and easy Shiny prototypes. Some of these advantages are adding metadata, dependencies, functions, documentation, unit tests, and easy installation and deployment. For these reasons, we decided to build our final project using the {golem} framework.

Besides designing our app as a package, we also built the app using a modular structure. The use of Shiny modules makes the code base more accessible, both by avoiding R files with thousands of lines of code and also allowing team members to work on different aspects of the app simultaneously.

Giving a thorough explanation of the {golem} infrastructure is beyond the scope of this project, but the interested reader will find great gratification when implementing the framework into their app developments process.

App UI Design

We built our user interface with one main goal in mind: simplicity. An app that is easy to use is more likely to accomplish the objectives for which it was designed. When users face a new interface, it is unlikely that they will spend time reading long documentation. Many users learn how to work with an app by trial and error, clicking on things and expecting them to work as they would in another context.

With this in mind, we built our app using multiple pages. Organizing in pages is helpful because the developer gains some control on how the user explores the app. By creating "Next" and "Back" buttons, we can force the user to make a selection in any inputs and avoid the app failing while the user interacts with it.

In addition to being simple and practical, we also want the app to be visually attractive. We want the design to catch the user's eye, while not being too crowded with features that make the user experience difficult. The book authors call this problem feature-creep and define it as the process of adding features to the app that complicate the usage and the maintenance of the product, to the point that the product may be unusable or impossible to maintain.

We also tried to include few reactive elements. An excessive amount of reactive features can make the app slow and the interface overwhelming for the user. The page structure is also helpful in this regard, as including buttons to navigate avoids rendering plots with no reason if the user makes a mistake in the inputs.

Raw Data Sample

Here, is a sample of the raw data files used in the application for summary statistics and visualizations. See help function for each of the package data objects for more information.

carolinaKernels::PFFScoutingData

# display nicely formatted table with files used for analysis
carolinaKernels::PFFScoutingData %>%
  head(., 4) %>%
  kbl() %>%
  kable_material(c("striped", "hover"))

carolinaKernels::PLAYERS

carolinaKernels::PLAYERS %>%
  head(., 4) %>%
  kbl() %>%
  kable_material(c("striped", "hover"))

carolinaKernels::PLAYS

carolinaKernels::PLAYS %>%
  head(., 4) %>%
  kbl() %>%
  kable_material(c("striped", "hover"))

carolinaKernels::TRACKING

carolinaKernels::TRACKING %>%
  head(., 4) %>%
  kbl() %>%
  kable_material(c("striped", "hover"))

Animated Graphs

One of the unique aspects of this dataset is the location tracking. The NFL embeds microchips in the game ball and shoulder pads of every player. Every 0.1 seconds these chips record dynamic spatial information such as X/Y location, orientation, and velocity.

In order to show CarolinaKernel end users how granular the NFL's location data is a series of animated play diagrams were created. These charts show player movement over the course of a play using the (X,Y) coordinates and timestamps from the raw tracking data. Using the play metadata information, one play for each special teams play category and play outcome were selected.

A user defined function renders static png files for each of the timestamps and the gganimate library stiches the individual files into a final gif for application embedding.

Here is an example of one of the plots stored as a png file:

sample_gameId = "2020092012"
sample_playId = "1357"
sample_play = carolinaKernels::TRACKING %>% filter(gameId == sample_gameId & playId == sample_playId)

# add a group (is_football to denote if tracking ) 
sample_play = sample_play %>% 
  mutate(is_football = case_when(displayName == "football" ~ TRUE, TRUE ~ FALSE))

## sample animation plot
sample_play %>% filter(frameId == 1) %>% 
  ggplot() + theme(legend.position = "none", panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.text.x=element_blank(), axis.ticks.x=element_blank(), axis.text.y=element_blank(), axis.ticks.y=element_blank(), panel.background = element_rect(fill = "#c2efff")) +
  xlim(0,120) + ylim(0, 53.33) +

  ## NFL Field Specific Lines
  # add yard lines every 10 yds
  annotate("segment", x=seq(20,105,by=10), y=0, xend=seq(20,105,by=10), yend=53.33, col="black") +

  # set mid-field and endzone lines
  annotate("segment",x=c(60), y=0, xend=c(60), yend=53.33, lwd=1.3, col="white") +

  annotate("segment",x=c(0,10,110,120), y=0, xend=c(0,10,110,120), yend=53.33, lwd=1.1, col="black") +

  # add lines at locations of hash marks
  annotate("segment", x = 10, y=(29.25 + 30.25)/2, xend=110, yend=(29.25 + 30.25)/2, color="white", lwd=1.05) +
  annotate("segment", x=10, y=(24.08 + 23.08)/2, xend=110, yend=(24.08 + 23.08)/2, color="white", lwd=1.05) +

  # sideline
  annotate("segment",x=0, y=c(0, 53.33), xend=120, yend=c(0, 53.33), lwd=1.3, col="black") +
  # denote 50yd, 30yd, 10yd line with text

  annotate("text", label=50, x=60, y=10, color="black", size=5) + 
  annotate("text", label=30, x=80, y=6, color="black", size=5) + 
  annotate("text", label=10, x=100, y=2, color="black", size=5) + 
  annotate("text",label=30,x=40, y=6, color="black", size=5) + 
  annotate("text",label=10,x=20, y=2, color="black", size=5) + 
  # above labels
  annotate("text",label=50,x=60, y=44, color="black", size=5) + 
  annotate("text",label=30,x=80, y=48, color="black", size=5) + 
  annotate("text",label=10,x=100, y=51.5, color="black", size=5) + 
  annotate("text",label=30,x=40, y=48, color="black", size=5) + 
  annotate("text",label=10,x=20, y=51.5, color="black", size=5) + 

  # field goal posts
  annotate(geom = 'segment', x = 120, xend = 120, y = 26.66 + 3.08, yend = 26.66 - 3.08, color = "yellow", lwd = 1.3) +
  annotate(geom = 'segment', x = 0, xend = 0, y = 26.66 + 3.08, yend = 26.66 - 3.08, color = "yellow", lwd = 1.3) +

  ## add the points last so they sit ontop of the field markings
  scale_color_manual(values = c("#ff1212", "#8f6545", "#057d11")) +
  geom_point(aes(x = x, y = y, color = team, shape = is_football), size = 3)

To show the player position relative to important field attributes, such as the end zone and goal posts, the scatter plots were overlayed on an actual field geometry. The yellow bars on each end represent the field goal posts. The numbering represents along the top and bottom represent the yard lines and the white horizontal lines are the hash marks.

The brown triangle represents the football and the red/green circles are the players. The chart above shows the starting positions of players/football for a kickoff.

EDA Discussion

This dashboard helps analyze different plays of NFL game by producing interesting visualizations for various aspects of the game. From the functionality perspective, the dashboard filters on a particular play type right from the landing page of R shiny app. Several functions are written which takes in the selected play type from GUI along with a dataframe containing plays information to produce visualizations. A total of 9 functions are written which produce specific plots using ggplot library. These plots are generated and pre-saved in /inst/app/cache/gifs to fetched according to selected plays from GUI.

Punt/KickOff

A sample plot for each Punt and KickOff is shown below. The plotting functions plot_kick_result and plot_team_yardage takes in the plays data frame along with play type. The first plot tries to analyze if there is a relation between kick length on the play result for Punt by looking at the box-plot distribution of kick length for results. The second plot explores the median yards gained by opponent team for a kicker team in kickOff. This graph helps locate poor performaing teams for kickOff by looking at high median kick return yards for opponents for various kicking teams.

plot_kick_result(carolinaKernels::PLAYS,"Punt")
plot_team_yardage(carolinaKernels::PLAYS,"Kickoff")


josePliego/final_proj_carolina_kernels documentation built on July 19, 2022, 6:43 p.m.