README.md

ncaahoopR

ncaahoopR is an R package for working with NCAA Basketball Play-by-Play Data. It scrapes play-by-play data and returns it to the user in a tidy format, allowing the user to explore the data with assist networks, shot charts, and in-game win-probability charts.

For pre-scraped schedules, rosters, box scores, and play-by-play data, check out the ncaahoopR_data repository.

To see the latest changes in version 1.5, view the change log here.

Installation

You can install ncaahoopR from GitHub with:

# install.packages("devtools")
devtools::install_github("lbenz730/ncaahoopR")

If you encounter installation issues, the following tips have helped a few users successfully install the package:

Functions

Several functions use ESPN game_ids. You can find the game_id in the URL for the game summary, as in the URL for the summary of the UMBC-Virginia game.
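As a minimal sketch of pulling a game_id out of a game summary URL, the base R snippet below uses a regular expression. The helper name extract_game_id is hypothetical and not part of ncaahoopR, and the URL shape (an id following "gameId=" or "gameId/") is an assumption about how ESPN formats these links.

```r
# Hypothetical helper (not part of ncaahoopR): pull the numeric game_id
# out of an ESPN game URL. Assumes the id follows "gameId=" or "gameId/".
extract_game_id <- function(url) {
  pattern <- "gameId[=/]([0-9]+)"
  match <- regmatches(url, regexec(pattern, url))[[1]]
  if (length(match) < 2) {
    return(NA_character_)  # no id found in this URL
  }
  match[2]  # first capture group holds the digits
}

# Illustrative URL; the id is one used in the Examples section
url <- "https://www.espn.com/mens-college-basketball/game?gameId=401082978"
game_id <- extract_game_id(url)
print(game_id)
```

The id comes back as a character string. Wrap it in as.numeric() if a numeric game_id is preferred, as in the Examples section.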

Scraping Data

The team parameter in the above functions must be a valid team name from the ids dataset built into the package. See the Datasets section below for more details.
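One way to guard against an invalid team name is to check it before calling a scraping function. The sketch below uses a small stand-in data frame rather than the real ids dataset so that it runs without the package. The helper is_valid_team is hypothetical, and the team column name is an assumption about the layout of ids.

```r
# Stand-in for the package's ids dataset (assumption: it has a `team` column).
# With ncaahoopR installed, load the real one with data("ids") instead.
ids_example <- data.frame(
  team = c("Yale", "UNC", "Oklahoma", "San Francisco"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when `team` appears in the ids-style data frame
is_valid_team <- function(team, ids) {
  team %in% ids$team
}

print(is_valid_team("Yale", ids_example))
print(is_valid_team("Yale Bulldogs", ids_example))
```

The match is exact, so a name with different spelling or a mascot appended fails the check.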

Win-Probability and Game-Flow Charts

Win Probability Charts

A prior version of wp_chart used base R, while gg_wp_chart used the ggplot2 plotting library. As of the 2020-21 season, both functions use ggplot2, and gg_wp_chart is now simply an alias for wp_chart.

wp_chart(game_id, home_col, away_col, show_legend = T)

gg_wp_chart(game_id, home_col, away_col, show_labels = T)

Game Flow Charts

game_flow(game_id, home_col, away_col)

Game Excitement Index

game_excitement_index(game_id, include_spread = T)

Returns the Game Excitement Index (GEI) for a given ESPN game_id. For more information about how these win-probability charts are fit and how Game Excitement Index is calculated, check out the links below.

Game Control Measures

average_win_prob(game_id, include_spread = T)

average_score_diff(game_id)

Assist Networks

Traditional Assist Networks

assist_net(team, season, node_col, three_weights = T, threshold = 0, message = NA, return_stats = T)

Circle Assist Networks and Player Highlighting

circle_assist_net(team, season, highlight_player = NA, highlight_color = NA, three_weights = T, threshold = 0, message = NA, return_stats = T)

Shot Charts

There are currently four functions for scraping and plotting shot location data. These functions were written by Meyappan Subbaiah.

get_shot_locs(game_id): Returns data frame with shot location data when available. Note that if the extra_parse flag in get_pbp_game is set to TRUE, shot location data will already be included in the play-by-play data (if available).

game_shot_chart(game_id, heatmap = F): Plots shots for a given game.

team_shot_chart(game_ids, team, heatmap = F): Plots shots taken by team during a given set of game(s).

opp_shot_chart(game_ids, team, heatmap = F): Plots shots against a team during a given set of game(s).

Datasets

dict: A data frame for converting between team names from various sites.

ids: A data frame for converting between team names from various sites.

ncaa_colors: A data frame of team color hex codes, pulled from teamcolorcodes.com. Additional data coverage provided by Luke Morris.

Primary and secondary colors for all 353 teams.

These datasets can be loaded by typing data("ids"), data("ncaa_colors"), or data("dict"), respectively.

Examples

Win Probability Charts

wp_chart(game_id = 401082978, home_col = "gray", away_col = "orange")

wp_chart(game_id = 401168364, home_col = "#7BAFD4", away_col = "#001A57")

Game Flow Chart

game_flow(game_id = 401082669, home_col = "blue", away_col = "navy")

Single-Game Assist Network

assist_net(team = "Oklahoma", node_col = "firebrick4", season = 400989185)

Season-Long Assist Network

assist_net(team = "Yale", node_col = "royalblue4", season = "2017-18")

Circle Assist Networks

circle_assist_net(team = "UNC", season = 401082861)

Player Highlighting

circle_assist_net(team = "San Francisco", season = "2018-19", highlight_player = "Frankie Ferrari", highlight_color = "#FDBB30")

Shot Charts

game_shot_chart(game_id = 401168364, heatmap = T)

game_shot_chart(game_id = 401168364)

Glossary

Play-by-Play files contain the following variables:

If extra_parse = TRUE in get_pbp_game, the following variables are also included.

Stand-alone shot location data frames contain the following variables.

The court is 50 feet by 94 feet, with (0,0) always placed in the bottom left corner of the shot chart. Any full-court shot chart rendered using game_shot_chart() preserves ESPN shot locations as they are found online, while half-court charts using team_shot_chart() convert all shot locations to a 50 feet by 47 feet half court. The perspective on the half-court shot charts is as if one is standing under the hoop, looking toward the opposition hoop. (0,0) again represents the bottom left corner and (50, 47) represents the top right corner.
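To illustrate the full-court to half-court idea, the sketch below folds shots from the far half of the court onto the near half by rotating them 180 degrees about center court. This is not ncaahoopR's own conversion. The helper to_half_court is hypothetical, and the axis orientation (x running 0 to 50 across the width, y running 0 to 94 along the length) is an assumption.

```r
# Hypothetical helper (not ncaahoopR's own conversion).
# Assumes x runs 0-50 across the court width and y runs 0-94 along its length.
to_half_court <- function(x, y) {
  far_half <- y > 47  # shots beyond the midcourt line
  data.frame(
    x = ifelse(far_half, 50 - x, x),  # mirror left-right for far-half shots
    y = ifelse(far_half, 94 - y, y)   # measure distance from the far baseline
  )
}

# Made-up shot locations: one near-half shot and two far-half shots
shots <- to_half_court(x = c(25, 10, 25), y = c(5.25, 80, 88.75))
print(shots)
```

Under these assumptions, every converted shot lands inside the 50 by 47 half court, and a shot at the same spot relative to either hoop maps to the same half-court location.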



KdotHudash/HuddyTest documentation built on Dec. 18, 2021, 3:31 a.m.