
wncaahoopR

wncaahoopR is an R package for working with women's NCAA Basketball play-by-play (and other) data.

This package relies heavily on the work done by Luke Benz (lbenz730) in his package ncaahoopR, designed for working with men's NCAA basketball play-by-play data.

wncaahoopR also scrapes data from ESPN, but it does not spread the scraping across multiple functions. Instead, it scans in the data once and then uses that pbp object within R to produce win-probability and game-flow charts, as well as assist networks.

wncaahoopR is a joint effort between Seth Berry (saberry) and Scott Nestler (snestler). They welcome bug identification and ideas via the Issues tab, but please look at open issues before creating a new one.

Installation

You can install wncaahoopR from GitHub with:

# install.packages("devtools")
devtools::install_github("snestler/wncaahoopR")

Functions

Several functions use ESPN game_ids. You can find the game_id in the URL for the game summary, for example the URL for the summary of the Notre Dame - Michigan game played on Nov. 23, 2019.
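As a rough illustration of pulling a game_id out of a summary URL, here is a minimal base R sketch. The URL below is a hypothetical example of the general shape and ESPN's actual path format may differ. The id 401171130 is the one used for this game in the Examples section.

```r
# Hypothetical summary URL; the exact path ESPN uses is an assumption.
url <- "https://www.espn.com/womens-college-basketball/game?gameId=401171130"

# Take the first run of six or more digits in the URL as the game_id.
game_id <- regmatches(url, regexpr("[0-9]{6,}", url))

# The id is extracted as a character string; convert it before use if needed.
game_id_num <- as.numeric(game_id)
print(game_id)
```

The resulting value can then be passed to the functions that accept a game_id.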

Scraping Data

The team parameter in the above functions must be a valid team name from the ids dataset built into the package. See the Datasets section below for more details.

Win-Probability and Game-Flow Charts

Win Probability Charts

There is a function (wp_chart) for plotting win probability charts using the ggplot2 library. NOTE: This is equivalent to the gg_wp_chart function in the ncaahoopR package. We did not see the need to maintain a base R graphics function.

  1. NOTE 1: For now, all win probability charts are "naive," in that they do not incorporate a pre-game line or spread. This will remain the case until we determine a reliable and freely available source, since ESPN does not provide this for women's games as it does for the men's game.
  2. NOTE 2: For now, the WP calculations are based on historical data from NCAA MBB games; this will be updated in a future release.

wp_chart(pbp, home_col, away_col, show_legend = T)

Game Flow Charts

game_flow(pbp, home_col, away_col)

Game Excitement Index

game_exciment_index(pbp)

Returns the GEI (Game Excitement Index) for a given ESPN game_id. For more information about how these win-probability charts are fit and how the Game Excitement Index is calculated, check out the links below.

Game Control Measures

average_win_prob(game_id)

average_score_diff(game_id)

Assist Networks

Traditional Assist Networks

assist_net(pbp, team, node_col, three_weights, threshold, message = NA, listing = T)

Circle Assist Networks and Player Highlighting

circle_assist_net(pbp, team, season, highlight_player, highlight_color, three_weights, message = NA, listing = T)

Shot Charts

wncaahoopR does not currently include the ability to plot shot location data, because ESPN does not currently provide this information for women's games as it does for some men's games.

Datasets

dict A data frame for converting between team names from various sites.

ids A data frame for converting between team names from various sites.

ncaa_colors A data frame of team color hex codes, pulled from teamcolorcodes.com. Additional data coverage provided by Luke Morris.

Primary and secondary colors for all 351 teams.

These datasets can be loaded by typing data("ids"), data("ncaa_colors"), or data("dict"), respectively.
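Because the team argument must match a name in ids, it can help to check a name before passing it to a function. The sketch below assumes that ids has a column called team. That column name is an assumption and is not confirmed by this README. The helper is_valid_team is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of wncaahoopR): check a name against a
# lookup table. Assumes the table has a `team` column.
is_valid_team <- function(team, ids_df) {
  team %in% ids_df$team
}

# Only touch the package data if wncaahoopR is actually installed.
if (requireNamespace("wncaahoopR", quietly = TRUE)) {
  data("ids", package = "wncaahoopR")
  print(is_valid_team("Notre Dame", ids))
}
```

If the column is named differently in your installed version, inspect the data with head(ids) and adjust the helper accordingly.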

Examples

Creating a PBP Object

ND_Mich <- w_get_pbp_game(401171130)

Win Probability Charts

wp_chart(ND_Mich)

wp_chart(ND_Mich, away_col = "#C99700")

Game Flow Chart

game_flow(ND_Mich, away_col = "#C99700")

Single-Game Assist Network

assist_net(ND_Mich, team = "Notre Dame")

Circle Assist Networks

circle_assist_net(ND_Mich, team = "Notre Dame")

circle_assist_net(ND_Mich, team = "Notre Dame", highlight_player = "Sam Brunelle", highlight_color = "#C99700")

Glossary

Play-by-Play files contain the following variables:



snestler/wncaahoopR documentation built on Oct. 18, 2021, 2:11 p.m.