scrape_game: Individual Game Play-By-Play Scraper

View source: R/all_functions.R

scrape_game    R Documentation

Individual Game Play-By-Play Scraper

Description

This function retrieves and cleans play-by-play information for an individual game. It also warns users about potential errors made by the game trackers. The player discrepancy warning reports the number of events credited to players who were found not to be on the court at the time of the event. The substitution mistake warning indicates that an unclean substitution was entered (e.g. 2 players enter and 1 leaves).
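The idea behind the player discrepancy count can be pictured with a minimal sketch on a hand-made data frame. The rows and player names below are made up, and `count_discrepancies` is a hypothetical helper rather than the package's internal check; it only assumes the `Player_1`, `Home.1` through `Home.5`, and `Away.1` through `Away.5` columns described under Value.

```r
# Hypothetical helper: counts events whose Player_1 is not among the ten
# players listed as on the court in that row. Illustration only; this is
# not the check bigballR itself runs.
count_discrepancies <- function(pbp) {
  lineup_cols <- c(paste0("Home.", 1:5), paste0("Away.", 1:5))
  on_court <- as.matrix(pbp[, lineup_cols])
  # TRUE when the row names a player who is absent from the row's lineups
  off_court <- !is.na(pbp$Player_1) &
    rowSums(on_court == pbp$Player_1, na.rm = TRUE) == 0
  sum(off_court)
}

# Made-up events: "A9" is credited with an event but is not on the court.
pbp <- data.frame(Player_1 = c("A1", "B3", "A9", NA),
                  stringsAsFactors = FALSE)
for (i in 1:5) {
  pbp[[paste0("Home.", i)]] <- paste0("A", i)
  pbp[[paste0("Away.", i)]] <- paste0("B", i)
}

n_bad <- count_discrepancies(pbp)
```

Rows with no player attached (`NA` in `Player_1`) are skipped here, which is a choice made for this sketch.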

Usage

scrape_game(
  game_id,
  save_file = F,
  use_file = F,
  base_path = NA,
  overwrite = F
)

Arguments

game_id

String of digits that identifies a unique game. It can be found in the play-by-play URL for each game.

save_file

Boolean. If TRUE, save the HTML for the game to a local file. The file path is constructed from 'base_path'.

use_file

Boolean. If TRUE, retrieve the game from local storage rather than from the URL. The file path is constructed from the 'base_path' directory.

base_path

String. Base directory where the HTML files are saved, e.g. "/Users/jake/html_files/"

overwrite

Boolean. If TRUE, the saved file overwrites an existing file at the same path. Otherwise, the existing file is read (if use_file = T).
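As a sketch of how the file arguments might be combined, the wrapper below keeps a local HTML copy of a game and reuses it on later calls. `scrape_game_cached` and `cache_dir` are hypothetical names introduced for illustration, and the comments assume the arguments behave as described above; what happens when `use_file` is TRUE and no local file exists yet is not stated here.

```r
# Hypothetical wrapper: scrape a game while keeping a local HTML copy.
# Assumes bigballR is installed; nothing is fetched until it is called.
scrape_game_cached <- function(game_id, cache_dir) {
  bigballR::scrape_game(
    game_id,
    save_file = TRUE,    # store the HTML for the game under cache_dir
    use_file = TRUE,     # read from local storage rather than the URL
    base_path = cache_dir,
    overwrite = FALSE    # do not replace a file that already exists
  )
}

# Example call (needs the bigballR package, and network access on first use):
# pbp <- scrape_game_cached("4674164", "/Users/jake/html_files/")
```

The trailing slash in the example path mirrors the `base_path` example given above.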

Value

Data frame containing play-by-play data for a game, where each row represents an individual event from the game.

  • ID - Numeric game id that is given for each unique game

  • Date - Game date

  • Home - Home team name

  • Away - Away team name

  • Time - String for game time in format reported originally by NCAA

  • Game_Time - String reporting game time elapsed. For example, a normal game starts at 00:00 and ends at 40:00

  • Game_Seconds - Number of seconds elapsed in the game

  • Half_Status - Number indicating which half: 1,2 for regulation, 3+ for OTs

  • Home_Score - Score for home team after event occurred as reported by NCAA

  • Away_Score - Score for away team after event occurred as reported by NCAA

  • Event_Team - Which team was responsible for the play-by-play entry

  • Event_Description - The text description of the event on the NCAA site

  • Player_1 - The primary player responsible for the event

  • Player_2 - The secondary player within an event. As of now this is only the assister on a made shot.

  • Event_Type - String representing the event that occurred as reported by the description

  • Event_Result - Takes values of made/missed for shot attempts, otherwise NA

  • Shot_Value - Numeric value of points awarded by shot type. Ranges from 1-3 for shot attempts, otherwise NA

  • Event_Length - Estimate of the time elapsed before the event, calculated as the event time minus the previous event's time

  • Home.1 - One of the players on the court for the home team

  • Home.2 - One of the players on the court for the home team

  • Home.3 - One of the players on the court for the home team

  • Home.4 - One of the players on the court for the home team

  • Home.5 - One of the players on the court for the home team

  • Away.1 - One of the players on the court for the away team

  • Away.2 - One of the players on the court for the away team

  • Away.3 - One of the players on the court for the away team

  • Away.4 - One of the players on the court for the away team

  • Away.5 - One of the players on the court for the away team

  • Status - Reports the cleanliness of the data frame. CLEAN if no errors are found. Otherwise it reports the number of errors that occurred or that a potential substitution mistake occurred.
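A few of the columns above can be illustrated on a small hand-made data frame. The rows are made up and stand in for real output; the lowercase `"made"` value and the choice of 0 for the first event's gap are assumptions of this sketch, not confirmed package behaviour.

```r
# Made-up rows standing in for scrape_game() output (illustration only).
pbp <- data.frame(
  Game_Seconds = c(0, 12, 30, 30, 55),
  Event_Result = c(NA, "made", "missed", NA, "made"),
  Shot_Value   = c(NA, 2, 3, NA, 1),
  Status       = "CLEAN",
  stringsAsFactors = FALSE
)

# Check the Status column before relying on the on-court player columns.
is_clean <- all(pbp$Status == "CLEAN")

# Seconds since the previous event, in the spirit of Event_Length. The
# first event has no predecessor, so 0 is used here (an assumption).
event_gap <- c(0, diff(pbp$Game_Seconds))

# Total points from made shots; non-shot events carry NA and are skipped.
made <- !is.na(pbp$Event_Result) & pbp$Event_Result == "made"
points <- sum(pbp$Shot_Value[made])
```

Testing `Status` first is a cheap guard: lineup-based summaries are only as reliable as the substitution data behind them.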

Examples

scrape_game(4674164)

jflancer/bigballR documentation built on March 1, 2025, 3:57 a.m.