This package is designed for NBA enthusiasts! It
scrapes online tabular data from the ESPN NBA website
into a csv file. It also includes various functions for creating graphs and
statistical analyses (such as boxplots, player
rankings by stats, and a summary statistics table).
An example of the ESPN NBA 2018/19 Regular season player stats can be found in the following url:
rsketball::nba_scraper is based on Selenium (specifically
RSelenium), which enables automated web browsing through “drivers”. To
use it, please ensure that
Docker is installed.
For installation instructions, please follow the Docker installation guide for your OS. Docker will be used to pull the relevant Chrome driver image, which, when run as a container, serves as the “driver” for Selenium.
The following steps are required only for the
nba_scraper function. If
you already have the scraped data file and wish to use the other
rsketball functions (such as nbastats), there is no need to
proceed with these steps.
Step 1 (Command line/Terminal): Preparation Step (Docker container)
Pull the Docker image with the following command in Terminal. We will stick to Chrome, since it appears to work on Windows while Firefox does not.
docker pull selenium/standalone-chrome
Critical step about setting ports and memory allocation:
We need to map the Docker container's default port 4444 to port 4445 on
the host computer. Keep this host port number as the port input for the
nba_scraper function. We will also set the container's shared memory
(--shm-size) to 2 GB so that Chrome can scrape effectively.
Run the following command in Terminal:
docker run -d -p 4445:4444 --shm-size 2g selenium/standalone-chrome
Verify that the Docker container is running with the following command in Terminal:
docker ps
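As a slightly narrower check, the sketch below lists only the containers started from the selenium/standalone-chrome image, along with their status and port mapping. The helper names have_cmd and check_selenium_container are hypothetical and are not part of rsketball; the sketch assumes the image was pulled under the name used in Step 1.

```shell
#!/bin/sh
# Sketch only: helper names below are hypothetical, not part of rsketball.

# Return success if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# List running containers started from the Selenium Chrome image,
# showing the container ID, status, and port mapping.
check_selenium_container() {
  if ! have_cmd docker; then
    echo "docker not found on PATH; install Docker first" >&2
    return 1
  fi
  docker ps --filter "ancestor=selenium/standalone-chrome" \
    --format "{{.ID}}  {{.Status}}  {{.Ports}}"
}
```

Calling check_selenium_container after the docker run step should print one line for the container if it is up; the port column is where the 4445 to 4444 mapping would appear. An empty result suggests the container is not running.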
Step 2 (R/RStudio): Scraping with nba_scraper
Now that the container is running with the allocated memory and assigned port, we can proceed with testing nba_scraper.
library(rsketball)
# Scrape the 2017/18 postseason while saving to a local csv file.
nba_2017_playoffs <- nba_scraper(season_year = 2017, season_type = "postseason", port = 4445L, csv_path = "nba_2017_playoffs.csv")
If everything was executed as intended, you should obtain a csv file
called “nba_2017_playoffs.csv” containing the scraped data, and a
tibble in your R environment named “nba_2017_playoffs”. With the
tibble, you can use the other
rsketball functions for your analysis.
Step 3 (Command line/Terminal): Termination of Docker Container
After test scraping is completed, we can shut down the Docker container. This frees the memory and other resources it was using. Note that the command below stops every running container, not only the Selenium one.
docker stop $(docker ps -q)
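If other containers are running on the same machine, a narrower variant may be preferable. The following is a minimal sketch that stops only containers created from the selenium/standalone-chrome image; stop_selenium_containers is a hypothetical helper name, not something provided by rsketball or Docker.

```shell
#!/bin/sh
# Sketch only: stop_selenium_containers is a hypothetical helper name.

# Stop only containers created from the Selenium Chrome image,
# leaving any other running containers untouched.
stop_selenium_containers() {
  ids=$(docker ps -q --filter "ancestor=selenium/standalone-chrome")
  if [ -z "$ids" ]; then
    echo "no running selenium/standalone-chrome containers found"
    return 0
  fi
  # $ids is intentionally unquoted so each container ID becomes its own argument.
  docker stop $ids
}
```

Running stop_selenium_containers in Terminal after scraping would then leave unrelated containers alone.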
If you wish to, you can also remove the Docker image from your computer, where <image_id> represents the id of your Docker image.
docker image rm <image_id>
To test the package functions, please refer to the instructions in the README.md located in the testing subdirectory.
The rsketball package aims to further the understanding of ESPN NBA
data and does not have a specific fit within the R ecosystem. There are
currently some other packages
that take data from other sources (NBA Stats API, Basketball Insiders,
Basketball-Reference, HoopsHype, and RealGM), but no package that we
currently know of takes data from ESPN NBA specifically.
rsketball is still in development. We estimate that by the end of
March 2020, one can install the released version of rsketball.
Package installation in R:
And the development version from GitHub with: