knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-",
  warning = FALSE,
  message = FALSE,
  eval = FALSE
)
The npncaa package scrapes NCAA bracket data and team statistics from https://www.sports-reference.com/.
You can install npncaa from GitHub with:
# install.packages("devtools")
devtools::install_github("nickpaul7/npncaa")
To collect data for a given year, use the code below:
library(npncaa)

year <- 2019
df_bracket <- get_bracket_data(year)
The get_bracket_data() function returns a data frame, which you can inspect with glimpse():
library(tidyverse)

glimpse(df_bracket)
You can also collect data for several years at once. The example below covers 1993 to 2019:
library(tidyverse)
library(npncaa)

years <- 1993:2019
df_all_years <- purrr::map_df(years, get_bracket_data)
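As a rough illustration of what the multi-year call does, the sketch below stacks one data frame per year using base R only. The function fake_bracket_data() and its columns are hypothetical stand-ins, not part of npncaa, so that the example runs without scraping anything.

```r
# Hypothetical stand-in for get_bracket_data(): returns one small data frame
# per year without touching the network. Column names are invented.
fake_bracket_data <- function(year) {
  data.frame(
    year = year,
    seed = c(1L, 16L),
    team = c("Team A", "Team B"),
    stringsAsFactors = FALSE
  )
}

years <- 1993:1995

# lapply() yields a list of data frames; do.call(rbind, ...) stacks them,
# which is roughly what purrr::map_df() does in a single step.
per_year <- lapply(years, fake_bracket_data)
df_stacked <- do.call(rbind, per_year)

nrow(df_stacked)         # two rows for each of the three years
unique(df_stacked$year)  # the years present in the stacked result
```

The real get_bracket_data() returns whatever columns the package defines; only the stacking pattern carries over.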
# year <- 2019
# url <- create_team_stats_url(year)
# url_opponent_advanced <- create_team_stats_url(year, opponent = TRUE, advanced = TRUE)
# url
Get team stats for a particular year.
Not all years have all options, but the function will alert you when this happens.
# year <- 1955
# url_opponent_advanced <- create_team_stats_url(year, opponent = FALSE, advanced = FALSE)
# url_opponent_advanced
df_team_stats <- ncaa_team_stats(2019)
df_advanced <- ncaa_team_stats(2019, advanced = TRUE)
df_opponent <- ncaa_team_stats(2019, opponent = TRUE)
df_advanced_opponent <- ncaa_team_stats(2019, advanced = TRUE, opponent = TRUE)
When the data does not exist, the ncaa_team_stats() function returns an empty data frame. This makes it easy to pull data over a range of years without knowing in advance which years are available.
df_advanced_opponent <- ncaa_team_stats(1955, advanced = TRUE, opponent = TRUE)
years <- 1993:2019
df_team_stats_1993_2019 <- purrr::map_df(years, ncaa_team_stats)
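The sketch below illustrates why an empty data frame is convenient when looping over years: empty results simply add no rows when the per-year results are stacked. The function fake_team_stats(), its columns, and the 1993 cutoff are hypothetical stand-ins chosen for illustration. It also assumes the empty result keeps the same columns as a normal one, which may not match what ncaa_team_stats() actually returns.

```r
# Hypothetical stand-in for ncaa_team_stats(): years before 1993 are treated
# as "missing" and yield a zero-row data frame with the same columns.
fake_team_stats <- function(year) {
  if (year < 1993) {
    return(data.frame(year = integer(0), team = character(0),
                      wins = integer(0), stringsAsFactors = FALSE))
  }
  data.frame(year = year, team = c("Team A", "Team B"),
             wins = c(20L, 15L), stringsAsFactors = FALSE)
}

years <- 1991:1994
results <- lapply(years, fake_team_stats)

# Zero-row frames add nothing when stacked, so no special-casing is needed.
df_stats <- do.call(rbind, results)

# Years that came back empty can still be identified afterwards.
rows_per_year <- vapply(results, nrow, integer(1))
missing_years <- years[rows_per_year == 0L]
```

Checking missing_years after a long run is one way to see where the available data begins or ends.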
df_ml <- df_all_years %>%
  add_team_stats(df_team_stats_1993_2019) %>%
  select_features() %>%
  select(diff)