#' Draft URL Scraper
#'
#' Returns data for all players listed on the given draft-year page.
#' @param Data The URL of the draft-year page on eliteprospects.com.
#' @param Agerange Length-2 numeric vector giving the desired age range: the first number is the lower bound, the second is the upper bound. This is applied to all players.
#' @param draft.year Boolean about whether to include the draft year of the player.
#' @param draft.pick Boolean about whether to include the draft pick used on the player.
#' @param round Boolean about whether to include the round the player was drafted in.
#' @param Agerel Determines how the exact age included in the data frame is calculated. The default is 9/15 because this is the cutoff date for draft eligibility. For example, for the 2019-2020 season, if Agerel is 9/15, the age in the data frame will be the player's age on 9/15/2020. The offset ensures that in a player's first draft-eligible year, their age will be >18.
#' @param Goalie Boolean about whether information on goalies is wanted. If true, output will be a list of data frames, with one for players, and one for goalies. Currently this doesn't do anything, as there isn't a goalie scraper function built out yet.
#' @param position Boolean about whether to include the position of the player.
#' @param shoots Boolean about whether to include the handedness of the player.
#' @param Stats vector of the wanted stats.
#' S - Season,
#' Team - Team,
#' League - League,
#' GP - Games Played,
#' G - Goals,
#' A - Assists,
#' TP - Total Points,
#' PIM - Penalties in Minutes,
#' +/- - Plus/Minus,
#' sv% - Save Percentage (for goalies),
#' GAA - Goals Against Average (for goalies)
#' @param place.birth Boolean about whether to include the birthplace of the player.
#' @param pbsep Boolean about whether the birthplace should be split into Country, State, and City. place.birth has to be true for this to matter.
#' @param country Boolean about whether to include the country the player represents (or would represent) in international tournaments. Currently just grabs the first if a player has multiple, but eliteprospects is generally good about putting the correct one first.
#' @param height Boolean about whether to include the height of the player. This is in centimeters.
#' @param weight Boolean about whether to include the weight of the player. This is in pounds.
#' @param date.birth Boolean about whether to include the date of birth of the player.
#' @param dbsep Boolean about whether the date of birth should be separated into year, month, and day. date.birth has to be true for this to matter.
#' @param drafted.team Boolean about whether to include the team who drafted the player.
#' @param reg.playoffs Determines if regular season data, playoff data, or both will be returned for all the players. Currently only 'R' works.
#' @return data frame (or list of data frames) with data from all the players drafted in the given year.
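#' @examples
#' \dontrun{
#' # A minimal usage sketch, not a tested call. The URL below is an assumed
#' # example of a draft-year page; substitute the page you actually want to scrape.
#' # Only arguments documented above are used; the rest keep their defaults.
#' draft_url <- "https://www.eliteprospects.com/draft/nhl-entry-draft/2019"
#' draft_players <- EP_Draft_Scraper(draft_url,
#'                                   Agerange = c(17, 20),
#'                                   Stats = c("S", "Team", "League", "GP", "G", "A", "TP"),
#'                                   pbsep = FALSE)
#' head(draft_players)
#' }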
#' @export
EP_Draft_Scraper <- function(Data, Agerange = c(17, 25), draft.year = TRUE, draft.pick = TRUE, round = TRUE,
                             Agerel = "9/15", Goalie = FALSE, position = TRUE, shoots = TRUE,
                             Stats = c("S", "Team", "League", "GP", "G", "A", "TP", "PIM", "+/-", "sv%", "GAA"),
                             place.birth = TRUE, pbsep = TRUE, country = TRUE, height = TRUE, weight = TRUE, date.birth = TRUE,
                             dbsep = TRUE, drafted.team = TRUE, reg.playoffs = 'R') {
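  # Collect every href on the draft page, then keep only the player-profile links.
  # The first 300 matches are skipped (hard-coded offset, presumably site navigation links).
  links <- paste(readLines(Data), collapse = "\n") %>%
    stringr::str_match_all("<a href=\"(.*?)\"") %>%
    magrittr::extract2(1) %>%
    .[-(1:300), 2] %>%
    .[grep('player', .)]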
  # Find the row positions of goalies in the draft table: drop round-header and
  # "No selection was made" rows, then match position strings starting with "G".
  goalie_spots <- xml2::read_html(Data) %>%
    rvest::html_nodes("table") %>%
    rvest::html_table(header = TRUE, fill = TRUE) %>%
    magrittr::extract2(2) %>%
    dplyr::filter(!Team %in% paste('ROUND', 1:15) &
                    Player != 'No selection was made') %>%
    tidyr::separate(Player, c('Name', 'Position'), '\\(', fill = 'right') %$%
    Position %>%
    substr(1, 1) %>%
    grep("G", .)
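  # links[-integer(0)] returns an empty vector, so only drop goalies when some were found.
  player_links <- if (length(goalie_spots) > 0) links[-goalie_spots] else links
  if (Goalie) {
    # Not used yet: there is no goalie scraper function to pass these to.
    goalie_links <- links[goalie_spots]
  }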
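  # Scrape the first player once, then filter out every row to get an empty data
  # frame with the right columns (assumes no Season value equals 'F').
  # Note: draft.elig is not an argument of this function, so it is looked up in the enclosing environment.
  player_template <- EP_Ind_Scraper(player_links[1], Agerange, draft.year, draft.pick, round, draft.elig, Agerel, position,
                                    shoots, Stats, place.birth, pbsep, country, height, weight, date.birth, dbsep, drafted.team, reg.playoffs)
  player_data <- player_template %>%
    dplyr::filter(Season == 'F')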
  # Scrape each player page and append its rows to the accumulated data frame.
  for (link in player_links) {
    temp <- EP_Ind_Scraper(link, Agerange, draft.year, draft.pick, round, draft.elig, Agerel, position,
                           shoots, Stats, place.birth, pbsep, country, height, weight, date.birth, dbsep, drafted.team, reg.playoffs)
    player_data <- player_data %>%
      rbind(temp)
  }
  player_data
}