View source: R/scraping_rosters_euro.R
scraping_rosters_euro | R Documentation |
This function obtains the basic information of each Euroleague/Eurocup player, including his birth date. With the birth date, we can compute the age of each player on the date of each game he played. The websites used to collect the information are https://www.euroleaguebasketball.net/euroleague/ and https://www.euroleaguebasketball.net/eurocup/.
scraping_rosters_euro(competition, pcode, year, verbose = TRUE,
r_user = "guillermo.vinue@uv.es")
competition |
String. Options are "Euroleague" and "Eurocup". |
pcode |
Code corresponding to the player's website to scrape. |
year |
Year when the season starts. 2017 refers to 2017-2018 and so on. |
verbose |
Should R report information on progress? Default TRUE. |
r_user |
Email address that identifies the user when doing web scraping. This is a polite way to do web scraping and to certify that the user is working as transparently as possible with a research purpose. |
Data frame with seven columns:
CombinID: Unique ID to identify the players.
Player: Player's name.
Position: Player's position on the court.
Height: Player's height.
Date_birth: Player's birth date.
Nationality: Player's nationality.
Website_player: Player's website.
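As an illustration of how Date_birth can be turned into the age a player had on a game date, here is a minimal sketch in base R. The helper name age_on_date and the ISO "YYYY-MM-DD" input format are assumptions made for this example; the format in which the function actually returns Date_birth is not stated here, so the column may need to be parsed with as.Date() and a suitable format string first.

```r
# Hypothetical helper (not part of the package): age in completed years
# on the day of a game, given a birth date and a game date.
age_on_date <- function(birth, game) {
  b <- as.POSIXlt(as.Date(birth))
  g <- as.POSIXlt(as.Date(game))
  # Difference in calendar years.
  age <- g$year - b$year
  # Subtract one year if the birthday has not yet occurred in the game year.
  before_birthday <- (g$mon < b$mon) | (g$mon == b$mon & g$mday < b$mday)
  age - as.integer(before_birthday)
}

# Illustrative dates only, not real player data.
age_on_date("1990-03-15", "2017-10-12")
age_on_date("1990-11-20", "2017-10-12")
```

The function is vectorized, so it can be applied to a whole column of birth dates and a matching column of game dates.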
In addition to using the email address to stay identifiable, the function also sets two headers reporting the R platform and version used.
In the robots.txt file located at https://www.euroleaguebasketball.net/robots.txt there is no Crawl-delay field. However, we assume that crawlers should pause for 15 seconds between requests. This is done by adding the command Sys.sleep(15) to the function.
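The polite-scraping practices described in this note can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper names build_polite_headers and fetch_politely, the header names From and User-Agent, and the way the platform and version are combined are all hypothetical. The fetch argument stands for whatever function performs the actual request.

```r
# Hypothetical helper: identifying headers built from the user's email
# and the R platform and version (header names are assumptions).
build_polite_headers <- function(r_user) {
  c(From = r_user,
    "User-Agent" = paste(R.version$platform, R.version$version.string, sep = ", "))
}

# Hypothetical helper: call `fetch` on each URL, pausing `delay` seconds
# between consecutive requests (15 seconds by default, as in the note).
fetch_politely <- function(urls, fetch, delay = 15) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    if (i > 1) Sys.sleep(delay)  # no pause before the first request
    results[[i]] <- fetch(urls[[i]])
  }
  results
}

# Usage with a stand-in fetch function that does not touch the network.
headers <- build_polite_headers("guillermo.vinue@uv.es")
pages <- fetch_politely(c("page_a", "page_b"), function(u) toupper(u), delay = 0)
```

Keeping the delay as an argument makes it easy to honor a Crawl-delay value if the site adds one to its robots.txt in the future.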
Guillermo Vinue
do_scraping_rosters
## Not run:
# Not needed to scrape every time the package is checked, built and installed.
# It takes 15 seconds.
df_bio <- scraping_rosters_euro("Euroleague", "005791", "2017", verbose = TRUE,
r_user = "guillermo.vinue@uv.es")
## End(Not run)