Datasets of the English Premier League 1992-2018
This package is a repository of datasets relating to the English Premier League (football/soccer) from its inception in August 1992 through to the end of the 2017/18 season. The intention is to update it annually, shortly after each season ends in May.
None of the data is official and there are sure to be a few, hopefully trivial, errors. Some data, e.g. transfer fees, are estimates.
There are nine data sets loosely structured around the idea of a relational SQL database, so there is no duplicated data and many joins are required to make full use of the figures. The data has been compiled over more than 25 years, so it has some bad practices built in, but these should not unduly detract from usage.
Currently the package is not on CRAN
# Install from GitHub
devtools::install_github("pssguy/epldata")
# View datasets and functions
help(package="epldata")
# Load dataset
library(epldata)
data(players)
A lot of joins between tables are necessary and you may find it useful to create derived data.frames if you plan to use the data extensively. Examples are covered in the Vignette
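As a rough illustration of the join-then-derive pattern described above, here is a minimal base R sketch. It uses two small made-up data frames rather than the package's own tables, and the column names (`player_id`, `name`, `goal_id`) are assumptions chosen for the example, not the package's actual schema.

```r
# Toy stand-ins for two related tables. The names and columns are
# hypothetical and do not reflect the real epldata schema.
players <- data.frame(
  player_id = c(1, 2, 3),
  name      = c("Player A", "Player B", "Player C"),
  stringsAsFactors = FALSE
)

goals <- data.frame(
  goal_id   = 1:5,
  player_id = c(1, 1, 2, 1, 2)
)

# Count goals per player_id (one row per scorer).
goal_counts <- aggregate(goal_id ~ player_id, data = goals, FUN = length)
names(goal_counts)[2] <- "goals"

# Left join onto players so that players without goals are kept.
player_goals <- merge(players, goal_counts, by = "player_id", all.x = TRUE)

# Players with no matching goals come back as NA; treat that as zero.
player_goals$goals[is.na(player_goals$goals)] <- 0L

print(player_goals)
```

Saving a derived table like `player_goals` once and reusing it avoids repeating the same joins in every analysis; the same steps could equally be written with dplyr's join functions.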
I have included a couple of example functions within the package
In order to make full use of the data you may want to consider the following packages which epldata does not depend on
There are many others - too many to mention - which I have used on a less frequent basis
Although the data is basic information, the availability of so many rich packages and the quantity of data mean that a wide range of output, in both form and content, is possible; it really depends on the imagination of the developer. I have included some examples in a vignette, but there are far more output examples, with code, based on derived tables on the mytinyshinys blog.
It can be used as a fun way to introduce students to coding in R and producing visualizations, using data related to what is probably the most popular sports league worldwide.
Other uses might include
Here are some real-world examples of the output
Please let me know of any interesting usage of the package and I will list them here
I am not aware of any comparable non-commercial data. I was collecting certain aspects of the data, including assists and goal descriptions, well before any official adoption.
The engsoccerdata package, authored and maintained by James Curley, makes a good complement. It has a far broader scope, both temporally and geographically, as it provides league match results for many of the English divisions back into the 19th century as well as the leading leagues of many other nations. It also includes Cup data. However, it does not have the depth of this package, as it has no player or goal information.
Examples of open-source datasets in other sports fields include the Lahman baseball database and the deuce tennis package.
I would like to thank my brother, Stuart Clark, for providing all the goal and assist data for many years. The soccerbase web site has also been a great reference source.
In addition, I thank the developers and maintainers of all the R packages I have used, with pride of place going to the RStudio team and to Carson Sievert for the rOpenSci plotly package.