In Swaha294/ipl: Data and functions for analysing IPL matches played in 2008-2020

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(ipl)
library(dplyr)
library(ggplot2)
library(purrr)

The ipl package aims at supplying data and functions for analysis of data on the Indian Premier League (IPL). It includes datasets on individual balls played in IPL games between 2008 and 2020, on the games themselves, and on the top 100 IPL batsmen and the top 100 IPL bowlers across time. Furthermore, the package includes simple functions on analysis of individual batsmen and bowlers, and for combined analysis of IPL matches.

Motivation

Cricket is a globally played sport, which extensively uses statistics for the analysis of games and individual players. Moreover, the IPL is a game that brings players from across the globe together, thus providing a unique perspective on how different cricketers play together. However, data providing information on these games and players, or functions which allow for statistical analyses of the same are not widely accessible. Right now, there is no consolidated place for all this data on the IPL. Thus, we wanted to create a package that makes this analysis of cricketers easier, since users do not have to worry about data wrangling, missing data, or important cricket statistics, and can, instead, begin their analyses on IPL.

Who should use this package?

Anyone interested in cricket, and specifically the Indian Premier League, who wants to use actual data and functions to think critically about the game, conduct analyses, and make inferences should use this package.

Datasets Included

deliveries: Ball-by-ball data of IPL matches played in 2008-2020
teams: Winning team, overs bowled, runs made and wickets fallen for each match played by each IPL team in 2008-2020
ipl: More information on matches from 2008 to 2020
batsman_100: Information of top 100 batsmen of IPL
bowlers_100: Informayion of top 100 bowlers of IPL

Functions Included

bat_avg \~ 134,112 B
bat_max \~ 113,136 B
batsman_summary \~ 130,616 B
bowler_score \~ 81,216 B
bowler_summary \~ 90,000 B
bowling_analysis \~ 93,216 B
cents_halfcents \~ 120,688 B
fours \~ 87,072 B
overs_balls \~ 90,600 B
overs \~ 77,240 B
partnership_runs \~ 93,000 B
runs \~ 76,984 B
sixes \~ 87,072
strike_rate \~ 118,520 B
toss_choice \~ 87,016 B
wickets_taken \~ 77,888 B
winloss \~ 120,768 B

for example:

# The batting average of Virat Kohli and MS Dhoni in the years 2016 and 2017 can be computed by
bat_avg(c("V Kohli", "MS Dhoni"), 2016:2017)

# The maximum runs made by MS Dhoni in the year 2017 can be computed by
bat_max("MS Dhoni", 2017)

# The strike rate of AB de Villiers in the year 2015 can be computed by
strike_rate("AB de Villiers", 2015)

# The total distribution of  the toss choice of Royal Challengers Bangalore can be computed using
toss_choice("Royal Challengers Bangalore")

# The list of bowlers with a bowling average above 40 can be retrieved using
bowler_score(40)

# The batting summary of Virate Kohli can be accessed by
batsman_summary("V Kohli")

# The runs made by each batting partnership by Delhi Daredevils in their match against Rajasthan Royals in 2008 can be found by
partnership_runs(335984, "Delhi Daredevils")

# The bowling analysis of Harbhajan Singh can be accessed by
bowling_analysis("Harbhajan Singh")

# The summary table of the wins and losses of Kolkata Knight Riders in 2012 can be found by
winloss("Kolkata Knight Riders", 2012)

More details can be found on the documentation pages which can be called via: ?function_name

What does the data look like

The first six rows of the deliveries data set looks like

head(deliveries)

The first six rows of the teams data set looks like

head(teams)

The first six rows of the bowlers_100 data set looks like

head(bowlers_100)

What can we do with this data?

We can use this package to address the (non-exhaustive) list of questions:

Compare the batting averages of batsmen over time using a data visualization
Compare strike rate and maximum runs of batsmen using a data visualization
Run a regression to predict strike rate based on different batting statistics
The bowling analysis of different bowlers can be compared

Example 1

bat_avg(c("KL Rahul", "RG Sharma", "V Kohli"), 2013:2020) %>% 
  ggplot(aes(year, batting_avg, color = batsman)) +
  geom_line() +
  labs(
    title = "Batting average between 2013 and 2020",
    x = "Year",
    y = "Batting Average",
    color = "Batsman"
  ) +
  theme_linedraw()

Example 2

my_df <- map_df(c("MS Dhoni", "V Kohli", "AB de Villiers"), batsman_summary)

ggplot(aes(max_runs, strike_rate), data = my_df) +
  geom_point() +
  labs(
    x = "Maximum Runs",
    y = "Strike Rate",
    title = "Strike Rate and Maximum Runs"
  ) +
  theme_linedraw()

Example 3

my_df2 <- my_df[, -1]

my_lm <- lm(strike_rate ~ ., data = my_df2)

summary(my_lm)

Note that this example is not accurate since it contains a selection bias as only three players are used in the data frame. To return a regression with accurate results, data on all the batsmen should be used.

Example 4

map_df(c("Rahul Sharma", "Harbhajan Singh"), bowling_analysis)

Swaha294/ipl documentation built on May 10, 2022, 3:23 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Swaha294/ipl
Data and functions for analysing IPL matches played in 2008-2020

In Swaha294/ipl: Data and functions for analysing IPL matches played in 2008-2020

Motivation

Who should use this package?

Datasets Included

Functions Included

What does the data look like

What can we do with this data?

Example 1

Example 2

Example 3

Example 4

R Package Documentation

Browse R Packages

We want your feedback!

Swaha294/ipl Data and functions for analysing IPL matches played in 2008-2020

In Swaha294/ipl: Data and functions for analysing IPL matches played in 2008-2020

Motivation

Who should use this package?

Datasets Included

Functions Included

What does the data look like

What can we do with this data?

Example 1

Example 2

Example 3

Example 4

R Package Documentation

Browse R Packages

We want your feedback!

Swaha294/ipl
Data and functions for analysing IPL matches played in 2008-2020