In RobinHankin/hyper2: The Hyperdirichlet Distribution, Mark 2

knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))

To cite the hyper2 package in publications, please use @hankin2017_rmd. This script creates and analyses the hyper2 object T20, which is a likelihood function for the strengths of the competitors in the Indian Premier League. Dataframe T20_table has one row for each T20 IPL match in the period 2008-2017, with the exception of three no-result matches and seven tied matches, which were removed.

library("hyper2",quietly=TRUE)
library("magrittr",quietly=TRUE)
T20_table <- read.table("T20.txt",header=TRUE)
head(T20_table)
nrow(T20_table)

Object T20 is a likelihood function for the strengths of the 13 teams, and T20_toss is a likelihood function that also includes a toss strength term. Some details are given in the package help file T20.Rd (type ?T20 at the R prompt).

One-way univariate analysis

What is the probability of a team winning the match, given that they won the toss?

x <- table(T20_table$toss_winner==T20_table$match_winner)
x
binom.test(x,alternative="less")$p.value
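The direction of this one-sided test is easy to misread, so here is a minimal sketch using made-up counts (x_demo is hypothetical and not taken from T20_table). Given a length-2 vector, binom.test() treats the first entry as the number of successes and the second as the number of failures. Because table() orders FALSE before TRUE, a "success" here is the toss winner losing, and alternative="less" asks whether that probability is below 0.5.

```r
# Hypothetical counts, for illustration only (not the IPL data):
# first entry = toss winner lost the match, second = toss winner won
x_demo <- c(45, 55)

# First element is taken as 'successes', i.e. the toss winner LOSING
p_less <- binom.test(x_demo, alternative = "less")$p.value

# Equivalent formulation: count the toss winner WINNING and test 'greater'
p_greater <- binom.test(x_demo[2], sum(x_demo), alternative = "greater")$p.value

# The two p-values coincide by the symmetry of the binomial at p = 0.5
abs(p_less - p_greater) < 1e-12
```

Both calls test the same hypothesis, which is why the original call with alternative="less" is the appropriate one-sided test for a toss advantage.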

The binomial test against a null of $p=0.5$ is thus not significant: there is no evidence that winning the toss increases a team's probability of winning the match. Now we consider the decision (to bat first or to field first) made by the toss winner:

table(T20_table$toss_decision)

A majority of toss winners elect to field first. We can now ask how often the team batting first wins the match:

attach(T20_table)
bat_first_wins <-
  ((toss_winner==match_winner)&(toss_decision=='bat'  )) |
  ((toss_winner!=match_winner)&(toss_decision=='field'))
sum(bat_first_wins)
field_first_wins <-
  ((toss_winner==match_winner)&(toss_decision=='field')) |
  ((toss_winner!=match_winner)&(toss_decision=='bat'  ))
sum(field_first_wins)
detach(T20_table)

As a consistency check, we have

sum(field_first_wins) + sum(bat_first_wins)
nrow(T20_table)

The two totals agree, as expected.
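The logic behind bat_first_wins can be checked on a small example. The sketch below uses a toy data frame of made-up matches (toy and its contents are hypothetical) and with() rather than attach(). It relies on the observation that the side batting first wins exactly when "the toss winner won the match" and "the toss winner chose to bat" are both true or both false.

```r
# Toy data frame with made-up matches (hypothetical, not the IPL data)
toy <- data.frame(
  toss_winner   = c("A", "B", "A", "C"),
  match_winner  = c("A", "A", "A", "C"),
  toss_decision = c("bat", "field", "field", "bat"),
  stringsAsFactors = FALSE
)

# Side batting first wins iff the two logical conditions agree
toy_bat_first_wins <- with(toy,
  (toss_winner == match_winner) == (toss_decision == "bat"))

# Same quantity written out as in the main text
toy_check <- with(toy,
  ((toss_winner == match_winner) & (toss_decision == "bat"  )) |
  ((toss_winner != match_winner) & (toss_decision == "field")))

identical(toy_bat_first_wins, toy_check)
mean(toy_bat_first_wins)  # observed proportion of wins for the side batting first
```

Applied to T20_table, mean(bat_first_wins) would give the corresponding observed proportion for the IPL data.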

Two-way contingency tables

We can cross-tabulate the toss decision against whether the toss winner went on to win the match:

M <- table(
  decision           = T20_table$toss_decision,
  won_toss_won_match = T20_table$toss_winner==T20_table$match_winner)
dimnames(M) <- list(
  decision  = c('bat first','field first'),
  won_match = c(FALSE,TRUE))
M

We may reject homogeneity of proportion (Fisher's exact test, $p=$ `r round(fisher.test(M)$p.value,3)`). Recalling that a majority of toss winners elect to field first (361 out of 633), we see that this is a consistent result.
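To see what "homogeneity of proportion" means for a table shaped like M, the following sketch uses a made-up 2x2 table (M_demo is hypothetical and its counts are not the IPL counts). Row-wise proportions show the win rate of the toss winner under each decision, and Fisher's exact test asks whether those two rates could plausibly be equal.

```r
# Hypothetical 2x2 counts, for illustration only (not the IPL data)
M_demo <- matrix(c(12, 8, 9, 15), nrow = 2,
                 dimnames = list(decision  = c("bat first", "field first"),
                                 won_match = c("FALSE", "TRUE")))

# Proportion of losses/wins for the toss winner within each decision
P_demo <- prop.table(M_demo, margin = 1)
P_demo

# Two-sided Fisher's exact test of equal win rates across the two rows
p_demo <- fisher.test(M_demo)$p.value
p_demo
```

Running prop.table(M, margin = 1) on the real table gives the two win rates that the test in the text compares.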

Strengths of the individual teams

We can now consider the matches as Bernoulli trials and try to infer the teams' strengths. The first step is to calculate a likelihood function for the strengths:

attach(lapply(T20_table[, ], as.vector))  # vector access to team1, etc
T20 <- hyper2()
for(i in seq_along(team1)){
   T20[match_winner[i]] %<>% inc          # winner appears in the numerator
   T20[c(team1[i],team2[i])] %<>% dec     # both teams appear in the denominator
}
T20

and maximization is straightforward:

T20_maxp <- maxp(T20)
dotchart(T20_maxp)
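For reference, the likelihood built by the loop above has the Bradley-Terry form. In the sketch below the notation is introduced here for illustration and is not taken from the package documentation: match $i$ is played between teams $a_i$ and $b_i$ and won by $w_i\in\{a_i,b_i\}$, and $p_j$ is the strength of team $j$. Each inc contributes a factor $p_{w_i}$ and each dec a factor $(p_{a_i}+p_{b_i})^{-1}$.

```latex
\mathcal{L}(p_1,\ldots,p_n)
  = \prod_{i} \frac{p_{w_i}}{p_{a_i}+p_{b_i}},
\qquad
\log\mathcal{L}
  = \sum_{i}\Bigl[\log p_{w_i} - \log\bigl(p_{a_i}+p_{b_i}\bigr)\Bigr],
\qquad
p_j\geq 0,\quad \sum_{j=1}^{n} p_j = 1 .
```

The unit-sum constraint is what makes the strengths identifiable, since the likelihood is unchanged when all the $p_j$ are multiplied by a common factor.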

We can now test the null hypothesis that all the teams have the same strength:

a1 <- maxp(T20,give=TRUE)$value        # log-likelihood at the evaluate (the maximizing point)
a2 <- loglik(indep(equalp(T20)),T20)  # log-likelihood at p1=p2=...=p14
a1-a2                                 # support for the alternative over the null

The above corresponds to a support $\Lambda$ of about 14.5, or a likelihood ratio of about $e^\Lambda\simeq 2\times 10^6$. This is not significant by Edwards's criterion of two units of support per degree of freedom (we have 13 degrees of freedom here). Alternatively, we might observe that the statistic $2\Lambda$ is in the tail region of its asymptotic distribution $\chi^2_{13}$ given by Wilks, with a p-value of

pchisq(2*14.5,df=13,lower.tail=FALSE)

showing a significant result, in contrast to Edwards's criterion, which is admittedly somewhat unreliable in this context. Wilks's theorem is only asymptotic, and it is not clear a priori that the actual sampling distribution is close to its asymptotic limit of $\chi^2$. However, it turns out that the asymptotic approximation is good in this particular case: inst/T20_analysis.Rmd presents some analysis (it takes a very long time to run, which is why it is not included in this vignette).
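The figures quoted above can be reproduced with a few lines of base R. This is only an arithmetic check under the values stated in the text (a support of about 14.5 and 13 degrees of freedom); the variable names are introduced here for illustration.

```r
Lambda <- 14.5   # support quoted in the text (a1 - a2, rounded)
df     <- 13     # degrees of freedom quoted in the text

lik_ratio <- exp(Lambda)        # likelihood ratio, roughly 2e6
edwards_threshold <- 2 * df     # two units of support per degree of freedom
Lambda > edwards_threshold      # FALSE: not significant by Edwards's criterion

# Wilks: 2*Lambda is asymptotically chi-squared with df degrees of freedom
p_wilks <- pchisq(2 * Lambda, df = df, lower.tail = FALSE)
p_wilks < 0.05                  # TRUE: significant at the 5% level
```

The two criteria disagree here because Edwards's threshold of 26 units of support is far more demanding than the 5% critical value of $\chi^2_{13}$ applied to $2\Lambda = 29$.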

The effect of the toss

We can now attempt to see if the toss makes a difference:

T20_toss <- hyper2(pnames=c('toss',levels(T20_table$team1)))
for(i in seq_along(team1)){
   players <- c(team1[i],team2[i])
   if(toss_winner[i] == match_winner[i]){ # win toss, win match
      T20_toss[c('toss',match_winner[i])] %<>% inc
   } else {                               # win toss, lose match
      T20_toss[match_winner[i]] %<>% inc
   }
   T20_toss[c('toss',players)] %<>% dec   # toss strength always in the denominator
}
T20_toss

and again we can maximize the likelihood:

T20_maxp_toss <- maxp(T20_toss)
dotchart(T20_maxp_toss)

Above, note the small estimated strength of the toss.
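The likelihood built by the loop for T20_toss can be written out explicitly. The notation below is introduced here for illustration: match $i$ is played between teams $a_i$ and $b_i$ and won by $w_i$, $p_j$ is the strength of team $j$, and $p_{\mathrm{toss}}$ is the extra strength credited to whichever side won the toss. The toss term always appears in the denominator, and appears in the numerator only when the toss winner also won the match.

```latex
\mathcal{L}
  = \prod_{i:\ \text{toss winner wins}}
      \frac{p_{w_i}+p_{\mathrm{toss}}}{p_{a_i}+p_{b_i}+p_{\mathrm{toss}}}
    \;\times
    \prod_{i:\ \text{toss winner loses}}
      \frac{p_{w_i}}{p_{a_i}+p_{b_i}+p_{\mathrm{toss}}},
\qquad
p_{\mathrm{toss}}+\sum_{j} p_j = 1 .
```

Setting $p_{\mathrm{toss}}=0$ removes the toss term from every factor, which is why a small estimated toss strength indicates little advantage from winning the toss.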

Package dataset {-}

The following lines create T20.rda, residing in the data/ directory of the package.

save(T20_table,T20,T20_maxp,T20_toss,file="T20.rda")

References {-}

RobinHankin/hyper2 documentation built on April 13, 2025, 9:33 a.m.
