The weighted hyperdirichlet distribution: the `hyper3` class

set.seed(1)
knitr::opts_chunk$set(echo = TRUE)
library("hyper2")
knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))

To cite the hyper2 package in publications, please use @hankin2017_rmd. Objects of class hyper3 are a generalization of hyper2 objects that allow the brackets to contain weighted probabilities. Likelihood functions are defined on non-negative numbers $p_1,\ldots, p_n$ subject to the unit-sum constraint $\sum p_i=1$. Given known weights $w^i_j$ with $1\leq i\leq j$ we have

$$\mathcal{L}\left(p_1,\ldots, p_n\right)=\prod_j\left(\sum_{i=1}^n w^i_jp_i\right)^{n_j}.$$

As a motivating example, suppose two teams (players) with Bradley-Terry strengths $p_1,p_2$ play football where we quantify the home-ground advantage with a term $\lambda$. If $p_1$ plays at home $a+b$ times with $a$ wins and $b$ losses, and plays away [so $p_2$ is at home] $c+d$ times with $c$ wins and $d$ losses, then a sensible likelihood function might be

$$\mathcal{L}(p_1,p_2,\lambda;a,b,c,d)= \left(\frac{\lambda p_1}{\lambda p_1 + p_2}\right)^{a} \left(\frac{p_2 }{\lambda p_1 + p_2}\right)^{b} \left(\frac{p_1 }{p_1 + \lambda p_2}\right)^{c} \left(\frac{\lambda p_2}{p_1 + \lambda p_2}\right)^{d}. $$

where we understand that $p_1+p_2=1$, $p_1,p_2,\lambda\geqslant 0$. Elementary techniques allow us to identify a maximum and we find

$$ \hat{p_1}=\frac{\sqrt{ac}}{\sqrt{ac}+\sqrt{bd}}\qquad \hat{p_2}=\frac{\sqrt{bd}}{\sqrt{ac}+\sqrt{bd}}\qquad \hat{\lambda}=\sqrt{\frac{ad}{bc}} $$ Likelihoods of this form are easily specified using package idiom.

If $a=6,b=2,c=7,d=1$ and $\lambda=1.3$ [assumed for the moment to be known], appropriate package idiom might be:

library("hyper2",quietly=TRUE)
A <- 6 ; B <- 2 ; C <- 7 ; D <- 1 # upper-case to avoid conflicts
H <- hyper3()
H[c(p1=1.3)]      %<>% inc(A) 
H[c(p2=1)]        %<>% inc(B) 
H[c(p1=1.3,p2=1)] %<>% dec(A+B)
H[c(p1=1)]        %<>% inc(C) 
H[c(p2=1.3)]      %<>% inc(D) 
H[c(p1=1,p2=1.3)] %<>% dec(C+D)
H

Bespoke function home_away3() achieves the same result with a more convenient input:

M <- matrix(c(NA, 6+2i ,1+7i, NA),2,2,byrow=TRUE)
teams <- c("p1","p2")
dimnames(M) <- list("@home" = teams,"@away"=teams)
dimnames(M) <- list("@home" = teams,"@away"=teams)
print(M)
home_away3(M,lambda=1.3)

Above, object M is a $2\times 2$ complex matrix, with real parts being home wins and imaginary parts being away wins. See how home_away3(M) returns the same likelihood function as the explicit calculation performed by hand previously. Now, we can calculate maximum likelihood estimates $\hat{p_1},\hat{p_2}$ of $p_1$ and $p_2$ using numerical optimization techniques, maxp() in package idiom:

maxp(H)

Further, we can test whether the players are in fact of equal strength:

equalp.test(H)

Showing convincingly that we may reject the null that $p_1=p_2$ and assert that the players do in fact differ in strength.

Another motivating example might come from Plackett-Luce likelihood functions for order statistics. Suppose we have three players with nonnegative strengths $p_\mathrm{M},p_\mathrm{RB},p_\mathrm{F}$, with unit sum. These players might be conceptualised as F1 constructors (here Mercedes (M), Red Bull (RB), and Ferrari (F) for the sake of argument). Each constructor fields two cars and the order statistic might be

\newcommand{\pr}{p_\mathrm{RB}} \newcommand{\pm}{p_\mathrm{M}} \newcommand{\pf}{p_\mathrm{F}}

$$ RB\succ M\succ F\succ F\succ RB\succ M$$

indicating that the finishing order was Red Bull first, Mercedes second, Ferrari third, and so on (this was in fact observed at the 2021 Emilia Romagna Grand Prix). The Placket-Luce likelihood function would be

$$ \frac{\pr}{\pr+\pm+\pf+\pf+\pr+\pm}\cdot \frac{\pm}{ \pm+\pf+\pf+\pr+\pm}\cdot \frac{\pf}{ \pf+\pf+\pr+\pm}\cdot \frac{\pf}{ \pf+\pr+\pm}\cdot \frac{\pr}{ \pr+\pm}\cdot \frac{\pm}{ \pm} $$

or, in a form more suitable for the package,

$$ \frac{\pr}{2\pr + 2\pm + 2\pf}\cdot \frac{\pm}{ \pr + 2\pm + 2\pf}\cdot \frac{\pf}{ \pr + \pm + 2\pf}\cdot \frac{\pf}{ \pr + \pm + \pf}\cdot \frac{\pr}{ \pr + \pm }$$

We could proceed as follows:

a <- hyper3()
a[c(R=1)] %<>% inc
a[c(R=2,M=2,F=2)] %<>% dec
a[c(M=1)] %<>% inc
a[c(R=1,M=2,F=2)] %<>% dec
a[c(F=1)] %<>% inc
a[c(R=1,M=1,F=2)] %<>% dec
a[c(F=1)] %<>% inc
a[c(R=1,M=1,F=1)] %<>% dec
a[c(R=1)] %<>% inc
a[c(R=1,M=1    )] %<>% dec
a

Or maybe more logically

b <- hyper3()
b["R"] %<>% inc ; b[c("R","M","F","F","R","M")] %<>% dec
b["M"] %<>% inc ; b[c(    "M","F","F","R","M")] %<>% dec
b["F"] %<>% inc ; b[c(        "F","F","R","M")] %<>% dec
b["F"] %<>% inc ; b[c(            "F","R","M")] %<>% dec
b["R"] %<>% inc ; b[c(                "R","M")] %<>% dec

a==b

However, bespoke function ordervec2supp3() performs all this in one step:

ordervec2supp3(c("RB","M","F","F","RB","M"))

And then

ma <- maxp(a)
ma

Evidence for different strengths?

equalp.test(a)

No evidence agains the null of equal strengths. Further:

gradient(a,ma)  # should be small at the evaluate

References {-}



Try the hyper2 package in your browser

Any scripts or data that you put into this service are public.

hyper2 documentation built on June 22, 2024, 9:57 a.m.