In RobinHankin/hyper2: The Hyperdirichlet Distribution, Mark 2

knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))

To cite the hyper2 package in publications, please use @hankin2017_rmd. In this vignette, object H is created. It is identical to object table_tennis in the package; to see it type data(table_tennis) at the command line, and to see the help page, type help(table_tennis). Suppose we have three players, "A","B", and "C" who play table tennis. We seek to quantify the effect of the serve. The scoreline is as follows:

A vs B, A serves, 5-1
A vs B, B serves, 1-3
A vs C, A serves, 4-1
A vs C, C serves, 1-2

(note that B does not play C). This dataset is easily analysed with the hyper2 package although the model implicitly assumes that the serve strength does not depend on who is serving. The first step is to define a hyper2 object:

library("hyper2",quietly=TRUE)
library("magrittr")
(H <- hyper2(pnames=c("S","a","b","c")))

so the likelihood function is uniform. We have players a, b, c (players' names are not capitalised in R idiom for clarity) and a reified entity S representing the strength of the serve. Now we have to input the data, and we will go through the four scorelines above, one by one. Note that this approach implicitly assumes independence of scores.

A vs B, A serves, score 5-1: {-}

H[c("a","S")]     %<>% "+"(5  ) # a+S win 5 games
H["b"]            %<>% "+"(  1) # b wins 1 game
H[c("a","b","S")] %<>% "-"(5+1) # a+b+S play 1+5=6 games
H

The likelihood function is of the form $\frac{\left(a+S\right)^5b}{\left(a+b+S\right)^6}$. We can add the other observations to $H$:

A vs B, B serves, score 1-3:

H[c("b","S")]     %<>% "+"(3  ) # b+S win 3 games
H["a"]            %<>% "+"(  1) # a wins 1 game 
H[c("a","b","S")] %<>% "-"(3+1) # a+b+S play 1+3=4 games

A vs C, A serves, score 4-1:

H[c("a","S")]     %<>% "+"(4  ) # a+S win 4 games
H["c"]            %<>% "+"(  1) # c wins 1 game 
H[c("a","c","S")] %<>% "-"(4+1) # a+c+S play 1+4=5 games

A vs C, C serves, score 1-2:

H["a"]            %<>% "+"(1  ) # a wins 1 game
H[c("c","S")]     %<>% "+"(  2) # c+S win 2 games
H[c("a","c","S")] %<>% "-"(1+2) # a+c+S play 1+2=3 games

For the entire dataset we have

table_tennis <- H
table_tennis

Here $H$ is a likelihood function on $a,b,c,S$, algebraically proportional to

[ \frac{(S+a)^9(S+b)^3(S+c)^2a^2bc}{(S+a+b)^{10}(S+a+c)^8} ]

Maximum likelihood estimate for the strengths

We now find the unconstrained MLE:

table_tennis_maxp <- maxp(table_tennis)
table_tennis_maxp

Test hypothesis that serve strength S is zero

specificp.test(table_tennis,"S",0.01)

Thus the tail area, pval, is about 0.022 which gives us a significant result: the serve has nonzero strength.

Assessment of equal player strength.

We now assess the hypothesis that the three (real) players are of zero strength: $H_E\colon p_A=p_B=p_C$.

samep.test(table_tennis,c("a","b","c"))

Above, samep.test() complains because it is performing a one-dimensional optimization (we may ignore this issue). The result of the test is that there is no evidence to suggest a difference between players a, b, and c.

Profile likelihood for serve strength

We can give a profile likelihood for $S$ by maximizing the likelihood of $(S,p_a,p_b,p_c)$ subject to a particular value of $S$ (in addition to the unit sum constraint). The following function defines a profile likelihood function for $S$:

S_vals <- seq(from=0.02,to=0.85,len=30)  # possible values for S
plot(S_vals,profsupp(table_tennis,"S",S_vals))
abline(h=c(0,-2))

So we can see that the credible range is from about 0.07-0.8 (and in particular excludes zero).

Assessment of B vs C

The maximum likelihood estimate for strengths allows us to assess what might happen when B plays C (which was not observed). For convenience, here is the evaluate:

maxp(table_tennis)

If B serves, we would have $B+S:C\simeq 0.14+0.45:0.15\equiv 59:15$, or B wins with probability of about $0.78$. If C serves we have $B:C+S\simeq 0.14:0.16+0.45\equiv 14:61$, so B wins with a probability of about $0.19$.

Profile likelihood for B vs C

The profile likelihood technique can be applied to the question of the strengths of players B and C, even though no competition between the players was observed:

proflike <- function(bc,give=FALSE){
  B <- bc[1]
  C <- bc[2]
  if(B+C>=1){return(NA)}

  objective <- function(S){     #

    jj <- c(S,A=1-(S+B+C),B,C)  # B,C fixed
    loglik(indep(jj),H)         # no minus: we use maximum=T in optimize()
  }

  maxlike_constrained <-  # single DoF is S [B,C given, A is fillup]
    optimize(             # single DoF -> use optimize() not optim()
        f = objective,         
        interval = c(0,1-(B+C)),
        maximum = TRUE
    )

  if(give){
    return(maxlike_constrained)
  } else{
    return(maxlike_constrained$objective)
  }
} 

small <- 1e-5
n <- 50
p <- seq(from=small,to=1-small,length=n)
jj <- expand.grid(p,p)
support <- apply(jj,1,proflike)
support <- support-max(support,na.rm=TRUE)
supportx <- support
dim(supportx) <- c(n,n)

x <- jj[,1]+jj[,2]/2
y <- jj[,2]*sqrt(3)/2
plot(x,y,cex=(support+6)/12, 
   pch=16,asp=1,xlim=c(-0.1,1.1),ylim=c(-0.2,0.9),
   axes=FALSE,xlab="",ylab="",main="profile likelihood for B,C")
polygon(x=c(0,1/2,1),y=c(0,sqrt(3)/2,0))
text(0.07,-0.04,'A+S=1, B=0, C=0')
text(0.50,+0.90,'A+S=0, B=1, C=0')
text(0.93,-0.04,'A+S=0, B=0, C=1')

But what one is really interested in is the relative strength of player B and player C. We are indifferent between $p_B=0.1,p_C=0.2$ and $p_B=0.3,p_C=0.6$. To this end we need the profile likelihood curve of the log contrast, $X=\log\left(p_B/p_C\right)$.

logcontrast <- apply(jj,1,function(x){log(x[1]/x[2])})
plot(logcontrast,support,xlim=c(-5,5),ylim=c(-4,0))
abline(h=-2)

From the graph, we can see that the two-units-of-support interval is from about $-3$ to $+3$ which would translate into a probability of B winning against C being in the interval $\left(\frac{1}{1+e^{3}},\frac{1}{1+e^{-3}}\right)$, or about $(0.047,0.953)$. However, it is not clear if $p_B$ and $p_C$ are meaningful in isolation, as a real table tennis point has to have a server and the value of $S$ is itself uncertain.

Weighted likelihoods

We now try hyper3 formalism:

M_home <- matrix( c(NA, 5 , 4, 3, NA, 0, 2,0,NA),byrow=TRUE,3,3)
jj <- letters[1:3]
dimnames(M_home) <- list(jj,jj)

M_away <- matrix( c(NA, 1 , 1, 1, NA, 0, 1,0,NA),byrow=TRUE,3,3)
jj <- letters[1:3]
dimnames(M_away) <- list(jj,jj)
M_home
M_away

home_away3(M_home,M_away,lambda=1.8)

f <- function(lam){
    maxp(home_away3(M_home,M_away,lambda=lam),give=TRUE)$value
}

lam <- exp(seq(from=log(0.6),to=log(20),len=9))
L <- sapply(lam,f)

maxf <- optimize(f,c(2,5),maximum=TRUE)
maxf

L <- L-max(L)
plot(lam,L,type='b',pch=16,log="x")
abline(h=c(0,-2))

How does the situation change if we have some more observations? Suppose c plays at home to d, and loses. If we use $\hat{\lambda}\simeq 3.35$:

(Hnew <- home_away3(M_home,M_away,lambda=3.35) + dirichlet3(c(c=0,d=1),lambda=3.35))
maxp(Hnew)

Above we see that our estimate would be that d has a BT strength of 1 and the other players have a strength of zero. Is there strong evidence that the teams' strength differ?

equalp.test(Hnew)

Above we see that there is no evidence that the teams' strengths do in fact differ.

Package dataset

Following lines create table_tennis.rda, residing in the data/ directory of the package.

save(table_tennis,file="table_tennis.rda")

References {-}

RobinHankin/hyper2 documentation built on April 13, 2025, 9:33 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

RobinHankin/hyper2
The Hyperdirichlet Distribution, Mark 2

In RobinHankin/hyper2: The Hyperdirichlet Distribution, Mark 2

A vs B, A serves, score 5-1: {-}

Maximum likelihood estimate for the strengths

Test hypothesis that serve strength S is zero

Assessment of equal player strength.

Profile likelihood for serve strength

Assessment of B vs C

Profile likelihood for B vs C

Weighted likelihoods

Package dataset

References {-}

R Package Documentation

Browse R Packages

We want your feedback!

RobinHankin/hyper2 The Hyperdirichlet Distribution, Mark 2

In RobinHankin/hyper2: The Hyperdirichlet Distribution, Mark 2

A vs B, A serves, score 5-1: {-}

Maximum likelihood estimate for the strengths

Test hypothesis that serve strength S is zero

Assessment of equal player strength.

Profile likelihood for serve strength

Assessment of B vs C

Profile likelihood for B vs C

Weighted likelihoods

Package dataset

References {-}

R Package Documentation

Browse R Packages

We want your feedback!

RobinHankin/hyper2
The Hyperdirichlet Distribution, Mark 2