knitr::opts_chunk$set(echo = TRUE) library("hyper2") library("partitions")
knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))
To cite the hyper2
package in publications, please use @hankin2017_rmd.
Here we will sum probabilities across a sample space and verify that
the total is constant. In theory the sum of the probabilities of all
members of the sample space is 1, but in practice [and certainly in
hyper2
idiom], the likelihood calculation includes an indeterminate
constant which means that all we can say is that the total probability
is not a function of the parameters of the distribution.
Here I present some relatively informal and unstructured thoughts on
this issue from a numerical and hyper2
perspective.
M <- apply(partitions::multiset(c(1,1,2,2,3)),2,function(x){letters[x]}) colnames(M) <- rep("",ncol(M)) # tidier and more compact appearance M <- noquote(M) M
Above, we use multiset()
to enumerate the sample space of five
competitors comprising two twins denoted a
, two twins denoted b
,
and a singleton c
. This document discusses the issue of whether the
twins are identical twins and hence indistinguishable. Matrix M
has
${5\choose2\,2\,1}=30$ columns, one per possible result. We will
calculate the likelihood of each column and sum them to get the
probability of observing any one of the 30 order statistics [the
observations are mutually exclusive, so
$\operatorname{Prob}\left(x_1\cup\cdots\cup x_n\right) =
\operatorname{Prob}\left(x_1\right)+\cdots
+\operatorname{Prob}\left(x_n\right)$]:
tfun <- function(help,strength,M){ out <- 0 for(i in seq_len(ncol(M))){ vec <- M[,i] out <- out + loglik(strength,cheering3(v=vec,e=c(a=1,b=2,c=3),help=help),log=FALSE) } return(out) }
print(tfun(help=(1:3)/10, strength= c(a=0.1,b=0.3,c=0.6),M)) print(tfun(help=(7:5)/22, strength= c(a=0.3,b=0.2,c=0.5),M))
Above we see that the sum of the likelihoods is constant. The value
turns out to be $\frac{1}{1!2!2!}=0.25$. This is because the
likelihood function (being Plackett-Luce) considers the observation to
be one of the $5!=120$ ways of ordering five distinct competitors.
Matrix M
has only $\frac{5!}{2!2!1!}=30$ columns, each one of which
corresponds to four different observations, all of which have the same
probability. Much of the confusion arises from the practice of
identifying a competitor with his strength; we write "$p_a$'' [or
sometimes just "$a$''] to signify both a particular competitor, and
his Bradley-Terry strength.
To explain the factor of $1/4$ discussed above we need some new
notation. We have five competitors who I will call $\mathbf{do}$,
$\mathbf{re}$, $\mathbf{me}$, $\mathbf{fa}$, and $\mathbf{sol}$. Now
$p_X$ means "the Bradley Terry strength of $X$''. The setup here is
that $\mathbf{do}$ and $\mathbf{re}$ are twins, and that $\mathbf{me}$
and $\mathbf{fa}$ are twins (and $\mathbf{sol}$ is a singleton). We
have that the Bradley-Terry strength of $\mathbf{do}$ and
$\mathbf{re}$ is a
, the strength of $\mathbf{me}$ and $\mathbf{fa}$
is b
, and the strength of $\mathbf{sol}$ is c
.
Consider, for example, the column of M
reading a c b b a
. In
theory, this means the competitor who came first had strength a
, the
competitor who came second had strength c
. So the order statistic
is one of the $2!2!1!=4$ following orders:
$$\mathbf{do}\succ\mathbf{sol}\succ\mathbf{me}\succ\mathbf{fa}\succ\mathbf{re}$$ $$\mathbf{do}\succ\mathbf{sol}\succ\mathbf{fa}\succ\mathbf{me}\succ\mathbf{re}$$ $$\mathbf{re}\succ\mathbf{sol}\succ\mathbf{me}\succ\mathbf{fa}\succ\mathbf{do}$$ $$\mathbf{re}\succ\mathbf{sol}\succ\mathbf{fa}\succ\mathbf{me}\succ\mathbf{do}$$
Above we see that $\mathbf{do}$ and $\mathbf{re}$ come first and last (or last and first), and $\mathbf{me}$ and $\mathbf{fa}$ come third and fourth (or fourth and third), while $\mathbf{sol}$ comes second. All four possibilities have the same probability and all four are admissible order statistics for the five solfeggio syllables.
But the hyper2
package [specifically the loglik()
and
cheering3()
functions] consider a c b b a
to mean something
different. It considers it to be an observation of
$$\mathbf{do}\succ\mathbf{sol}\succ\mathbf{me}\succ\mathbf{fa}\succ\mathbf{re}$$
followed by an identification of $\mathbf{do}$ with (BT strength) a
,
$\mathbf{sol}$ with strength c
, and so on. Any of the four
possibilities above would be equally admissible, and they all have the
same probability. So to get "real'' probabilities [real in the sense
that summing the probabilities over the columns in matrix M
], we
have to multiply by four to include all the admissible order
statistics.
The difference between these two interpretations does not matter from a likelihood perspective as likelihood is defined only up to a multiplicative constant.
rs <- rp(100,dirichlet(c(a=1,b=1,c=1))) # random strengths colnames(rs) <- letters[1:3] head(rs) f <- function(s){ jj <- s names(jj) <- letters[1:3] tfun(help=1:3,strength=jj,M=M) } x <- apply(rs,1,f)
hist(x-mean(x))
Above we see very tight clustering about the mean value of 0.25.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.