knitr::opts_chunk$set(echo = TRUE)
Here we consider different observations deriving from order statistics among three competitors. This document is an informal list intended to support v109i08.pdf, currently under review at the Journal of Statistical Software.
I consider the most general case first and then go on to consider more special cases. In the interests of simplicity I will often identify a runner with his strength, so that $a$ represents a particular competitor, and also his Plackett-Luce strength.
We consider three runners, $a,b,c$ and a Plackett-Luce likelihood function for the six possible orders. We normalize so $a+b+c=1$ and take $c=1-a-b$.
prob = $\frac{ab}{1-a}$
$S=\log a+\log b-\log(1-a)$
$\partial S/\partial a = a^{-1} + (1-a)^{-1}$
$\partial S/\partial b=b^{-1}$
$\partial^2 S/\partial a^2=-a^{-2} +(1-a)^{-2}\longrightarrow -pS_{aa}=\frac{(1-2a)b}{a(1-a)^3}$
$\partial^2 S/\partial b^2 = -b^{-2}\longrightarrow -pS_{bb} = \frac{a}{(1-a)b}$
$\partial^2 S/\partial a\partial b=0\longrightarrow -pS_{ab}= 0$
prob = $\frac{a(1-a-b)}{1-a}$
$S=\log a-\log(1-a) +\log(1-a-b)$
$\partial S/\partial a = a^{-1} + (1-a)^{-1} + (1-a-b)^{-1}$
$\partial S/\partial b = (1-a-b)^{-1}$
$\partial^2S/\partial a^2 = -a^{-2}+(1-a)^{-2} + (1-a-b)^{-2}\longrightarrow -pS_{aa} =\frac{(1-2a)(1-a-b)}{a(1-a)^3} -\frac{a}{1-a}=\frac{(1-2a)(1-a-b) - a^2(1-a)^2}{a(1-a)^3}$
$\partial^2S/\partial b^2=-(1-a-b)^{-2}\longrightarrow -pS_{bb}=\frac{a}{(1-a)(1-a-b)}$
$\partial^2S/\partial a\partial b = (1-a-b)^{-2}\longrightarrow -pS_{ab}=\frac{-a}{(1-a)(1-a-b)}$
prob = $\frac{ba}{1-b}$
$S=\log a +\log b-\log(1-b)$
$\partial S/\partial a = a^{-1}$
$\partial S/\partial b = b^{-1} +(1-b)^{-1}$
$\partial^2S/\partial a^2 = -a^{-2}\longrightarrow -pS_{aa}=\frac{-b}{a(1-b)}$
$\partial^2S/\partial b^2=-b^{-2} +(1-b)^{-2}\longrightarrow -pS_{bb}=\frac{ab(-1+2b)}{b^2(1-b)^3}$
$\partial^2S/\partial a\partial b=0\longrightarrow -pS_{ab}=0$
prob = $\frac{a(1-a-b)}{1-b}$
$S=\log a + \log(1-a-b) - \log(1-b)$
$\partial S/\partial a=a^{-1} -(1-a-b)^{-1}$
$\partial S/\partial b=-(1-a-b)^{-1} - (1-b)^{-1}$
$\partial^2S\partial a^2=-a^{-2} -(1-a-b)^{-2}\longrightarrow -pS_{aa}= \frac{a^4 + 4 a^3 b + 6 a^2 b^2 - 4 a^2 b + 4 a b^3 - 6 a b^2 + 2 a b + b^4 - 2 b^3 + b^2 }{a^2 (a + b - 1)^2 (a + b)^2}$
$\partial^2S/\partial b^2 =-(1-a-b)^{-2} -(1-b)^{-2}$
$\partial^2S/\partial a\partial b=-(1-a-b)^{-2}$
prob=$\frac{a(1-a-b)}{a+b}$
$S=\log a + \log(1-a-b) -\log(a+b)$
$\partial S/\partial a=a^{-1} -(1-a-b)^{-1} -(a+b)^{-1}$
$\partial S/\partial b=-(1-a-b)^{-1}-(a+b)^{-1}$
$\partial^2S/\partial a^2=-a^{-2} -(1-a-b)^{-2} +(a+b)^{-2}$
$\partial^2S/\partial b^2=-(1-a-b)^{-2} +(a+b)^{-2}$
$\partial^2S/\partial a\partial b=-(1-a-b)^{-2}+(a+b)^{-2}$
prob=$\frac{b(1-a-b)}{1-b}$
$S=\log b+\log(1-a-b) -\log(1-b)$
$\partial S/\partial a=-(1-a-b)^{-1}$
$\partial S/\partial b=b^{-1}+(1-b)^{-1} -(1-a-b)^{-1}$
$\partial^2S/\partial a^2=-(1-a-b)^{-2}$
$\partial^2S/\partial b^2=-b^{-2} +(1-b)^{-2}-(1-a-b)^{-2}$
$\partial^2S/\partial a\partial b=-(1-a-b)^{-2}$
We want to consider the maximum likelihood estimator. Take the six possible observations in turn:
$o=a\succ b\succ c\longrightarrow P(o)=\frac{ab}{1-a};\hat{a}=1,\hat{b}=0$
$o=a\succ c\succ b\longrightarrow P(o)=\frac{a(1-a-b)}{1-a};\hat{a}=1,\hat{b}=0$.
$o=b\succ a\succ c\longrightarrow P(o)=\frac{ab}{1-b};\hat{a}=0,\hat{b}=1$.
$o=b\succ c\succ a\longrightarrow P(o)=\frac{b(1-a-b)}{1-b};\hat{a}=0,\hat{b}=1$.
$o=c\succ a\succ b\longrightarrow P(o)=\frac{a(1-a-b)}{1-b};\hat{a}=0,\hat{b}=0$.
$o=c\succ b\succ a\longrightarrow P(o)=\frac{a(1-a-b)}{1-b};\hat{a}=0,\hat{b}=0$.
So we see that $\mathbb{E}(\hat{a})=\frac{ab}{1-a}+\frac{a(1-a-b)}{1-a}=a$ and $\mathbb{E}(\hat{b})=\frac{ab}{1-b}+\frac{b(1-a-b)}{1-b}=b$, so the maximum likelihood estimator is unbiased [NB $\mathbb{E}(\hat{c})=c$].
For mean squared error we seek $\mathbb{E}\left| (a,b)-(\hat{a},\hat{b})\right|^2=\mathbb{E}\left[(a-\hat{a})^2 + (b-\hat{b})^2\right]= \mathbb{E}\hat{a}^2 +\mathbb{E}\hat{b}^2-a^2-b^2=a(1-a)+b(1-b)$ [because $\mathbb{E}\hat{a}^2=a$ and $\mathbb{E}\hat{b}^2=b$].
Alternatively, we might define the mean squared error to be $\mathbb{E}\left| (a,b,c)-(\hat{a},\hat{b},\hat{c})\right|^2$ and get $a(1-a)+b(1-b)+c(1-c)$ or $2a(1-a)+2b(1-b)-2ab$.
For the Fisher information matrix, we seek $M$, the two-by-two matrix with entries $\sum_{\sigma\in\left\lbrace a\succ b\succ c,\ldots, c\succ b\succ a\right\rbrace}\operatorname{Prob}(\sigma)\frac{\partial^2\log\operatorname{Prob}(\sigma)}{\partial x\partial y}$, where $x,y\in\left\lbrace a,b\right\rbrace$. Then the Fisher information is $\det(M)$. The whole thing is a bit of a nightmare algebraically but we can use mathematica to help.
p1 = a*b/(1-a) p2 = a*(1-a-b)/(1-a) p3 = a*b/(1-b) p4 = b*(1-a-b)/(1-b) p5 = a*(1-a-b)/(a+b) p6 = b*(1-a-b)/(a+b) Faa = ( -p1*D[Log[p1],a,a] -p2*D[Log[p2],a,a] -p3*D[Log[p3],a,a] -p4*D[Log[p4],a,a] -p5*D[Log[p5],a,a] -p6*D[Log[p6],a,a] ) Fab = ( -p1*D[Log[p1],a,b] -p2*D[Log[p2],a,b] -p3*D[Log[p3],a,b] -p4*D[Log[p4],a,b] -p5*D[Log[p5],a,b] -p6*D[Log[p6],a,b] ) Fba = ( -p1*D[Log[p1],b,a] -p2*D[Log[p2],b,a] -p3*D[Log[p3],b,a] -p4*D[Log[p4],b,a] -p5*D[Log[p5],b,a] -p6*D[Log[p6],b,a] ) Fbb = ( -p1*D[Log[p1],b,b] -p2*D[Log[p2],b,b] -p3*D[Log[p3],b,b] -p4*D[Log[p4],b,b] -p5*D[Log[p5],b,b] -p6*D[Log[p6],b,b] ) Minimize[{Faa*Fbb-Fab*Fba,a>0,b>0,a+b<1},{a,b}]
gives 1323/16
.
Now consider the same situation but our observation is purely the loser in a race. With $a,b,c, a+b+c=1$ or $c=1-a-b$ we have the probabilities of:
A loses: either $b\succ c\succ a$ or $c\succ b\succ a$:
$$\frac{b}{a+b+c}\cdot\frac{c}{a+c}+\frac{c}{a+b+c}\cdot\frac{b}{a+b}= bc\left(\frac{1}{1-b}+\frac{1}{1-c}\right) $$
B loses: either $a\succ c\succ b$ or $c\succ a\succ c$:
$$\frac{a}{a+b+c}\cdot\frac{c}{b+c}+\frac{c}{a+b+c}\cdot\frac{a}{a+b}= ac\left(\frac{1}{1-a}+\frac{1}{1-c}\right) =ac\left(\frac{1}{1-a}+\frac{1}{1-c}\right) $$
C loses: either $a\succ b\succ c$ or $b\succ a\succ c$, mutually exclusive, probabilities add:
$$\frac{a}{a+b+c}\cdot\frac{b}{b+c}+\frac{b}{a+b+c}\cdot\frac{a}{a+c}= ab\left(\frac{1}{1-a}+\frac{1}{1-b}\right)$$
Interestingly, there is no well-defined maximum likelihood estimator from this data. The likelihood functions for the three observations do not have a well-defined maximum. If we try to find a symmetric estimator, specifically one in which
then there is no unbiased estimator: there is no value of $\alpha$ for which $\mathbb{E}\left(\hat{a},\hat{b},\hat{c}\right)=(a,b,c)$. I think that's quite interesting. How about minimizing the MSE? Defined as $\mathbb{E}\left| (a,b)-(\hat{a},\hat{b})\right|^2$, we are minimizing
$$ P(A \mbox{ loses})\cdot((a-0)^2 +(b-\alpha)^2) + P(B \mbox{ loses})\cdot((a-\alpha)^2 +(b-0)^2) + P(C \mbox{ loses})\cdot((a-\alpha)^2 +(b-\alpha)^2) $$
$$ = b(1-a-b)\left(\frac{1}{1-b}+\frac{1}{a+b}\right)\cdot(a^2+(b-\alpha)^2)+ a(1-a-b)\left(\frac{1}{1-a}+\frac{1}{a+b}\right)\cdot((a-\alpha)^2+b^2) + ab \left(\frac{1}{1-a}+\frac{1}{1-b}\right)\cdot((a-\alpha)^2+(b-\alpha)^2 $$
Or, using the other definition [where $c=1-a-b$]
$$ P(A \mbox{ loses})\cdot((a-0)^2 +(b-\alpha)^2 + (c-\alpha)^2)) + P(B \mbox{ loses})\cdot((a-\alpha)^2 +(b-0)^2 + (c-\alpha)^2)) + P(C \mbox{ loses})\cdot((a-\alpha)^2 +(b-\alpha)^2 + c^2) $$
but neither of these has a minimum independent of $a,b,c$. For the Fisher information, from mathematica:
c = 1-a-b pc = a*b*(1/(1-a) + 1/(1-b)) pb = a*c*(1/(1-a) + 1/(1-c)) pa = b*c*(1/(1-b) + 1/(1-c)) Flaa = ( -pa*D[Log[pa],a,a] -pb*D[Log[pb],a,a] -pc*D[Log[pc],a,a] ) Flab = ( -pa*D[Log[pa],a,b] -pb*D[Log[pb],a,b] -pc*D[Log[pc],a,b] ) Flba = ( -pa*D[Log[pa],b,a] -pb*D[Log[pb],b,a] -pc*D[Log[pc],b,a] ) Flbb = ( -pa*D[Log[pa],b,b] -pb*D[Log[pb],b,b] -pc*D[Log[pc],b,b] ) Minimize[{Faa*Fbb-Fab*Fba,a>0,b>0,a+b<1},{a,b}]
This gives $\frac{16875}{256}\simeq 65.918$.
There are four possible outcomes:
$a\succ b\succ\left\lbrace a,b\right\rbrace$
$b\succ a\succ\left\lbrace a,b\right\rbrace$
$a\succ a\succ\left\lbrace b,b\right\rbrace$
$b\succ b\succ\left\lbrace a,a\right\rbrace$.
Call
these S1
, S2
, S3
and S4
respectively. Mathematica code:
S1 = a*(1-a)/(2-a)*2 S2 = a*(1-a)/(1+a)*2 S3 = a^2/(2-a) S4 = (1-a)^2/(1+a) FI = -( S1*D[Log[S1],a,a]+ S2*D[Log[S2],a,a]+ S3*D[Log[S3],a,a]+ S4*D[Log[S4],a,a] )//FullSimplify Minimize[{FI,a>0,a<1},{a}]
We get 68/9 at $a=1/2$. In more detail we have
FI <- function(a){ (12 + (1-a)* a* ((1-a)*a-10)) / ((2-a)^2*(1-a)*a*(1+a)^2)} p <- seq(from=0.1,to=0.90,len=40) plot(p,FI(p))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.