knitr::opts_chunk$set(echo = TRUE)

`hyper2`

packageWe consider the likelihood function given by a single observation of
an order statistic. Here we investigate some numerical problems
arising from the use of `optim()`

and `constrOptim()`

. We will
consider $n=10$ competitors and without loss of generality, the
competitors finish in order $1,2,3,\ldots,10$. The likelihood
function for this observation is:

[ {\mathcal L}\left(p_1,\ldots,p_{10}\right)= \frac{p_1}{p_1+\cdots+p_{10}}\cdot \frac{p_2}{p_2+\cdots+p_{10}}\cdot \frac{p_3}{p_3+\cdots+p_{10}}\cdot \frac{p_4}{p_4+\cdots+p_{10}}\cdot \frac{p_5}{p_5+\cdots+p_{10}}\cdot\ \frac{p_6}{p_6+\cdots+p_{10}}\cdot \frac{p_7}{p_7+\cdots+p_{10}}\cdot \frac{p_8}{p_8+p_9+p_{10}}\cdot \frac{p_9}{p_9+p_{10}}\cdot \frac{p_{10}}{p_{10}} ]

We can calculate the evaluate (that is, the maximum likelihood estimate) from the formula by inspection. It is clear from the penultimate term that $p_{10}=0$. Similarly from the preceding term it is easy to see that $p_9=0$ and so on. The evaluate is thus $p=(1,0,\ldots,0)$. Let's see if we can reproduce this using numerical methods. Note that this is a highly pathological case; the likelihood is not defined at the evaluate. Even considering limiting cases is not straightforward; consider two processes, approaching the evaluate in two different ways:

[ \lim_{x\longrightarrow 0} {\mathcal L}(1-9x,x,x,x,x,x,x,x,x,x,x)=\frac{1}{8!};\qquad \lim_{x\longrightarrow 0} {\mathcal L}(\alpha,\alpha x,\alpha x^2,\ldots,\alpha x^9)=1 ]

where $\alpha=(1-x^{10})/(1-x)$ is a normalizing constant. Thus the
likelihood *at* the evaluate is not defined, even as a limit (because
it depends on the direction of approach). However, it is reasonable
to demand consistency in the limiting process. Thus, writing
$p_i(x),1\leq i\leq 10$, we require

[ \lim_{x\longrightarrow 0}\frac{p_i(x)}{p_{i+1}(x)}=0,\qquad i=1,\ldots,9 ]

and if this is respected, then it is reasonable to say that the likelihood at the evaluate is 1, and the support 0.

The `hyper2`

idiom for creating a likelihood function is as follows:

library("hyper2") H <- race(letters[1:10]) H

And then maximizing it is carried out using the `maxp()`

function:

evaluate1 <- maxp(H) (maxlike1 <- loglik(indep(evaluate1),H)) dotchart(evaluate1,pch=16)

Comparing the numerically determined maximum likelihood estimate with the exact MLE of $(1,0,\ldots,0)$ shows that the numerics have not done a particularly good job in identifying the evaluate, presumably as a result of the difficulty of honouring the constraints. However, if we give the system a hand by starting at a point closer to the evaluate we can do slightly better:

f <- function(p,n){c(p,rep((1-p)/n,n-1))} sss <- f(0.99,9) # sensible start point, close (?) to the evaluate evaluate2 <- maxp(H,startp=sss) maxlike2 <- loglik(indep(evaluate2),H)

Graphically:

dotchart(evaluate2,pch=16) par(pty='s') plot(evaluate1,evaluate2,asp=1,log='xy') abline(0,1)

So it is almost identical to the coldstart above in the sense that the two evaluates are very close. We can compare the hotstart to the coldstart in terms of support:

```
maxlike2-maxlike1
```

This is almost exactly zero, but negative: the so-called "sensible
start point" has produced an inferior solution. This is because `sss`

has $p_2=p_3=\ldots = p_{10}$, so predicts a draw between players
$2,3,\ldots 10$, contrary to observation.

Possibly, a better way is to mimic the more appropriate of the two limiting processes above:

vsssp <- function(x,n){ jj <- 1/x^seq_len(n) indep(jj/sum(jj)) }

Then

evaluate3 <- maxp(H,vsssp(2,10)) maxlike3 <- loglik(indep(evaluate3),H) dotchart(evaluate3,pch=16)

Observe that the likelihood at this point is only very slightly better
than the `equalp()`

start point:

```
maxlike3-maxlike2
```

I am not sure how important these considerations are in practice. Firstly, the order statistic used here is extremely pathological, and perhaps it is unreasonable to expect any inferential tool to work under these circumstances.

Secondly, and perhaps more importantly, likelihood is not defined at a point; likelihood is an equivalence class of functions whose domain is possible hypotheses. We can only compare likelihoods for two different hypotheses [because likelihood is only defined up to a multiplicative constant]. And for the observation considered here, the evaluate $(1,0,\ldots,0)$ is not a hypothesis anyone would want a likelihood for: the likelihood is not defined at that point, and does not even have a well-defined limit there.

To some extent this is not surprising: the numerics force one to have a strictly positive minimum value for each $p_i$ and the fillup value $p_{10}=1-\sum p_i$. The likelihood maximization routine tries to maximize $p_9/p_{10}, p_8/p_9,\ldots,p_1/p_2$ simultaneously, all subject to the unit sum constraint $\sum p_i=1$. And of course, if the ratio is too high then the unit sum constraint would be violated (given that $p_{10}$ is at its minimum value).

**Any scripts or data that you put into this service are public.**

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.