smlik: Softmax log likelihood under Harville and Henery Models.
In ohenery: Modeling of Ordinal Random Variables via Softmax Regression

smlik

R Documentation

Softmax log likelihood under Harville and Henery Models.

Description

Compute the softmax log likelihood and gradient of the same.

Usage

harsmlik(g, idx, eta, wt = NULL, deleta = NULL)

hensmlik(g, idx, eta, gamma, wt = NULL, deleta = NULL)

Arguments

`g`	a vector giving the group indices.
`idx`	a vector of integers, the same length as `g`, which describes the reverse sort order on the observations, first by group, then by place within the group. That is, the element `idx[0]` is the index of the last place finisher in the group `g[0]`; then `idx[1]` is the index of the next-to-last place finisher in `g[1]` (assuming it equals `g[0]`), and so on. The `idx` shall be zero based.
`eta`	a vector of the odds. Must be the same length as `g`.
`wt`	an optional vector of non-negative elements, the same length as `g`, giving the observation level weights. We then compute a weighted log likelihood, where the weights are in each summand. The weights should probably have mean 1, but that's just, like, my opinion, man. Negative weights throw an error. Note that the weights for the last place in each group have no effect on the computation.
`deleta`	an optional matrix whose row count equals the number of elements of `g`, `idx`, and `eta`. The rows of `deleta` are the derivatives of eta with respect to some `z`. This is used to then maximize likelihood with respect to `z`.
`gamma`	a vector of the gamma parameters. It is assumed that the first element is `\gamma_2`, while the last element is applied to all higher order tie breaks.

Details

Given vectors g, \eta and optionally the gradient of \eta with respect to some coefficients, computes the log likelihood under the softmax. The user must provide a reverse ordering as well, which is sorted first by the groups, g, and then, within a group, in increasing quality order. For a race, this means that the index is in order from the last place to the first place in that race, where the group numbers correspond to one race.

Under the Harville model, the log likelihood on a given group, where we are indexing in forward order, is

\left(\eta_1 - \log \sum_{j \ge 1} \mu_j\right) + \left(\eta_2 - \log \sum_{j \ge 2} \mu_j\right) + ...

where \mu_i = \exp{\eta_i}. By “forward order”, we mean that \eta_1 corresponds to the participant taking first place within that group, \eta_2 took second place, and so on.

The Henery model log likelihood takes the form

\left(\eta_1 - \log \sum_{j \ge 1} \mu_j\right) + \left(\gamma_2 \eta_2 - \log \sum_{j \ge 2} \mu_j^{\gamma_2}\right) + ...

for gamma parameters, \gamma. The Henery model corresponds to the Harville model where all the gammas equal 1.

Weights in weighted estimation apply to each summand. The weight for the last place participant in a group is irrelevant. The weighted log likelihood under the Harville model is

w_1\left(\eta_1 - \log \sum_{j \ge 1} \mu_j\right) + w_2\left(\eta_2 - \log \sum_{j \ge 2} \mu_j\right) + ...

One should think of the weights as applying to the outcome, not the participant.

Value

The log likelihood. If deleta is given, we add an attribute to the scalar number, called gradient giving the derivative. For the Henery model we also include a term of gradgamma which is the gradient of the log likelihood with respect to the gamma vector.

Note

The likelihood function does not yet support ties.

To avoid incorrect inference when only the top performers are recorded, and all others are effectively tied, one should use weighting. Set the weights to zero for participants who are tied non-winners, and one for the rest So for example, if you observe the Gold, Silver, and Bronze medal winners of an Olympic event that had a starting field of 12 participants, set weights to 1 for the medal winners, and 0 for the others. Note that the weights do not attach to the participants, they attach to the place they took.

Author(s)

Steven E. Pav shabbychef@gmail.com

References

Harville, D. A. "Assigning probabilities to the outcomes of multi-entry competitions." Journal of the American Statistical Association 68, no. 342 (1973): 312-316. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/01621459.1973.10482425")}

Henery, R. J. "Permutation probabilities as models for horse races." Journal of the Royal Statistical Society: Series B (Methodological) 43, no. 1 (1981): 86-91. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1111/j.2517-6161.1981.tb01153.x")}

Examples

# a garbage example
set.seed(12345)
g <- as.integer(sort(ceiling(20 * runif(100))))
idx <- as.integer(rev(1:length(g)) - 1L)
eta <- rnorm(length(g))
foo <- harsmlik(g=g,idx=idx,eta=eta,deleta=NULL)

# an example with a real idx
nfeat <- 5
set.seed(1234)
g <- ceiling(seq(0.1,1000,by=0.1))
X <- matrix(rnorm(length(g) * nfeat),ncol=nfeat)
beta <- rnorm(nfeat)
eta <- X %*% beta
y <- rsm(eta,g)

idx <- order(g,y,decreasing=TRUE) - 1
foores <- harsmlik(g,idx,eta,deleta=X)

# now reweight them
wt <- runif(length(g))
wt <- wt / mean(wt)   # mean 1 is recommended
foores <- harsmlik(g,idx,eta,wt=wt)

# try hensmlik 
foores <- hensmlik(g,idx,eta,gamma=c(0.9,0.8,1),wt=wt)

# check the value of the gradient by numerical approximation

 nfeat <- 8
 set.seed(321)
 g <- ceiling(seq(0.1,1000,by=0.1))
 X <- matrix(rnorm(length(g) * nfeat),ncol=nfeat)
 beta <- rnorm(nfeat)
 eta <- X %*% beta
 y <- rsm(eta,g)
 
 idx <- order(g,y,decreasing=TRUE) - 1
 if (require(numDeriv)) {
 
 	fastval <- attr(harsmlik(g,idx,eta,deleta=X),'gradient')
 	numap <- grad(function(beta,g,idx,X) { 
 			 eta <- X %*% beta
 			 as.numeric(harsmlik(g,idx,eta))
 			 },
 			 x=beta,g=g,idx=idx,X=X)
 	rbind(fastval,numap)
 }

ohenery documentation built on June 8, 2025, 11:49 a.m.