Description Usage Arguments Details Value Note Author(s) References Examples
Compute the softmax log likelihood and gradient of the same.
1 2 3 |
g |
a vector giving the group indices. |
idx |
a vector of integers, the same length as |
eta |
a vector of the odds.
Must be the same length as |
wt |
an optional vector of non-negative elements, the same length
as |
deleta |
an optional matrix whose row count equals the number of elements
of |
gamma |
a vector of the gamma parameters. It is assumed that the first element is γ_2, while the last element is applied to all higher order tie breaks. |
Given vectors g, η and optionally the gradient of η with respect to some coefficients, computes the log likelihood under the softmax. The user must provide a reverse ordering as well, which is sorted first by the groups, g, and then, within a group, in increasing quality order. For a race, this means that the index is in order from the last place to the first place in that race, where the group numbers correspond to one race.
Under the Harville model, the log likelihood on a given group, where we are indexing in forward order, is
≤ft(η_1 - \log ∑_{j ≥ 1} μ_j\right) + ≤ft(η_2 - \log ∑_{j ≥ 2} μ_j\right) + ...
where μ_i = \exp{η_i}. By “forward order”, we mean that η_1 corresponds to the participant taking first place within that group, η_2 took second place, and so on.
The Henery model log likelihood takes the form
≤ft(η_1 - \log ∑_{j ≥ 1} μ_j\right) + ≤ft(γ_2 η_2 - \log ∑_{j ≥ 2} μ_j^{γ_2}\right) + ...
for gamma parameters, γ. The Henery model corresponds to the Harville model where all the gammas equal 1.
Weights in weighted estimation apply to each summand. The weight for the last place participant in a group is irrelevant. The weighted log likelihood under the Harville model is
w_1≤ft(η_1 - \log ∑_{j ≥ 1} μ_j\right) + w_2≤ft(η_2 - \log ∑_{j ≥ 2} μ_j\right) + ...
One should think of the weights as applying to the outcome, not the participant.
The log likelihood. If deleta
is given, we add an attribute
to the scalar number, called gradient
giving the derivative.
For the Henery model we also include a term of gradgamma
which is
the gradient of the log likelihood with respect to the gamma vector.
The likelihood function does not yet support ties.
To avoid incorrect inference when only the top performers are recorded, and all others are effectively tied, one should use weighting. Set the weights to zero for participants who are tied non-winners, and one for the rest So for example, if you observe the Gold, Silver, and Bronze medal winners of an Olympic event that had a starting field of 12 participants, set weights to 1 for the medal winners, and 0 for the others. Note that the weights do not attach to the participants, they attach to the place they took.
Steven E. Pav shabbychef@gmail.com
Harville, D. A. "Assigning probabilities to the outcomes of multi-entry competitions." Journal of the American Statistical Association 68, no. 342 (1973): 312-316. http://dx.doi.org/10.1080/01621459.1973.10482425
Henery, R. J. "Permutation probabilities as models for horse races." Journal of the Royal Statistical Society: Series B (Methodological) 43, no. 1 (1981): 86-91. http://dx.doi.org/10.1111/j.2517-6161.1981.tb01153.x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 | # a garbage example
set.seed(12345)
g <- as.integer(sort(ceiling(20 * runif(100))))
idx <- as.integer(rev(1:length(g)) - 1L)
eta <- rnorm(length(g))
foo <- harsmlik(g=g,idx=idx,eta=eta,deleta=NULL)
# an example with a real idx
nfeat <- 5
set.seed(1234)
g <- ceiling(seq(0.1,1000,by=0.1))
X <- matrix(rnorm(length(g) * nfeat),ncol=nfeat)
beta <- rnorm(nfeat)
eta <- X %*% beta
y <- rsm(eta,g)
idx <- order(g,y,decreasing=TRUE) - 1
foores <- harsmlik(g,idx,eta,deleta=X)
# now reweight them
wt <- runif(length(g))
wt <- wt / mean(wt) # mean 1 is recommended
foores <- harsmlik(g,idx,eta,wt=wt)
# try hensmlik
foores <- hensmlik(g,idx,eta,gamma=c(0.9,0.8,1),wt=wt)
# check the value of the gradient by numerical approximation
nfeat <- 8
set.seed(321)
g <- ceiling(seq(0.1,1000,by=0.1))
X <- matrix(rnorm(length(g) * nfeat),ncol=nfeat)
beta <- rnorm(nfeat)
eta <- X %*% beta
y <- rsm(eta,g)
idx <- order(g,y,decreasing=TRUE) - 1
if (require(numDeriv)) {
fastval <- attr(harsmlik(g,idx,eta,deleta=X),'gradient')
numap <- grad(function(beta,g,idx,X) {
eta <- X %*% beta
as.numeric(harsmlik(g,idx,eta))
},
x=beta,g=g,idx=idx,X=X)
rbind(fastval,numap)
}
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.