gradient    R Documentation
Description

Given a hyper2 object and a point in probability space, function gradient() returns the gradient of the log-likelihood; function hessian() returns the bordered Hessian matrix. By default, both functions are evaluated at the maximum likelihood estimate for p, as given by maxp().
Usage

gradient(H, probs = indep(maxp(H)))
hessian(H, probs = indep(maxp(H)), border = TRUE)
hessian_lowlevel(L, powers, probs, pnames, n)
is_ok_hessian(M, give = TRUE)
Arguments

H: A hyper2 object.

L, powers, n: Components of a hyper2 object.

probs: A vector of probabilities.

pnames: Character vector of names.

border: Boolean, with default TRUE meaning to return the bordered Hessian; if FALSE, the Hessian of the independent components only is returned (see Details).

M: A bordered Hessian matrix, understood to have a single constraint (the unit sum) at the last row and column; the output of hessian().

give: Boolean, with default TRUE.
Details

Function gradient() returns the gradient of the log-likelihood function. If the hyper2 object is of size n, then argument probs may be a vector of length n-1 or n; in the former case it is interpreted as indep(p). In both cases, the returned gradient is a vector of length n-1. The function returns the derivative of the log-likelihood with respect to the n-1 independent components of (p_1, ..., p_n), namely (p_1, ..., p_{n-1}). The fillup value p_n is calculated as 1 - (p_1 + ... + p_{n-1}).
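For example (an illustrative sketch, not part of the package documentation, using the icons dataset from the Examples below), both calling conventions are accepted:

p <- maxp(icons)             # full length-n probability vector
gradient(icons, indep(p))    # length n-1 argument, interpreted as indep(p)
gradient(icons, p)           # length-n argument also accepted; result has length n-1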
Function gradientn()
returns the gradient of the loglikelihood
function but ignores the unit sum constraint. If the hyper2
object is of size n, then argument probs
must be a vector
of length n, and the function returns a named vector of length
n. The last element of the vector is not treated differently from
the others; all n elements are treated as independent. The sum
need not equal one.
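A hedged sketch of the difference, again using the icons dataset: gradientn() takes the full-length vector and treats no element as the fillup.

p <- maxp(icons)
gradientn(icons, p)    # named vector of length n; the unit sum constraint is ignored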
Function hessian() returns the bordered Hessian, a matrix of size (n+1)-by-(n+1), which is useful when using Lagrange's method of undetermined multipliers. The first row and column correspond to the unit sum constraint, p_1 + ... + p_n = 1. Row and column names of the matrix are the pnames() of the hyper2 object, plus "usc" for "Unit Sum Constraint".
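A short sketch of the structure just described, using the icons dataset:

H <- hessian(icons)            # bordered Hessian; border=TRUE is the default
rownames(H)                    # pnames of the object together with "usc"
hessian(icons, border=FALSE)   # unbordered Hessian of the independent components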
The unit sum constraint borders could instead have been added with the idiom magic::adiag(0, pad=1, hess), which might be preferable.
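A toy illustration of that idiom, using a stand-in 2-by-2 matrix rather than anything computed by the package:

library(magic)
hess <- matrix(c(-2, 1, 1, -2), 2, 2)   # stand-in for an unbordered Hessian
adiag(0, pad = 1, hess)                 # 0 in the corner, borders of ones, hess in the block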
Function is_ok_hessian()
returns the result of the second
derivative test for the maximum likelihood estimate being a local
maximum on the constraint hypersurface. This is a generalization of the
usual unconstrained problem, for which the test is the Hessian's being
negative-definite.
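A minimal sketch of the workflow, assuming the icons dataset: compute the bordered Hessian at the evaluate and apply the test.

H <- hessian(icons)    # bordered Hessian at the maximum likelihood estimate
is_ok_hessian(H)       # result of the second derivative test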
Function hessian_lowlevel()
is a low-level helper function that
calls the C++
routine.
Further examples and discussion are given in the file inst/gradient.Rmd. See also the discussion at man/maxp.Rd on the different optimization routines available.
Value

Function gradient() returns a vector of length n-1 whose entries are the derivatives of the log-likelihood with respect to the n-1 independent components of (p_1, ..., p_n), namely (p_1, ..., p_{n-1}). The fillup value p_n is calculated as 1 - (p_1 + ... + p_{n-1}).
If argument border is TRUE, function hessian() returns an n-by-n matrix of second derivatives, with the borders as returned by gradient(). If border is FALSE, the fillup value is ignored and an (n-1)-by-(n-1) matrix of second derivatives is returned.
Calling hessian() at the evaluate will not return exact zeros for the constraint on the fillup value; gradient() is used internally, and this does not return exactly zero at the evaluate.
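For example (a sketch with the icons dataset), the gradient at the evaluate is small but not exactly zero:

gradient(icons, indep(maxp(icons)))    # close to zero, but not exactly zero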
Author(s)

Robin K. S. Hankin
Examples

data(chess)
p <- c(1/2, 1/3)
delta <- rnorm(2)/1e5   # delta needs to be quite small
deltaL  <- loglik(p + delta, chess) - loglik(p, chess)
deltaLn <- sum(delta * gradient(chess, p + delta/2))   # numeric
deltaL - deltaLn        # should be small [zero to first order]

H <- hessian(icons)
is_ok_hessian(H)