objfun | R Documentation |
A function factory that returns an R function that evaluates the objective function for fitting aster models with random effects
objfun.factory(fixed, random, response, pred, fam, root, zwz,
famlist = fam.default(), offset, standard.deviation = FALSE,
deriv = 0)
fixed |
the model matrix for fixed effects. |
random |
either a model matrix or list of model matrices. Each model matrix operates on the random effects for one variance component. See Details section for more. |
response |
a vector specifying the response. Length is the row dimension of all model matrices. |
pred |
an integer vector of length |
fam |
an integer vector of length |
root |
a vector whose length is the row dimension of all model matrices. For nodes whose predecessors are root nodes specifies the value of the constant at that root node. Typically the vector having all components equal to one. Always positive integer valued. If components greater than one, then graph for that "individual" in scare quotes is actually for data for multiple individuals. |
zwz |
a positive definite symmetric matrix (neither symmetry nor
positive definiteness is checked, but incorrect answers with no error
or warning will result if condition is not met)
whose dimensions are the number of random effects. Should be
|
famlist |
a list of family specifications (see |
offset |
a vector whose length is the row dimension of all model
matrices. Distinguished point in parameter space. May be missing,
in which case an unspecified default is provided. See details of
|
standard.deviation |
scalar logical. If |
deriv |
0, 1, or 2. Number of derivatives. Zero means only the function value, one means gradient, two means Hessian. |
See the help page for the function aster
for specification
of aster models. This function only fits unconditional aster models
(those with default values of the aster
function arguments
type
and origin.type
.
The only difference between this function and functions aster
and mlogl
is
that some effects (also called coefficients, also called canonical affine
submodel canonical parameters) are treated as random.
The unconditional canonical
parameter vector of the (saturated) aster model is treated as
an affine function of fixed and random effects
\varphi = a + M \alpha + \sum_{i = 1}^k Z_i b_i
where M
and the Z_i
are model matrices specified by
the arguments fixed
and random
, where \alpha
is a vector of
fixed effects and each b_i
is a vector of random
effects that are assumed to be (marginally) normally distributed with
mean vector zero and variance matrix \nu_i
times
the identity matrix. The random effects vectors b_i
are
assumed independent.
In what follows, Z
is the single matrix produced
by cbind
'ing the Z_i
,
and b
is the single vector produced by c
'ing
the b_i
,
and \nu
is the single vector produced by c
'ing the
\nu_i
, and D
is the variance-covariance matrix
of b
, which is diagonal with all diagonal components equal to
some component of \nu
.
We can write
D = \sum_k \nu_k E_k
where the nus are unknown parameters to estimate and the Es are known
diagonal matrices whose components are zero-or-one-valued having the
property that they sum to the identity matrix and multiply to the zero
matrix (so each component of b
is exactly one component of
\nu
).
The R function returned by this function evaluates
p_W(\alpha, b, \nu) =
- l(\varphi) + \tfrac{1}{2} b^T D^{-1} b + \tfrac{1}{2}
\log\mathop{\rm det}\left(Z^T W Z D + \text{Id} \right)
where l
is the log likelihood for the saturated aster model specified
by arguments pred
, fam
, and famlist
,
where \varphi
is given by the displayed equation above,
and where Z^T W Z
is argument zwz
.
This is equation (7) in Geyer, et al. (2013).
The displayed equation above only defines p_W
when all of the
variance components (components of \nu
and diagonal components
of D
) are strictly positive because otherwise D
is not
invertible. In order to be lower semicontinuous, the R function returned
by this function returns Inf
whenever any component of \nu
is nonpositive, except it returns zero if b_i = 0
whenever
the corresponding diagonal element of D
is zero.
When argument standard.deviation
is TRUE
, the arguments
to the objective function are replaced by the change-of-variables
\nu = \sigma^2
and
b = A c
where the first equation works componentwise (as in R vector arithmetic),
components of the vector \sigma
are allowed to be
negative, and D = A^2
so
A = \sum_k \sigma_k E_k
When the (\alpha. c, \sigma)
variables are used,
p_W
is continuous and infinitely differentiable for all values
of these variables.
When the (\alpha. b, \nu)
variables are used,
p_W
is continuous and infinitely differentiable for all values
of \alpha
and b
but only strictly positive values
\nu
. When some components of \nu
are zero,
then subderivatives exist, but not derivatives (Geyer, et al., 2013).
returns an R function having one argument which is a numeric vector
of all the parameters and random effects in the order
(\alpha, b, \nu)
for one set of variables or
(\alpha, c, \sigma)
for other set.
The R function returned by this function evaluates the function p_W
defined in the Details section and returns a list with components named
value
, gradient
, and hessian
in case the deriv
argument is 2. The returned list has only the first two of these in case
the deriv
argument is 1.
The returned list has only the first one of these in case
the deriv
argument is 0.
In case the (\alpha, b, \nu)
parameterization is
being used and some components of \nu
are zero, the derivatives
returned are for the terms of the objective function that are differentiable.
The negative binomial and truncated negative binomial families are fundamentally
incompatible with random effects. The reason is that the canonical parameter
space for a one-parameter negative binomial or truncated negative binomial
is the negative half line. Thus the conditional canonical parameter
\theta
for such a node must be negative valued. The aster
transform is so complicated that it is unclear what the corresponding
constraint on the unconditional canonical parameter \varphi
is,
but there is a constraint: its parameter space is not the whole real line.
A normal random effect, in contrast, does have support the whole real line.
It wants to make parameters that are constrained to have any real number.
The code only warns about this situation, because if the random effects do
not influence any negative binomial or truncated negative binomial nodes
of the graph, then there would be no problem.
The Breslow-Clayton approximation assumes the complete data log likelihood is approximately quadratic considered as a function of random effects only. This will be the case by the law of large numbers if the number of individuals is much larger than the number of random effects. Thus Geyer, et al. (2013) warn against trying to put a random effect for each individual in the model. If you do that, the code will try to fit the model, but it will take forever and no theory says the results will make any sense.
Breslow, N. E., and Clayton, D. G. (1993). Approximate Inference in Generalized Linear Mixed Models. Journal of the American Statistical Association, 88, 9–25. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/01621459.1993.10594284")}.
Geyer, C. J., Ridley, C. E., Latta, R. G., Etterson, J. R., and Shaw, R. G. (2013) Local Adaptation and Genetic Effects on Fitness: Calculations for Exponential Family Models with Random Effects. Annals of Applied Statistics, 7, 1778–1795. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1214/13-AOAS653")}.
library(aster)
data(radish)
pred <- c(0,1,2)
fam <- c(1,3,2)
rout <- reaster(resp ~ varb + fit : (Site * Region),
list(block = ~ 0 + fit : Block, pop = ~ 0 + fit : Pop),
pred, fam, varb, id, root, data = radish)
objfun <- with(rout, objfun.factory(fixed, random, response,
obj$pred, obj$fam, as.vector(obj$root), zwz))
theta.hat <- with(rout, c(alpha, b, nu))
all.equal(objfun(theta.hat)$value, rout$deviance / 2)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.