Description Usage Arguments Details Algorithm Author(s)
gtGlm
is used to fit generalized linear models, specified by
giving a symbolic description of the linear predictor and a
description of the error distribution. The syntax is modeled after
that of glm
.
For more information regarding the helper functions, see the
description of Make
and Argument
functions in the
general documentation on the Grokit system.
1 2 3 4 5 6 7 8 | glm.data(data, ..., outputs = result, force.frame = FALSE)
GLM(data, ..., model = NULL, outputs = result)
GLMMake(formula, family = gaussian, weights = NULL, start = NULL,
eta.start = NULL, mu.start = NULL, offset = NULL,
maxit = 25, epsilon = 1e-8, trace = FALSE, debug = FALSE,
convergence = "relative", model = TRUE, ...)
|
data |
an object of class |
formula |
an object of class |
family |
the expected error distribution of the data and the link
function. This argument can be a character string naming a family
(e.g. |
weights |
an optional description of ‘prior weights’ to
be used. Unlike |
start |
an optional vector that specifies starting values for the parameters in the linear predictor. |
etastart |
an optional specification of starting values for the linear
predictor. The required format is equivalent to that of
|
mustart |
an optional specification of starting values for the
predicted means. The required format is equivalent to that of
|
offset |
an optional specification of a known component to be
added to the linear predictor during fitting. The required
format is equivalent to that of |
maxit |
the maximal number of IWLS iterations to be used. In
|
epsilon |
the maximum absolute change of parameters allowed for convergence. See section ‘Algorithm’ for more information. |
trace |
logical indicating whether every iteration of the coefficient vector should be returned. |
debug |
logical indicating whether sample calculations should be outputed to the terminal. |
convergence |
the type of convergence to be tested for, either |
A typical formula has the form response ~ terms
where both
response
and terms
are expressions
with
the additional function I
allowed. Additionally,
terms
is allowed the additional binary operator :
. See
the ‘Details’ of glm
for more information about
I
, :
, +
, and *
.
Unlike glm
, a binomial model is specified exactly the same as
other models. To specify number of trials, simply include them in the
weights
, which is easily accomplished because weights
is
allowed to be a mathematical expression. Furthermore, the response
must be the proportion of successes.
For example, let S
denote a vector of number of successes;
F
, a vector of number of failures; W
, a vector of
weights. In the implementation of glm
, this could be called as
glm(cbind(S, F) ~ [formula], family = binomial, weights =
W)
. To form an equivalent model using gtGlm
, use
gtGlm(S/(S + F) ~ [formula], family = binomial, weights = (F +
S) * W)
. Here, [formula]
represents an arbitrary formula of
covariates.
As the algorithm for maximimizing the log likelihood is different from
that of glm
, a brief outline is given:
1. η_i = \textbf{x}_i \cdot \textbf{β}_j - The linear predictor is the dot product of the input vector (which is formed from the formula and the data) and the current iteration of the coefficient vector.
2. \hat{μ}_i = g^{-1}≤ft(η_i\right) - the predict mean is the inverse link function of the linear predictor.
3. z_i = η_i + ≤ft(y_i - \hat{μ}_i \right) \cdot \frac{dη_i}{d μ_i} - The working dependent variable is computed.
4. w_i = \frac{p_i}{\text{var}≤ft(μ_i\right) \cdot
≤ft(\frac{dη_i}{d μ_i}\right)^2} - the iterative weight
calculated from the prior weights (the weights
argument) and
the variance function specified by the family.
5. \textbf{X}^\textbf{T} \textbf{WX} \stackrel{+}{=} w_i \cdot \textbf{x}_i^\textbf{T} \textbf{x} - The weights matrix is updated for that item.
6. \textbf{X}^\textbf{T} \textbf{Wz} \stackrel{+}{=} w_i \cdot z_i \cdot \textbf{x}_i - The response matrix is updated for that item.
7. \boldsymbol β_{j+1} = ≤ft( \textbf{X}^\textbf{T} \textbf{WX} \right)^{-1} \cdot \textbf{X}^\textbf{T} \textbf{Wz} - the next iteration of the coefficient vector is computed. Note: ( \cdot )^{-1} denotes the Moore-Penrose psuedoinverse, not the standard matrix inverse.
8. If \max\limits_i ≤ft|β_j[i] - β_{j+1}[i]\right| < ε, then the IWLS has converged and deviance is calculated. Otherwise, another iteration is performed.
Jon Claus, <jonterainsights@gmail.com>, Tera Insights LLC
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.