gtGlm is used to fit generalized linear models, specified by
giving a symbolic description of the linear predictor and a
description of the error distribution. The syntax is modeled after
that of glm.
For more information regarding the helper functions, see the
description of Make and Argument functions in the
general documentation on the Grokit system.
glm.data(data, ..., outputs = result, force.frame = FALSE)
GLM(data, ..., model = NULL, outputs = result)
GLMMake(formula, family = gaussian, weights = NULL, start = NULL,
        eta.start = NULL, mu.start = NULL, offset = NULL,
        maxit = 25, epsilon = 1e-8, trace = FALSE, debug = FALSE,
        convergence = "relative", model = TRUE, ...)
data |
an object of class |
formula |
an object of class |
family |
the expected error distribution of the data and the link
function. This argument can be a character string naming a family
(e.g. |
weights |
an optional description of ‘prior weights’ to
be used. Unlike |
start |
an optional vector that specifies starting values for the parameters in the linear predictor. |
etastart |
an optional specification of starting values for the linear
predictor. The required format is equivalent to that of
|
mustart |
an optional specification of starting values for the
predicted means. The required format is equivalent to that of
|
offset |
an optional specification of a known component to be
added to the linear predictor during fitting. The required
format is equivalent to that of |
maxit |
the maximal number of IWLS iterations to be used. In
|
epsilon |
the maximum absolute change of parameters allowed for convergence. See section ‘Algorithm’ for more information. |
trace |
logical indicating whether every iteration of the coefficient vector should be returned. |
debug |
logical indicating whether sample calculations should be printed to the terminal. |
convergence |
the type of convergence to be tested for, either |
A typical formula has the form response ~ terms where both
response and terms are expressions with
the additional function I allowed. Additionally,
terms is allowed the additional binary operator :. See
the ‘Details’ of glm for more information about
I, :, +, and *.
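As a rough illustration of how I and : expand into model columns, the sketch below uses base R's model.matrix rather than gtGlm itself. The data frame d and its columns x and z are made up for this example, and gtGlm's own formula handling is assumed, not confirmed, to follow the same conventions.

```r
# Hypothetical data, used only to show how the formula terms expand.
d <- data.frame(x = c(1, 2, 3, 4), z = c(0.5, 1.5, 2.5, 3.5))

# I(x^2) protects the arithmetic so that ^ means "square", not a formula
# operator; x:z adds only the interaction column (no extra main effects).
mm <- model.matrix(~ x + I(x^2) + x:z, data = d)

colnames(mm)
```

The resulting matrix has one column per term plus the intercept, with the interaction column holding the element-wise product of x and z.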
Unlike glm, a binomial model is specified exactly the same as
other models. To specify number of trials, simply include them in the
weights, which is easily accomplished because weights is
allowed to be a mathematical expression. Furthermore, the response
must be the proportion of successes.
For example, let S denote a vector of number of successes;
F, a vector of number of failures; W, a vector of
weights. In the implementation of glm, this could be called as
glm(cbind(S, F) ~ [formula], family = binomial, weights =
W). To form an equivalent model using gtGlm, use
gtGlm(S/(S + F) ~ [formula], family = binomial, weights = (F +
S) * W). Here, [formula] represents an arbitrary formula of
covariates.
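The equivalence of the two parameterizations can be checked with base R alone. The sketch below fits both forms with stats::glm, since gtGlm needs a running Grokit system; the data values and the single covariate x are invented for illustration.

```r
# Made-up data for illustration only.
x <- c(0.5, 1.0, 1.5, 2.0, 2.5, 3.0)
S <- c(1, 3, 4, 6, 8, 9)   # number of successes
F <- c(9, 7, 6, 4, 2, 1)   # number of failures
W <- c(1, 2, 1, 2, 1, 2)   # prior weights

# Counts form, as glm accepts it.
fit_counts <- glm(cbind(S, F) ~ x, family = binomial, weights = W)

# Proportion form with the trials folded into the weights,
# which is the form described above for gtGlm.
fit_prop <- glm(S / (S + F) ~ x, family = binomial, weights = (F + S) * W)

# The two coefficient vectors should agree up to numerical tolerance.
all.equal(coef(fit_counts), coef(fit_prop))
```

Because glm multiplies the prior weights by the number of trials when given a two-column response, the second call describes the same likelihood as the first.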
As the algorithm for maximizing the log likelihood is different from
that of glm, a brief outline is given:
1. η_i = \textbf{x}_i \cdot \textbf{β}_j - The linear predictor is the dot product of the input vector (which is formed from the formula and the data) and the current iteration of the coefficient vector.
2. \hat{μ}_i = g^{-1}\left(η_i\right) - The predicted mean is the inverse link function applied to the linear predictor.
3. z_i = η_i + \left(y_i - \hat{μ}_i\right) \cdot \frac{dη_i}{dμ_i} - The working dependent variable is computed.
4. w_i = \frac{p_i}{\text{var}\left(μ_i\right) \cdot \left(\frac{dη_i}{dμ_i}\right)^2} - The iterative weight is calculated from the prior weights p_i (the weights argument) and the variance function specified by the family.
5. \textbf{X}^\textbf{T} \textbf{WX} \stackrel{+}{=} w_i \cdot \textbf{x}_i^\textbf{T} \textbf{x}_i - The weights matrix is updated for that item.
6. \textbf{X}^\textbf{T} \textbf{Wz} \stackrel{+}{=} w_i \cdot z_i \cdot \textbf{x}_i - The response matrix is updated for that item.
7. \textbf{β}_{j+1} = \left( \textbf{X}^\textbf{T} \textbf{WX} \right)^{-1} \cdot \textbf{X}^\textbf{T} \textbf{Wz} - The next iteration of the coefficient vector is computed. Note: ( \cdot )^{-1} denotes the Moore-Penrose pseudoinverse, not the standard matrix inverse.
8. If \max\limits_i \left|β_j[i] - β_{j+1}[i]\right| < ε, then the IWLS has converged and the deviance is calculated. Otherwise, another iteration is performed.
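The steps above can be sketched in a few lines of base R. This is a minimal single-machine illustration, not the Grokit implementation: the names pinv and iwls_sketch are hypothetical, the zero starting vector is an assumption, and the link, derivative and variance functions are taken from a standard R family object.

```r
# Moore-Penrose pseudoinverse via SVD (hypothetical helper, base R only).
pinv <- function(A, tol = 1e-10) {
  s <- svd(A)
  keep <- s$d > tol * max(s$d)          # drop near-zero singular values
  s$v[, keep, drop = FALSE] %*% (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

# Hypothetical sketch of the item-by-item IWLS outline above.
iwls_sketch <- function(X, y, family = gaussian(), prior = rep(1, nrow(X)),
                        maxit = 25, epsilon = 1e-8) {
  beta <- rep(0, ncol(X))               # assumed starting values
  for (iter in seq_len(maxit)) {
    XtWX <- matrix(0, ncol(X), ncol(X))
    XtWz <- rep(0, ncol(X))
    for (i in seq_len(nrow(X))) {
      xi   <- X[i, ]
      eta  <- sum(xi * beta)                               # step 1
      mu   <- family$linkinv(eta)                          # step 2
      deta <- 1 / family$mu.eta(eta)                       # d eta / d mu
      z    <- eta + (y[i] - mu) * deta                     # step 3
      w    <- prior[i] / (family$variance(mu) * deta^2)    # step 4
      XtWX <- XtWX + w * outer(xi, xi)                     # step 5
      XtWz <- XtWz + w * z * xi                            # step 6
    }
    beta_new  <- drop(pinv(XtWX) %*% XtWz)                 # step 7
    converged <- max(abs(beta - beta_new)) < epsilon       # step 8
    beta <- beta_new
    if (converged) break
  }
  list(coefficients = beta, iterations = iter, converged = converged)
}

# Usage on made-up binomial data: proportions with 10 trials per row.
X <- cbind(1, c(0.5, 1.0, 1.5, 2.0, 2.5, 3.0))
y <- c(0.1, 0.3, 0.4, 0.6, 0.8, 0.9)
fit <- iwls_sketch(X, y, family = binomial(), prior = rep(10, 6))
fit$coefficients
```

Accumulating the two sums one item at a time means only a p-by-p matrix and a length-p vector need to be kept per pass, which is presumably what makes the scheme suitable for large data; the sketch only tests the absolute change in coefficients described in step 8.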
Jon Claus, <jonterainsights@gmail.com>, Tera Insights LLC