Description Usage Arguments Details Value Author(s) References See Also Examples
hglm
is used to fit hierarchical generalized linear models. It can be used for linear mixed models and
generalized linear models with random effects for a variety of links and a variety of distributions for both
the outcomes and the random effects. Fixed effects can also be fitted in the dispersion part of the model.
The function can be called either by specifying the design matrices or as a formula
.
1 2 3 4 5 6 7 8 | hglm(X = NULL, y = NULL, Z = NULL, family = gaussian(link = identity),
rand.family = gaussian(link = identity), method = "EQL",
conv = 1e-6, maxit = 50, startval = NULL, fixed = NULL,
random = NULL, X.disp = NULL, disp = NULL, link.disp = "log",
X.rand.disp = NULL, rand.disp = NULL, link.rand.disp = "log",
data = NULL, weights = NULL, fix.disp = NULL, offset = NULL,
RandC = ncol(Z), sparse = TRUE, vcovmat = FALSE,
calc.like = FALSE, bigRR = FALSE, verbose = FALSE, ...)
|
X |
|
y |
|
Z |
|
family |
|
rand.family |
|
method |
|
conv |
|
maxit |
|
startval |
|
fixed |
|
random |
|
X.disp |
|
disp |
|
link.disp |
|
X.rand.disp |
|
rand.disp |
|
link.rand.disp |
|
data |
|
weights |
|
fix.disp |
|
offset |
An offset for the linear predictor of the mean model. |
RandC |
|
sparse |
|
vcovmat |
|
calc.like |
logical. If |
bigRR |
logical. If |
verbose |
logical. If |
... |
not used. |
Models for hglm
are either specified symbolically using formula
or by specifying the design matrices ( X
, Z
and X.disp
). The extended quasi likelihood (EQL)
method is the default method for estimation of the model parameters. For the Gaussian-Gaussian linear mixed models, it
is REML. It should be noted that the EQL estimator can be biased and inconsistent in some special cases e.g. binary pair matched response. A higher order correction
might be useful to correct the bias of EQL (Lee et al. 2006). There is also an EQL1
option, which improves estimation for GLMMs (especially for Poisson models with a large number of levels in the random effects). The EQL1
method computes estimates by adjusting the working response as described in the appendix of Lee and Lee (2012).
By default, the dispersion
parameter is estimated by the hglm
and hglm2
functions. If the dispersion parameter of the mean model is to be held constant, for example if it is
desired to be 1 for binomial and Poisson family, then fix.disp
=value where, value=1 for the above example, should be used.
Interpretation of warning messages
Remove all NA before input to the hglm function.
- This message is important and tells the user to delete all lines with missing values from the input data.
Residuals numerically 0 are replaced by 1e-8. or
Hat-values numerically 1 are replaced by 1 - 1e-8.
- These messages are often not important as they usually reflect a numerical issue in an intermediate step of the iterative fitting algorithm. However, it is a good idea to check that there are no hat values equal to 1 in the final output.
It returns an object of class hglm
consiting of the following values.
fixef |
fixed effect estimates. |
ranef |
random effect estimates. |
RandC |
integers (possibly a vector) specified the number of column of Z to be used for each of the random-effect terms. |
varFix |
dispersion parameter of the mean model (residual variance for LMM). |
varRanef |
dispersion parameter of the random effects (variance of random effects for GLMM). |
CAR.rho |
parameter estimate for a MRF spatial model. |
CAR.tau |
parameter estimate for a MRF spatial model. |
iter |
number of iterations used. |
Converge |
specifies if the algorithm converged. |
SeFe |
standard errors of fixed effects. |
SeRe |
standard errors of random effects. |
dfReFe |
deviance degrees of freedom for the mean part of the model. |
SummVC1 |
estimates and standard errors of the linear predictor in the dispersion model. |
SummVC2 |
estimates and standard errors of the linear predictor for the dispersion parameter of the random effects. |
dev |
individual deviances for the mean part of the model. |
hv |
hatvalues for the mean part of the model. |
resid |
studentized residuals for the mean part of the model. |
fv |
fitted values for the mean part of the model. |
disp.fv |
fitted values for the dispersion part of the model. |
disp.resid |
standardized deviance residuals for the dispersion part of the model. |
link.disp |
link function for the dispersion part of the model. |
vcov |
the variance-covariance matrix. |
likelihood |
a list of log-likelihood values for model selection purposes, where |
bad |
the index of the influential observation. |
Moudud Alam, Lars Ronnegard, Xia Shen
Lars Ronnegard, Xia Shen and Moudud Alam (2010). hglm: A Package for Fitting Hierarchical Generalized Linear Models. The R Journal, 2(2), 20-28.
Youngjo Lee, John A Nelder and Yudi Pawitan (2006) Generalized Linear Models with Random Effect: a unified analysis via h-likelihood. Chapman and Hall/CRC.
Xia Shen, Moudud Alam, Freddy Fikse and Lars Ronnegard (2013). A novel generalized ridge regression method for quantitative genetics. Genetics 193(4), ?1255-1268.
Moudud Alam, Lars Ronnegard, Xia Shen (2014). Fitting conditional and simultaneous autoregressive spatial models in hglm. Submitted.
Woojoo Lee and Youngjo Lee (2012). Modifications of REML algorithm for hglms. Statistics and Computing 22, 959-966.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 | # Find more examples and instructions in the package vignette:
# vignette('hglm')
require(hglm)
# --------------------- #
# semiconductor example #
# --------------------- #
data(semiconductor)
m11 <- hglm(fixed = y ~ x1 + x3 + x5 + x6,
random = ~ 1|Device,
family = Gamma(link = log),
disp = ~ x2 + x3, data = semiconductor)
summary(m11)
plot(m11, cex = .6, pch = 1,
cex.axis = 1/.6, cex.lab = 1/.6,
cex.main = 1/.6, mar = c(3, 4.5, 0, 1.5))
# ------------------- #
# redo it using hglm2 #
# ------------------- #
m12 <- hglm2(y ~ x1 + x3 + x5 + x6 + (1|Device),
family = Gamma(link = log),
disp = ~ x2 + x3, data = semiconductor)
summary(m12)
# -------------------------- #
# redo it using matrix input #
# -------------------------- #
attach(semiconductor)
m13 <- hglm(y = y, X = model.matrix(~ x1 + x3 + x5 + x6),
Z = kronecker(diag(16), rep(1, 4)),
X.disp = model.matrix(~ x2 + x3),
family = Gamma(link = log))
summary(m13)
# --------------------- #
# verbose & likelihoods #
# --------------------- #
m14 <- hglm(fixed = y ~ x1 + x3 + x5 + x6,
random = ~ 1|Device,
family = Gamma(link = log),
disp = ~ x2 + x3, data = semiconductor,
verbose = TRUE, calc.like = TRUE)
summary(m14)
# --------------------------------------------- #
# simulated example with 2 random effects terms #
# --------------------------------------------- #
## Not run:
set.seed(911)
x1 <- rnorm(100)
x2 <- rnorm(100)
x3 <- rnorm(100)
z1 <- factor(rep(LETTERS[1:10], rep(10, 10)))
z2 <- factor(rep(letters[1:5], rep(20, 5)))
Z1 <- model.matrix(~ 0 + z1)
Z2 <- model.matrix(~ 0 + z2)
u1 <- rnorm(10, 0, sqrt(2))
u2 <- rnorm(5, 0, sqrt(3))
y <- 1 + 2*x1 + 3*x2 + Z1%*%u1 + Z2%*%u2 + rnorm(100, 0, sqrt(exp(x3)))
dd <- data.frame(x1 = x1, x2 = x2, x3 = x3, z1 = z1, z2 = z2, y = y)
(m21 <- hglm(X = cbind(rep(1, 100), x1, x2), y = y, Z = cbind(Z1, Z2),
RandC = c(10, 5)))
summary(m21)
plot(m21)
# m21 is the same as:
(m21b <- hglm(X = cbind(rep(1, 100), x1, x2), y = y, Z = cbind(Z1, Z2),
rand.family = list(gaussian(), gaussian()), RandC = c(10, 5))
(m22 <- hglm2(y ~ x1 + x2 + (1|z1) + (1|z2), data = dd, vcovmat = TRUE))
image(m22$vcov, main = 'Variance-covariance Matrix')
summary(m22)
plot(m22)
m31 <- hglm2(y ~ x1 + x2 + (1|z1) + (1|z2), disp = ~ x3, data = dd)
print (m31)
summary(m31)
plot(m31)
# ------------------------------- #
# Markov random field (MRF) model #
# ------------------------------- #
data(cancer)
logE <- log(E)
X11 <- model.matrix(~Paff)
m41 <- hglm(X = X11, y = O, Z = diag(length(O)),
family = poisson(), rand.family = CAR(D = nbr),
offset = logE, conv = 1e-9, maxit = 200, fix.disp = 1)
summary(m41)
data(ohio)
m42 <- hglm(fixed = MedianScore ~ 1,
random = ~ 1 | district,
rand.family = CAR(D = ohioDistrictDistMat),
data = ohioMedian)
summary(m42)
require(sp)
districtShape <- as.numeric(substr(as.character(ohioShape@data$UNSDIDFP), 3, 7))
CARfit <- matrix(m42$ranef + m42$fixef, dimnames = list(rownames(ohioDistrictDistMat), NULL))
ohioShape@data$CAR <- CARfit[as.character(districtShape),]
ohioShape@data$CAR[353] <- NA # remove estimate of Lake Erie
spplot(ohioShape, zcol = "CAR", main = "Fitted values from CAR",
col.regions = heat.colors(1000)[1000:1], cuts = 1000)
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.