gammi | R Documentation |
Fits generalized additive models (GAMs) and generalized additive mixed model (GAMMs) using lme4 as the tuning engine. Predictor groups can be manually input (default S3 method) or inferred from the model (S3 "formula" method). Smoothing parameters are treated as variance components and estimated using REML/ML (gaussian) or Laplace approximation to ML (others).
gammi(x, ...)
## Default S3 method:
gammi(x,
y,
group,
family = gaussian,
fixed = NULL,
random = NULL,
data = NULL,
REML = TRUE,
control = NULL,
start = NULL,
verbose = 0L,
nAGQ = 10L,
subset,
weights,
na.action,
offset,
mustart,
etastart,
...)
## S3 method for class 'formula'
gammi(formula,
data,
family = gaussian,
fixed = NULL,
random = NULL,
REML = TRUE,
control = NULL,
start = NULL,
verbose = 0L,
nAGQ = 10L,
subset,
weights,
na.action,
offset,
mustart,
etastart,
...)
x |
Model (design) matrix of dimension |
y |
Response vector of length |
group |
Group label vector (factor, character, or integer) of length |
formula |
Model formula: a symbolic description of the model to be fitted. Uses the same syntax as |
family |
Assumed exponential |
fixed |
For default method: a character vector specifying which |
random |
A one-sided formula specifying the random effects structure using lme4 syntax. See Note. |
data |
Optional data frame containing the variables referenced in |
REML |
Logical indicating whether REML versus ML should be used to tune the smoothing parameters and variance components. |
control |
List containing the control parameters (output from |
start |
List (with names) of starting parameter values for model parameters. |
verbose |
Postive integer that controls the level of output displayed during optimization. |
nAGQ |
Numer of adaptive Gaussian quadrature points. Only used for non-Gaussian responses with a single variance component. |
subset |
Optional expression indicating the subset of rows to use for the fitting (defaults to all rows). |
weights |
Optional vector indicating prior observations weights for the fitting (defaults to all ones). |
na.action |
Function that indicates how |
offset |
Optional vector indicating each observation's offset for the fitting (defaults to all zeros). |
mustart |
Optional starting values for the mean (fitted values). |
etastart |
Optional starting values for the linear predictors. |
... |
Optional arguments passed to the |
Fits a generalized additive mixed model (GAMM) of the form
g(\mu) = f(\mathbf{X}, \mathbf{Z}) + \mathbf{X}^\top \boldsymbol\beta + \mathbf{Z}^\top \boldsymbol\alpha
where
\mu = E(Y | \mathbf{X}, \mathbf{Z})
is the conditional expectation of the response Y
given the predictor vectors \mathbf{X} = (X_1, \ldots, X_p)^\top
and \mathbf{Z} = (Z_1, \ldots, Z_q)^\top
the function g(\cdot)
is a user-specified (invertible) link function
the function f(\cdot)
is an unknown smooth function of the predictors (specified by formula
)
the vector \mathbf{X}
is the fixed effects component of the design (specified by fixed
)
the vector \mathbf{Z}
is the random effects component of the design (specified by random
)
the vector \boldsymbol\beta
contains the unknown fixed effects coefficients
the vector \boldsymbol\alpha
contains the unknown Gaussian random effects
Note that the mean function f(\cdot)
can include main and/or interaction effects between any number of predictors. Furthermore, note that the fixed effects in \mathbf{X}^\top \boldsymbol\beta
and the random effects in \mathbf{Z}^\top \boldsymbol\alpha
are both optional.
An object of class "gammi"
with the following elements:
fitted.values |
model predictions on the data scale |
linear.predictors |
model predictions on the link scale |
coefficients |
coefficients used to make the predictions |
random.coefficients |
coefficients corresponding to the |
term.labels |
labels for the terms included in the |
dispersion |
estimated dispersion parameter = |
vcovchol |
Cholesky factor of covariance matrix such that |
family |
exponential family distribution (same as input) |
logLik |
log-likelihood for the solution |
aic |
AIC for the solution |
deviance |
model deviance, i.e., two times the negative log-likelihood |
null.deviance |
deviance of the null model, i.e., intercept only. Will be |
r.squared |
proportion of null deviance explained = |
nobs |
number of observations used in fit |
leverages |
leverage scores for each observation |
edf |
effective degrees of freedom = |
df.random |
degress of freedom corresponding to |
df.residual |
residual degrees of freedom = |
x |
input |
group |
character vector indicating which columns of |
scale |
numeric vector giving the scale parameter used to z-score each term's data |
fixed |
fixed effects terms (default method) or formula (formula method); will be |
random |
random effects formula |
mer |
object of class |
VarCorr |
data frame with variance and covariance parameter estimates from |
call |
function call |
data |
input data |
contrasts |
list of contrasts applied to |
spline.info |
list of spline parameters for terms in |
formula |
input model formula |
The random
argument uses standard lmer syntax:
(1 | g)
for a random intercept for each level of g
(1 | g1) + (1 | g2)
for random intercepts for g1 and g2
(1 | g1/g2) = (1 | g1) + (1 | g1:g2)
for random intercepts for g1 and g2 nested within g1
(x | g) = (1 + x | g)
for a correlated random intercept and slope of x for each level of g
(x || g) = (1 | g) + (0 + x | g)
for an uncorrelated random intercept and slope of x for each level of g
For stable computation, any terms entered through x
(default method) or formula
and/or fixed
(formula method) are z-scored prior to fitting the model. Note that terms entered through random
are not standardized.
The "mer"
component of the output contains the model fitting results for a z-scored version of the original data (i.e., this fit is on a different scale). Consequently, the "mer"
component should not be used for prediction and/or inference purposes. All prediction and inference should be conducted using the plot, predict, and summary methods mentioned in the ‘See Also’ section.
The "VarCorr"
component contains the estimated variance/covariance parameters transformed back to the original scale.
The model R-squared is the proportion of the null deviance that is explained by the model, i.e.,
r.squared = 1 - deviance / null.deviance
where deviance
is the deviance of the model, and null.deviance
is the deviance of the null model.
When the random
argument is used, null.deviance
and r.squared
will be NA
. This is because there is not an obvious null model when random effects are included, e.g., should the null model include or exclude the random effects? Assuming that is it possible to define a reasonable null.deviance
in such cases, the above formula can be applied to calculate the model R-squared for models that contain random effects.
Nathaniel E. Helwig <helwig@umn.edu>
Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.18637/jss.v067.i01")}
Helwig, N. E. (2024). Precise tensor product smoothing via spectral splines. Stats, 7(1), 34-53, \Sexpr[results=rd]{tools:::Rd_expr_doi("10.3390/stats7010003")}
plot.gammi
for plotting effects from gammi
objects
predict.gammi
for predicting from gammi
objects
summary.gammi
for summarizing results from gammi
objects
##############***############## EXAM EXAMPLE ##############***##############
# load 'gammi' package
library(gammi)
# load 'exam' help file
?exam
# load data
data(exam)
# header of data
head(exam)
# fit model
mod <- gammi(Exam.score ~ VRQ.score, data = exam,
random = ~ (1 | Primary.school) + (1 | Secondary.school))
# plot results
plot(mod)
# summarize results
summary(mod)
#############***############# GAUSSIAN EXAMPLE #############***#############
#~~~Example 1: Single Predictor ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# generate data
n <- 1000
x <- seq(0, 1, length.out = n)
fx <- sin(2 * pi * x)
set.seed(1)
y <- fx + rnorm(n)
# fit model via formula method
mod <- gammi(y ~ x)
mod
# fit model via default method
modmat <- spline.model.matrix(y ~ 0 + x)
tlabels <- attr(modmat, "term.labels")
tassign <- attr(modmat, "assign")
g <- factor(tlabels[tassign], levels = tlabels)
mod0 <- gammi(modmat, y, g)
mod0
# summarize fit model
summary(mod)
# plot function estimate
plot(mod)
#~~~Example 2: Additive Model ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# mean function
eta <- function(x, z, additive = TRUE){
mx1 <- cos(2 * pi * (x - pi))
mx2 <- 30 * (z - 0.6)^5
mx12 <- 0
if(!additive) mx12 <- sin(pi * (x - z))
mx1 + mx2 + mx12
}
# generate data
set.seed(1)
n <- 1000
x <- runif(n)
z <- runif(n)
fx <- eta(x, z)
y <- fx + rnorm(n)
# fit model via formula method
mod <- gammi(y ~ x + z)
mod
# fit model via default method
modmat <- spline.model.matrix(y ~ 0 + x + z)
tlabels <- attr(modmat, "term.labels")
tassign <- attr(modmat, "assign")
g <- factor(tlabels[tassign], levels = tlabels)
mod0 <- gammi(modmat, y, g)
mod0
# summarize fit model
summary(mod)
# plot function estimate
plot(mod)
#~~~Example 3: Interaction Model ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# mean function
eta <- function(x, z, additive = TRUE){
mx1 <- cos(2 * pi * (x - pi))
mx2 <- 30 * (z - 0.6)^5
mx12 <- 0
if(!additive) mx12 <- sin(pi * (x - z))
mx1 + mx2 + mx12
}
# generate data
set.seed(1)
n <- 1000
x <- runif(n)
z <- runif(n)
fx <- eta(x, z, additive = FALSE)
y <- fx + rnorm(n)
# fit model via formula method
mod <- gammi(y ~ x * z)
mod
# fit model via default method
modmat <- spline.model.matrix(y ~ 0 + x * z)
tlabels <- attr(modmat, "term.labels")
tassign <- attr(modmat, "assign")
g <- factor(tlabels[tassign], levels = tlabels)
mod0 <- gammi(modmat, y, g)
mod0
# summarize fit model
summary(mod)
# plot function estimate
plot(mod)
#~~~Example 4: Random Intercept ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# mean function
eta <- function(x, z, additive = TRUE){
mx1 <- cos(2 * pi * (x - pi))
mx2 <- 30 * (z - 0.6)^5
mx12 <- 0
if(!additive) mx12 <- sin(pi * (x - z))
mx1 + mx2 + mx12
}
# generate mean function
set.seed(1)
n <- 1000
nsub <- 50
x <- runif(n)
z <- runif(n)
fx <- eta(x, z)
# generate random intercepts
subid <- factor(rep(paste0("sub", 1:nsub), n / nsub),
levels = paste0("sub", 1:nsub))
u <- rnorm(nsub, sd = sqrt(1/2))
# generate responses
y <- fx + u[subid] + rnorm(n, sd = sqrt(1/2))
# fit model via formula method
mod <- gammi(y ~ x + z, random = ~ (1 | subid))
mod
# fit model via default method
modmat <- spline.model.matrix(y ~ 0 + x + z)
tlabels <- attr(modmat, "term.labels")
tassign <- attr(modmat, "assign")
g <- factor(tlabels[tassign], levels = tlabels)
mod0 <- gammi(modmat, y, g, random = ~ (1 | subid))
mod0
# summarize fit model
summary(mod)
# plot function estimate
plot(mod)
#############***############# BINOMIAL EXAMPLE #############***#############
#~~~Example 1: Single Predictor ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# generate data
n <- 1000
x <- seq(0, 1, length.out = n)
fx <- sin(2 * pi * x)
set.seed(1)
y <- rbinom(n = n, size = 1, prob = 1 / (1 + exp(-fx)))
# fit model
mod <- gammi(y ~ x, family = binomial)
mod
# summarize fit model
summary(mod)
# plot function estimate
plot(mod)
#~~~Example 2: Additive Model ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# mean function
eta <- function(x, z, additive = TRUE){
mx1 <- cos(2 * pi * (x - pi))
mx2 <- 30 * (z - 0.6)^5
mx12 <- 0
if(!additive) mx12 <- sin(pi * (x - z))
mx1 + mx2 + mx12
}
# generate data
set.seed(1)
n <- 1000
x <- runif(n)
z <- runif(n)
fx <- 1 + eta(x, z)
y <- rbinom(n = n, size = 1, prob = 1 / (1 + exp(-fx)))
# fit model
mod <- gammi(y ~ x + z, family = binomial)
mod
# summarize fit model
summary(mod)
# plot function estimate
plot(mod)
#~~~Example 3: Interaction Model ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# mean function
eta <- function(x, z, additive = TRUE){
mx1 <- cos(2 * pi * (x - pi))
mx2 <- 30 * (z - 0.6)^5
mx12 <- 0
if(!additive) mx12 <- sin(pi * (x - z))
mx1 + mx2 + mx12
}
# generate data
set.seed(1)
n <- 1000
x <- runif(n)
z <- runif(n)
fx <- eta(x, z, additive = FALSE)
y <- rbinom(n = n, size = 1, prob = 1 / (1 + exp(-fx)))
# fit model
mod <- gammi(y ~ x * z, family = binomial)
mod
# summarize fit model
summary(mod)
# plot function estimate
plot(mod)
#~~~Example 4: Random Intercept ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# mean function
eta <- function(x, z, additive = TRUE){
mx1 <- cos(2 * pi * (x - pi))
mx2 <- 30 * (z - 0.6)^5
mx12 <- 0
if(!additive) mx12 <- sin(pi * (x - z))
mx1 + mx2 + mx12
}
# generate mean function
set.seed(1)
n <- 1000
nsub <- 50
x <- runif(n)
z <- runif(n)
fx <- 1 + eta(x, z)
# generate random intercepts
subid <- factor(rep(paste0("sub", 1:nsub), n / nsub),
levels = paste0("sub", 1:nsub))
u <- rnorm(nsub, sd = sqrt(1/2))
# generate responses
y <- rbinom(n = n, size = 1, prob = 1 / (1 + exp(-(fx+u[subid]))))
# fit model
mod <- gammi(y ~ x + z, random = ~ (1 | subid), family = binomial)
mod
# summarize fit model
summary(mod)
# plot function estimate
plot(mod)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.