cgamm: Constrained Generalized Additive Mixed-Effects Model Fitting
In cgam: Constrained Generalized Additive Model

cgamm

R Documentation

Constrained Generalized Additive Mixed-Effects Model Fitting

Description

This routine is an addition to the main routine cgam in this package. A random-intercept component is included in a cgam model.

Usage

cgamm(formula, nsim = 0, family = gaussian(), cpar = 1.2, data = NULL, weights = NULL,
sc_x = FALSE, sc_y = FALSE, bisect = TRUE, reml = TRUE, nAGQ = 1L)

Arguments

`formula`	A formula object which gives a symbolic description of the model to be fitted. It has the form "response ~ predictor + (1\|id)", where id is the label for a group effect. For now, only gaussian responses are considered and this routine only includes a random-intercept effect. See `cgam` for more details.
`nsim`	The number of simulations used to get the cic parameter. The default is nsim = 0.
`family`	A parameter indicating the error distribution and link function to be used in the model. For now, the options are family = gaussian() and family = binomial().
`cpar`	A multiplier to estimate the model variance, which is defined as `\sigma^2 = SSR / (n - cpar * edf)`. SSR is the sum of squared residuals for the full model and edf is the effective degrees of freedom. The default is cpar = 1.2. The user-defined value must be between 1 and 2. See Meyer, M. C. and M. Woodroofe (2000) for more details.
`data`	An optional data frame, list or environment containing the variables in the model. The default is data = NULL.
`weights`	An optional non-negative vector of "replicate weights" which has the same length as the response vector. If weights are not given, all weights are taken to equal 1. The default is weights = NULL.
`sc_x`	Logical flag indicating if or not continuous predictors are normalized. The default is sc_x = FALSE.
`sc_y`	Logical flag indicating if or not the response variable is normalized. The default is sc_y = FALSE.
`bisect`	If bisect = TRUE, a 95 percent confidence interval will be found for the variance ratio parameter by a bisection method.
`reml`	If reml = TRUE, restricted maximum likelihood (REML) method will be used to find estimates instead of maximum likelihood estimation (MLE).
`nAGQ`	Integer scalar - the number of points per axis for evaluating the adaptive Gauss-Hermite approximation to the log-likelihood. Defaults to 1, corresponding to the Laplace approximation. Values greater than 1 produce greater accuracy in the evaluation of the log-likelihood at the expense of speed.

Value

`muhat`	The fitted fixed-effect term.
`ahat`	A vector of estimated random-effect terms.
`sig2hat`	Estimate of the variance (`\sigma^2`) of between-cluster error terms.
`siga2hat`	Estimate of the variance (`\sigma_a^2`) of within-cluster error terms.
`thhat`	Estimate of the ratio (`\theta`) of two variances.
`pv.siga2`	`p`-value of the test `H_0: \sigma_a^2=0`
`ci.siga2`	95 percent confidence interval for the variance of within-cluster error terms.
`ci.th`	95 percent confidence interval for ratio of two variances.
`ci.rho`	95 percent confidence interval for intraclass correlation coefficient.
`ci.sig2`	95 percent confidence interval for the variance of between-cluster error terms.
`call`	The matched call.

Author(s)

Xiyue Liao

Examples

# Example 1 (family = gaussian).

# simulate a balanced data set with 30 clusters
# each cluster has 30 data points
	n <- 30
	m <- 30

# the standard deviation of between cluster error terms is 1
# the standard deviation of within cluster error terms is 2
	sige <- 1
	siga <- 2

# generate a continuous predictor
	x <- 1:(m*n)
	for(i in 1:m) {
		x[(n*(i-1)+1):(n*i)] <- round(runif(n), 3)
	}
# generate a group factor
	group <- trunc(0:((m*n)-1)/n)+1

# generate the fixed-effect term
	mu <- 10*exp(10*x-5)/(1+exp(10*x-5))

# generate the random-intercept term asscosiated with each group
	avals <- rnorm(m, 0, siga)

# generate the response
	y <- 1:(m*n)
	for(i in 1:m){
		y[group == i] <- mu[group == i] + avals[i] + rnorm(n, 0, sige)
	}

# use REML method to fit the model
	ans <- cgamm(y ~ s.incr(x) + (1|group), reml=TRUE)
	summary(ans)
	anova(ans)

	muhat <- ans$muhat
	plot(x, y, col = group, cex = .6)
	lines(sort(x), mu[order(x)], lwd = 2)
	lines(sort(x), muhat[order(x)], col = 2, lty = 2, lwd = 2)
	legend("topleft", bty = "n", c("true fixed-effect term", "cgamm fit"),
	col = c(1, 2), lty = c(1, 2), lwd = c(2, 2))

# Example 2 (family = binomial).
# simulate a balanced data set with 20 clusters
# each cluster has 20 data points

  n <- 20
  m <- 20#
  N <- n*m

  # siga is the sd for the random intercept
  siga <- 1

# generate a group factor
  group <- trunc(0:((m*n)-1)/n)+1
  group <- factor(group)

# generate the random-intercept term asscosiated with each group
  avals <- rnorm(m,0,siga)

# generate the fixed-effect mean term: mu, systematic term: eta and the response: y
  x <- runif(m*n)
  mu <- 1:(m*n)
  y <- 1:(m*n)

  eta <- 2 * (1 + tanh(7 * (x - .8))) - 2
  eta0 <- eta
  for(i in 1:m){eta[group == i] <- eta[group == i] + avals[i]}
  for(i in 1:m){mu[group == i] <- 1 - 1 / (1 + exp(eta[group == i]))}
  for(i in 1:m){y[group == i] <- rbinom(n, size = 1, prob = mu[group == i])}
  dat <- data.frame(x = x, y = y, group = group)
  ansc <- cgamm(y ~ s.incr.conv(x) + (1|group),
  family = binomial(link = "logit"), reml = FALSE, data = dat)
  summary(ansc)
  anova(ansc)

cgam documentation built on June 8, 2025, 1:45 p.m.