glmmlasso: Function to fit high-dimensional generalized linear mixed...
In glmmixedlasso: Generalized Linear Mixed Models with Lasso

Description Usage Arguments Details Value Author(s) References Examples

View source: R/glmmlasso.R

Fits the solution for a high-dimensional generalized linear mixed model.

glmmlasso(y,...)

## Default S3 method:
glmmlasso(y, group, X, Z = NULL,
          family = c("binomial", "poisson"),
          covStruct = c("Identity", "Diagonal", "Sym"),
          lambda, weights = NULL, coefInit = NULL, exactStep = FALSE,
          exactDeriv = FALSE, random = NULL, unpenalized = 1, ranInd = 1,
          control = glmmlassoControl(family = family), ...)

`y`	response variable of length n.
`group`	grouping variable of length n.
`X`	fixed-effects matrix as an n x p matrix. An intercept has to be included in the first column as (1,...,1).
`Z`	random-effects matrix as an n x q matrix, must be in sparse Matrix format (see package Matrix).
`family`	GLM family. Currently only "binomial" and "poisson" are implemented.
`covStruct`	Covariance Stucture to be used. "Identity" fits \bm{Σ}_{θ}=θ^2\bm{1}_q, i.e. one single covariance parameter for all random effects. "Diagonal" fits a diagonal matrix for \bm{Σ}_{θ}, i.e. for each random effect a different covariance parameter. "Sym" fits an unstructured symmetric matrix.
`lambda`	non-negative regularization parameter
`weights`	weights for the fixed-effects covariates: NA means no penalization, 0 means drop this covariate ; if given, the argument unpenalized is ignored. By default each covariate has weight 1
`coefInit`	starting values. They must be of the same length as the model to be fitted, i.e. the number of variables p, the number of random-effects q and the number of covariance parameters stot must coincide. Otherwise a warning is issued.
`exactStep`	logical. Should the Armijo step include the update of the random effects u to ensure that the objective function strictly decreases?
`exactDeriv`	logical. Should the exact derivate be calculated or the derivative for fixed random effects u.
`random`	expression for the random-effects structure for non-correlated random effects of the form "(1\|group)+(0+X2\|group)+(0+X3\|group)". It is used only for generating the corresponding Z matrix, and it dominates the Z matrix, i.e. if random and a Z matrix is given, the Z matrix corresponding to random is used.
`unpenalized`	indices as subset of \{1,...,p\} indicating which fixed-effects covariates are not subject to the \ell_1-penalty. Ignored if weights is given.
`ranInd`	indices of the random effects as subset of \{1,...,p\}. Only used for `summary.glmmlasso`. They only need to be specified if the random-effects covariates do not corespond to the first columns in X.
`control`	control parameters for the algorithm, see `lmmlassoControl` for the details
`...`	not used.

All the details of the algorithm can be found in the article.

Concerning logLik, deviance, aic and bic, we have to be very careful when comparing them with other generalized linear mixed model functions. If we study a low-dimensional data example and set λ=0, the log-likelihood is calculated as given in the paper. Deviance, aic and bic are computed such that they coincide with the results from glmer from the lme4 package. The latter does not employ the standard definitions of deviance and log-likelihood function value. Nevertheless, the differences only depends on constants, and not the parameters.

A glmmlasso object is returned, for which coef,resid, fitted, logLik print, summary, plot methods exist.

`fixef`	fixed-effects parameter beta
`coefficients`	fixed-effects parameter beta
`theta`	covariance parameter estimates
`ranef`	random effects in sparse vector format
`u`	random effects in dense vector format
`objective`	Value of the objective function corresponding to the estimates
`logLik`	value of the log-likelihood function. See details.
`deviance`	value of the deviance function. See details.
`aic`	AIC. See details.
`bic`	BIC. See details.
`activeSet`	Indices of the non-zero fixed-effects coefficients.
`eta`	The linear predictor at the current values.
`mu`	The conditional expectation of the response at the current values.
`fitted`	The fitted values at the current values.
`lambda`	non-negative regularization parameter
`weights`	weights (possible adapted to the argument weights)
`data`	data. List with y, group, X and Z
`family`	GLM family used.
`ntot`	total number of observations
`p`	number of fixed-effects parameters
`N`	number of groups/clusters
`unpenalized`	indices of the non-penalized fixed-effects covariates
`ranInd`	indices of the random effects as subset of \{1,...,p\}.
`exactStep`	logical. If the Armijo step includes the update of the random effects u or not.
`exactDeriv`	logical. If the exact derivate has been calculated or if the derivative for fixed random effects u has been calculated.
`coefInit`	starting values used in the algorithm for beta, theta and u
`coefOut`	list with estimates in the form required for the argument coefInit
`convergence`	integer giving convergence information. Each time maxArmijo was reached, convergence is increased by 2. If maxIter was reached, convergence is increased by 1.
`nIter`	number of outer iterations.
`nIterPirls`	number of pirls evaluation within the outer iteration. Pirls-Evaluation within the Armijo steps are not counted.
`nfctEval`	number of function evaluation. See value fctEval.
`fctEval`	vector of all function values calculated during the algorithm. It may be interesting if studying the convergence behaviour of the algorithm. Only if argument fctSave=TRUE
`gradient`	gradient of the objective function with respect to the fixed-effects coefficients
`maxGrad`	maximal value of the gradient which has to be close to zero for convergence.
`maxArmijo`	the maximal value of l used in each fixed-effects component.
`control`	see `lmmlassoControl`
`resid`	response residuals. See McCullagh and Nelder (1989).
`pearResid`	pearson residuals. See McCullagh and Nelder (1989).
`respResid`	response residuals. See McCullagh and Nelder (1989).
`workResid`	working residuals. See McCullagh and Nelder (1989).
`devResid`	deviance residuals. See McCullagh and Nelder (1989).
`Var`	conditional variance of the response at the current values. See McCullagh and Nelder (1989).
`di`	contribution of each observation to the deviance. See McCullagh and Nelder (1989).
`gof`	goodness-of-fit criterion. For family="binomial", it is the in-sample missclassification rate. For family="poisson", it is the Pearson X2 statistic. See McCullagh and Nelder (1989).
`cpu`	cpu time needed for the algorithm.

Juerg Schelldorfer, schell@stat.math.ethz.ch

P. McCullagh and J. A. Nelder (1989), Generalized Linear Models, Chapman-Hall, London.

## (1) Use glmmlasso on the xerop data set
data(xerop)
fit1 <- glmmlasso(y=xerop$y,group=xerop$group,X=xerop$X,Z=xerop$Z,
                  family="binomial",covStruct="Identity",lambda=30)
summary(fit1)
plot(fit1)

## (2) Use the glmmlasso on a small simulated data set
set.seed(142)

N <- 40           ## number of groups
p <- 6            ## number of covariates (including intercept)
q <- 2            ## number of random effect covariates
ni <- rep(10,N)   ## observations per group
ntot <- sum(ni)   ## total number of observations

group <- factor(rep(1:N,times=ni)) ## grouping variable

beta <- c(0,1,-1,1,-1,0,0) # fixed-effects coefficients

X <- cbind(1,matrix(rnorm(ntot*p),nrow=ntot)) ## fixed-effects design matrix
Z <- t(glmer(rbinom(ntot,1,runif(ntot,0.3,1))~1+(1|group)+(0+X2|group),
             data=data.frame(X),family="binomial")@Zt) 
## random-eff. design matrix

bi <- c(rnorm(N,0,2),rnorm(N,0,1))

eta <- X%*%beta + Z%*%bi
mu  <- exp(eta)/(1+exp(eta))
y   <- rbinom(ntot,1,mu@x)

## random effects model with diagonal covariance matrix
fit2 <- glmmlasso(y=y, group=group, X=X, Z=Z, family="binomial", lambda=10,
                  unpenalized=1:2, covStruct="Diagonal")
summary(fit2)
plot(fit2)