## Function to fit high-dimensional generalized linear mixed models.

### Description

Fits the solution for a high-dimensional generalized linear mixed model.

### Usage

 1 2 3 4 5 6 7 8 9 glmmlasso(y,...) ## Default S3 method: glmmlasso(y, group, X, Z = NULL, family = c("binomial", "poisson"), covStruct = c("Identity", "Diagonal", "Sym"), lambda, weights = NULL, coefInit = NULL, exactStep = FALSE, exactDeriv = FALSE, random = NULL, unpenalized = 1, ranInd = 1, control = glmmlassoControl(family = family), ...)

### Arguments

 y response variable of length n. group grouping variable of length n. X fixed-effects matrix as an n x p matrix. An intercept has to be included in the first column as (1,...,1). Z random-effects matrix as an n x q matrix, must be in sparse Matrix format (see package Matrix). family GLM family. Currently only "binomial" and "poisson" are implemented. covStruct Covariance Stucture to be used. "Identity" fits \bm{Σ}_{θ}=θ^2\bm{1}_q, i.e. one single covariance parameter for all random effects. "Diagonal" fits a diagonal matrix for \bm{Σ}_{θ}, i.e. for each random effect a different covariance parameter. "Sym" fits an unstructured symmetric matrix. lambda non-negative regularization parameter weights weights for the fixed-effects covariates: NA means no penalization, 0 means drop this covariate ; if given, the argument unpenalized is ignored. By default each covariate has weight 1 coefInit starting values. They must be of the same length as the model to be fitted, i.e. the number of variables p, the number of random-effects q and the number of covariance parameters stot must coincide. Otherwise a warning is issued. exactStep logical. Should the Armijo step include the update of the random effects u to ensure that the objective function strictly decreases? exactDeriv logical. Should the exact derivate be calculated or the derivative for fixed random effects u. random expression for the random-effects structure for non-correlated random effects of the form "(1|group)+(0+X2|group)+(0+X3|group)". It is used only for generating the corresponding Z matrix, and it dominates the Z matrix, i.e. if random and a Z matrix is given, the Z matrix corresponding to random is used. unpenalized indices as subset of \{1,...,p\} indicating which fixed-effects covariates are not subject to the \ell_1-penalty. Ignored if weights is given. ranInd indices of the random effects as subset of \{1,...,p\}. Only used for summary.glmmlasso. They only need to be specified if the random-effects covariates do not corespond to the first columns in X. control control parameters for the algorithm, see lmmlassoControl for the details ... not used.

### Details

All the details of the algorithm can be found in the article.

Concerning logLik, deviance, aic and bic, we have to be very careful when comparing them with other generalized linear mixed model functions. If we study a low-dimensional data example and set λ=0, the log-likelihood is calculated as given in the paper. Deviance, aic and bic are computed such that they coincide with the results from glmer from the lme4 package. The latter does not employ the standard definitions of deviance and log-likelihood function value. Nevertheless, the differences only depends on constants, and not the parameters.

### Value

A glmmlasso object is returned, for which coef,resid, fitted, logLik print, summary, plot methods exist.

 fixef fixed-effects parameter beta coefficients fixed-effects parameter beta theta covariance parameter estimates ranef random effects in sparse vector format u random effects in dense vector format objective Value of the objective function corresponding to the estimates logLik value of the log-likelihood function. See details. deviance value of the deviance function. See details. aic AIC. See details. bic BIC. See details. activeSet Indices of the non-zero fixed-effects coefficients. eta The linear predictor at the current values. mu The conditional expectation of the response at the current values. fitted The fitted values at the current values. lambda non-negative regularization parameter weights weights (possible adapted to the argument weights) data data. List with y, group, X and Z family GLM family used. ntot total number of observations p number of fixed-effects parameters N number of groups/clusters unpenalized indices of the non-penalized fixed-effects covariates ranInd indices of the random effects as subset of \{1,...,p\}. exactStep logical. If the Armijo step includes the update of the random effects u or not. exactDeriv logical. If the exact derivate has been calculated or if the derivative for fixed random effects u has been calculated. coefInit starting values used in the algorithm for beta, theta and u coefOut list with estimates in the form required for the argument coefInit convergence integer giving convergence information. Each time maxArmijo was reached, convergence is increased by 2. If maxIter was reached, convergence is increased by 1. nIter number of outer iterations. nIterPirls number of pirls evaluation within the outer iteration. Pirls-Evaluation within the Armijo steps are not counted. nfctEval number of function evaluation. See value fctEval. fctEval vector of all function values calculated during the algorithm. It may be interesting if studying the convergence behaviour of the algorithm. Only if argument fctSave=TRUE gradient gradient of the objective function with respect to the fixed-effects coefficients maxGrad maximal value of the gradient which has to be close to zero for convergence. maxArmijo the maximal value of l used in each fixed-effects component. control see lmmlassoControl resid response residuals. See McCullagh and Nelder (1989). pearResid pearson residuals. See McCullagh and Nelder (1989). respResid response residuals. See McCullagh and Nelder (1989). workResid working residuals. See McCullagh and Nelder (1989). devResid deviance residuals. See McCullagh and Nelder (1989). Var conditional variance of the response at the current values. See McCullagh and Nelder (1989). di contribution of each observation to the deviance. See McCullagh and Nelder (1989). gof goodness-of-fit criterion. For family="binomial", it is the in-sample missclassification rate. For family="poisson", it is the Pearson X2 statistic. See McCullagh and Nelder (1989). cpu cpu time needed for the algorithm.

### Author(s)

Juerg Schelldorfer, schell@stat.math.ethz.ch

### References

P. McCullagh and J. A. Nelder (1989), Generalized Linear Models, Chapman-Hall, London.

### Examples

 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 ## (1) Use glmmlasso on the xerop data set data(xerop) fit1 <- glmmlasso(y=xerop$y,group=xerop$group,X=xerop$X,Z=xerop$Z, family="binomial",covStruct="Identity",lambda=30) summary(fit1) plot(fit1) ## (2) Use the glmmlasso on a small simulated data set set.seed(142) N <- 40 ## number of groups p <- 6 ## number of covariates (including intercept) q <- 2 ## number of random effect covariates ni <- rep(10,N) ## observations per group ntot <- sum(ni) ## total number of observations group <- factor(rep(1:N,times=ni)) ## grouping variable beta <- c(0,1,-1,1,-1,0,0) # fixed-effects coefficients X <- cbind(1,matrix(rnorm(ntot*p),nrow=ntot)) ## fixed-effects design matrix Z <- t(glmer(rbinom(ntot,1,runif(ntot,0.3,1))~1+(1|group)+(0+X2|group), data=data.frame(X),family="binomial")@Zt) ## random-eff. design matrix bi <- c(rnorm(N,0,2),rnorm(N,0,1)) eta <- X%*%beta + Z%*%bi mu <- exp(eta)/(1+exp(eta)) y <- rbinom(ntot,1,mu@x) ## random effects model with diagonal covariance matrix fit2 <- glmmlasso(y=y, group=group, X=X, Z=Z, family="binomial", lambda=10, unpenalized=1:2, covStruct="Diagonal") summary(fit2) plot(fit2)

Questions? Problems? Suggestions? or email at ian@mutexlabs.com.

All documentation is copyright its authors; we didn't write any of that.