lmmlasso: Function to fit high-dimensional (gaussian) linear...
In lmmlasso: Linear mixed-effects models with Lasso

Description Usage Arguments Details Value Author(s) References Examples

Fits the solution for a high-dimensional (gaussian) linear mixed-effects models

lmmlasso(x, ...)

## Default S3 method:
lmmlasso(x, y, z = x, grp, weights = NULL,coefInit=NULL, lambda,
 startValue = 1, nonpen = 1:dim(z)[[2]], pdMat = c("pdIdent", "pdDiag","pdSym"),
 method = "ML", CovOpt = c("nlminb","optimize"), stopSat = TRUE,
 standardize = TRUE, control = lmmlassoControl(),ranInd=1:dim(z)[[2]],...)

`x`	matrix of dimension ntot x p including the fixed-effects covariables. An intercept has to be included in the first column as (1,...,1).
`y`	response variable of length ntot.
`z`	random effects matrix of dimension ntot x q. It has to be a matrix, even if q=1.
`grp`	grouping variable of length ntot
`weights`	weights for the fixed-effects covariates: NA means no penalization, 0 means drop this covariate ; if given, the argument nonpen is ignored. By default each covariate has weight 1
`coefInit`	list with three components used as starting values for the fixed effects, the random effects variance components and the error standard deviation.
`lambda`	positive regularization parameter
`startValue`	Choice of the starting values for the fixed effects using linear regression. 1 means 10-fold cross-validation with L1-penalty, 2 means 10-fold cross-validation Ridge Regression and 0 means that all the covariates are set to zero and the intercept is the mean of the response variable
`nonpen`	index of fixed effects with no penalization, ignored if the argument weights is specified, default is 1, which means that only the intercept (the first column in X )is not penalized.
`pdMat`	Covariance structure for the random effects. pdIdent, pdDiag and pdSym are already implemented. Default to pdIdent. pdIdent: b_i \sim \mathcal{N}(0,θ^2 I) (1 parameter), pdDiag: b_i \sim \mathcal{N}(0,diag(θ_1,…,θ_q)) (q parameters), pdSym: b_i \sim \mathcal{N}(0,Ψ) where Ψ is symmetric positive definit (q(q+1)/2 parameters)
`method`	Only "ML" is allowed. "REML" is not yet implemented.
`CovOpt`	which optimization routine should be used for updating the variance components parameters. optimize or nlminb. nlminb uses the estimate of the last iteration as a starting value. nlminb is faster if there are many Gauss-Seidel iterations.
`stopSat`	logical. Should the algorithm stop when ntot > p?
`standardize`	Should the x matrix be standardized such that each column has mean 0 and standard deviation 1? Be careful if the x matrix includes dummy variables.
`control`	control parameters for the algorithm and the Armijo Rule, see `lmmlassoControl` for the details
`ranInd`	Index of the random effects with respect to the x matrix
`...`	not used.

All the details of the algorithm can be found in Schelldorfer et. al. (2010).

A lmmlasso object is returned, for which coef,resid, fitted, print, summary, plot methods exist.

`coefficients`	estimated fixed-effects coefficients \hat{β}
`pars`	free parameters in the covariance matrix Ψ of the random effects
`sigma`	standard deviation \hat{σ} of the errors
`random`	vector with random effects, sorted by groups
`u`	vector with the standardized random effects, sorted by effect
`ranef`	vector with random effects, sorted by effect
`fixef`	estimated fixed-effects coeffidients \hat{β}
`fitted.values`	The fitted values \hat{y} = \hat{X} β + Z \hat{b}_i
`residuals`	raw residuals y-\hat{y}
`Psi`	Covariance matrix Ψ of the random effects
`corPsi`	Correlation matrix of the random effects
`logLik`	value of the log-likelihood function
`deviance`	deviance=-2*logLik
`npar`	Number of parameters. Corresponds to the cardinality of the active set of `coefficients` plus the number of free parameters in `Psi`
`aic`	AIC
`bic`	BIC
`data`	data set, as a list with four components: x, y, z, grp (see above)
`weights`	weights for the fixed-effects covariates
`nonpen`	nonpenalized covariates. Differ from the input if weights is explicitely given
`coefInit`	list with three components used as starting values
`lambda`	positive regularization parameter
`converged`	Does the algorithm converge? 0: correct convergence ; an odd number means that maxIter was reached ; an even number means that the Armijo step was not succesful. For each unsuccessfull Armijo step, 2 is added to converged. If converged is large compared to the number of iterations `counter`, you may increase maxArmijo.
`counter`	The number of iterations used.
`stopped`	logical. Does the algorithm stopped due to ntot > p?
`pdMat`	Covariance structure for the random effects
`method`	"ML"
`CovOpt`	optimization routine
`control`	see `lmmlassoControl`
`call`	call
`stopped`	logical. Does the algorithm stopped due to a too large active set?
`ranInd`	Index of the random effects with respect to the x matrix
`objective`	Value of the objective function at the final estimates

Juerg Schelldorfer, schell@stat.math.ethz.ch

J. Schelldorfer, P. B\"uhlmann and S. van de Geer (2011), Estimation for High-Dimensional Linear Mixed-Effects Models Using \ell_1-penalization, arXiv preprint 1002.3784v2

# (1) Use the lmmlasso on the classroomStudy data set
data(classroomStudy)
fit1 <-
lmmlasso(x=classroomStudy$X,y=classroomStudy$y,z=classroomStudy$Z,
grp=classroomStudy$grp,lambda=15,pdMat="pdIdent")
summary(fit1)
plot(fit1)

# (2) Use the lmmlasso on a small simulated data set
set.seed(54)

N <- 20           # number of groups
p <- 6            # number of covariates (including intercept)
q <- 2            # number of random effect covariates
ni <- rep(6,N)    # observations per group
ntot <- sum(ni)   # total number of observations

grp <- factor(rep(1:N,each=ni)) # grouping variable

beta <- c(1,2,4,3,0,0,0) # fixed-effects coefficients
x <- cbind(1,matrix(rnorm(ntot*p),nrow=ntot)) # design matrix

bi1 <- rep(rnorm(N,0,3),each=ni) # Psi=diag(3,2)
bi2 <- rep(rnorm(N,0,2),each=ni)
bi <- rbind(bi1,bi2)
   
z <- x[,1:2,drop=FALSE]

y <- numeric(ntot)
for (k in 1:ntot) y[k] <- x[k,]%*%beta + t(z[k,])%*%bi[,grp[k]] + rnorm(1)

# correct random effects structure
fit2 <- lmmlasso(x=x,y=y,z=z,grp=grp,lambda=10,pdMat="pdDiag")
summary(fit2)
plot(fit2)

# wrong random effects structure
fit3 <- lmmlasso(x=x,y=y,z=x[,1:3],grp=grp,lambda=10,pdMat="pdDiag")
summary(fit3)
plot(fit3)