lmmlasso: Function to fit high-dimensional (gaussian) linear...

Description Usage Arguments Details Value Author(s) References Examples

Description

Fits the solution for a high-dimensional (gaussian) linear mixed-effects models

Usage

1
2
3
4
5
6
7
lmmlasso(x, ...)

## Default S3 method:
lmmlasso(x, y, z = x, grp, weights = NULL,coefInit=NULL, lambda,
 startValue = 1, nonpen = 1:dim(z)[[2]], pdMat = c("pdIdent", "pdDiag","pdSym"),
 method = "ML", CovOpt = c("nlminb","optimize"), stopSat = TRUE,
 standardize = TRUE, control = lmmlassoControl(),ranInd=1:dim(z)[[2]],...)

Arguments

x

matrix of dimension ntot x p including the fixed-effects covariables. An intercept has to be included in the first column as (1,...,1).

y

response variable of length ntot.

z

random effects matrix of dimension ntot x q. It has to be a matrix, even if q=1.

grp

grouping variable of length ntot

weights

weights for the fixed-effects covariates: NA means no penalization, 0 means drop this covariate ; if given, the argument nonpen is ignored. By default each covariate has weight 1

coefInit

list with three components used as starting values for the fixed effects, the random effects variance components and the error standard deviation.

lambda

positive regularization parameter

startValue

Choice of the starting values for the fixed effects using linear regression. 1 means 10-fold cross-validation with L1-penalty, 2 means 10-fold cross-validation Ridge Regression and 0 means that all the covariates are set to zero and the intercept is the mean of the response variable

nonpen

index of fixed effects with no penalization, ignored if the argument weights is specified, default is 1, which means that only the intercept (the first column in X )is not penalized.

pdMat

Covariance structure for the random effects. pdIdent, pdDiag and pdSym are already implemented. Default to pdIdent. pdIdent: b_i \sim \mathcal{N}(0,θ^2 I) (1 parameter), pdDiag: b_i \sim \mathcal{N}(0,diag(θ_1,…,θ_q)) (q parameters), pdSym: b_i \sim \mathcal{N}(0,Ψ) where Ψ is symmetric positive definit (q(q+1)/2 parameters)

method

Only "ML" is allowed. "REML" is not yet implemented.

CovOpt

which optimization routine should be used for updating the variance components parameters. optimize or nlminb. nlminb uses the estimate of the last iteration as a starting value. nlminb is faster if there are many Gauss-Seidel iterations.

stopSat

logical. Should the algorithm stop when ntot > p?

standardize

Should the x matrix be standardized such that each column has mean 0 and standard deviation 1? Be careful if the x matrix includes dummy variables.

control

control parameters for the algorithm and the Armijo Rule, see lmmlassoControl for the details

ranInd

Index of the random effects with respect to the x matrix

...

not used.

Details

All the details of the algorithm can be found in Schelldorfer et. al. (2010).

Value

A lmmlasso object is returned, for which coef,resid, fitted, print, summary, plot methods exist.

coefficients

estimated fixed-effects coefficients \hat{β}

pars

free parameters in the covariance matrix Ψ of the random effects

sigma

standard deviation \hat{σ} of the errors

random

vector with random effects, sorted by groups

u

vector with the standardized random effects, sorted by effect

ranef

vector with random effects, sorted by effect

fixef

estimated fixed-effects coeffidients \hat{β}

fitted.values

The fitted values \hat{y} = \hat{X} β + Z \hat{b}_i

residuals

raw residuals y-\hat{y}

Psi

Covariance matrix Ψ of the random effects

corPsi

Correlation matrix of the random effects

logLik

value of the log-likelihood function

deviance

deviance=-2*logLik

npar

Number of parameters. Corresponds to the cardinality of the active set of coefficients plus the number of free parameters in Psi

aic

AIC

bic

BIC

data

data set, as a list with four components: x, y, z, grp (see above)

weights

weights for the fixed-effects covariates

nonpen

nonpenalized covariates. Differ from the input if weights is explicitely given

coefInit

list with three components used as starting values

lambda

positive regularization parameter

converged

Does the algorithm converge? 0: correct convergence ; an odd number means that maxIter was reached ; an even number means that the Armijo step was not succesful. For each unsuccessfull Armijo step, 2 is added to converged. If converged is large compared to the number of iterations counter, you may increase maxArmijo.

counter

The number of iterations used.

stopped

logical. Does the algorithm stopped due to ntot > p?

pdMat

Covariance structure for the random effects

method

"ML"

CovOpt

optimization routine

control

see lmmlassoControl

call

call

stopped

logical. Does the algorithm stopped due to a too large active set?

ranInd

Index of the random effects with respect to the x matrix

objective

Value of the objective function at the final estimates

Author(s)

Juerg Schelldorfer, schell@stat.math.ethz.ch

References

J. Schelldorfer, P. B\"uhlmann and S. van de Geer (2011), Estimation for High-Dimensional Linear Mixed-Effects Models Using \ell_1-penalization, arXiv preprint 1002.3784v2

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
# (1) Use the lmmlasso on the classroomStudy data set
data(classroomStudy)
fit1 <-
lmmlasso(x=classroomStudy$X,y=classroomStudy$y,z=classroomStudy$Z,
grp=classroomStudy$grp,lambda=15,pdMat="pdIdent")
summary(fit1)
plot(fit1)

# (2) Use the lmmlasso on a small simulated data set
set.seed(54)

N <- 20           # number of groups
p <- 6            # number of covariates (including intercept)
q <- 2            # number of random effect covariates
ni <- rep(6,N)    # observations per group
ntot <- sum(ni)   # total number of observations

grp <- factor(rep(1:N,each=ni)) # grouping variable

beta <- c(1,2,4,3,0,0,0) # fixed-effects coefficients
x <- cbind(1,matrix(rnorm(ntot*p),nrow=ntot)) # design matrix

bi1 <- rep(rnorm(N,0,3),each=ni) # Psi=diag(3,2)
bi2 <- rep(rnorm(N,0,2),each=ni)
bi <- rbind(bi1,bi2)
   
z <- x[,1:2,drop=FALSE]

y <- numeric(ntot)
for (k in 1:ntot) y[k] <- x[k,]%*%beta + t(z[k,])%*%bi[,grp[k]] + rnorm(1)

# correct random effects structure
fit2 <- lmmlasso(x=x,y=y,z=z,grp=grp,lambda=10,pdMat="pdDiag")
summary(fit2)
plot(fit2)

# wrong random effects structure
fit3 <- lmmlasso(x=x,y=y,z=x[,1:3],grp=grp,lambda=10,pdMat="pdDiag")
summary(fit3)
plot(fit3)

lmmlasso documentation built on May 2, 2019, 2:16 p.m.