lqa: Fitting penalized Generalized Linear Models with the LQA...
In lqa: Penalized Likelihood Inference for GLMs


‘lqa’ is used to fit penalized generalized linear models, specified by giving a symbolic description of the linear predictor and descriptions of the error distribution and the penalty.

lqa (x, ...)

lqa.update2 (x, y, family = NULL, penalty = NULL, intercept = TRUE, 
             weights = rep (1, nobs), control = lqa.control (), 
             initial.beta, mustart, eta.new, gamma1 = 1, ...)

## S3 method for class 'formula'
lqa(formula, data = list (), weights = rep (1, nobs), subset, 
            na.action, start = NULL, etastart, mustart, offset, ...)

## Default S3 method:
lqa(x, y, family = gaussian (), penalty = NULL, method = "lqa.update2", 
            weights = rep (1, nobs), start = NULL, 
            etastart = NULL, mustart = NULL, offset = rep (0, nobs), 
            control = lqa.control (), intercept = TRUE, 
            standardize = TRUE, ...)

`formula`	a symbolic description of the model to be fit. The details of model specification are given below.
`data`	an optional data frame containing the variables in the model. If not found in ‘data’, the variables are taken from ‘environment(formula)’, typically the environment from which ‘lqa’ is called.
`family`	a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See `family()` for details of family functions.)
`penalty`	a description of the penalty to be used in the fitting procedure. This must be a penalty object. See `penalty` for details on penalty functions.
`method`	a character string naming the function used to estimate the model. The default value `method = "lqa.update2"` applies the LQA algorithm.
`intercept`	a logical value indicating whether the model should include an intercept (recommended). The default value is TRUE.
`standardize`	a logical value indicating whether the regressors should be standardized (recommended). The default value is TRUE.
`weights`	an optional vector of weights to be used in the fitting process.
`start`	starting values for the parameters in the linear predictor.
`etastart`	starting values for the linear predictor.
`mustart`	starting values for the vector of means (response).
`gamma1`	additional step length parameter used in `lqa.update2` to enforce convergence if necessary.
`offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting.
`control`	a list of parameters for controlling the fitting process. See the documentation of `lqa.control` for details.
`na.action`	a function which indicates what should happen when the data contain ‘NA’s.
`subset`	an optional vector specifying a subset of observations to be used in the fitting process.
`x, y`	Used in ‘lqa.default’: ‘x’ is a design matrix of dimension ‘n * p’ (with an additional column of ones if an intercept should be included in the model), and ‘y’ is a vector of observations of length ‘n’.
`initial.beta`	optional initial values of beta used in the fitting procedures.
`eta.new`	optional initial values of the linear predictors used in the fitting procedures.
`...`	further arguments passed to or from other methods.

A typical formula has the form ‘response ~ terms’, where ‘response’ is the (numeric) response vector and ‘terms’ is a series of terms that specifies a linear predictor for ‘response’. Usage is similar to that of the glm() function. As there, the right-hand side of the model formula specifies the form of the linear predictor and hence gives the link function of the mean of the response, rather than the mean of the response directly. By default an intercept is included in the model. To remove it, use a formula of the form ‘response ~ 0 + terms’ or ‘response ~ terms - 1’.
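The effect of the two intercept-free forms can be checked with base R's `model.matrix`, which is not part of lqa. The following is a minimal sketch; the data frame `d` and its columns are made up for illustration only.

```r
# Hypothetical toy data, only for illustration.
d <- data.frame (y = c (1.2, 0.7, 3.1, 2.4),
                 x1 = c (0.5, -1.0, 1.5, 0.3),
                 x2 = c (2.0, 0.1, -0.4, 1.1))

# Default: an intercept column is added to the design matrix.
mm.int <- model.matrix (y ~ x1 + x2, data = d)
colnames (mm.int)

# Both forms below drop the intercept column.
mm.zero <- model.matrix (y ~ 0 + x1 + x2, data = d)
mm.minus <- model.matrix (y ~ x1 + x2 - 1, data = d)
colnames (mm.zero)

# The two intercept-free design matrices have the same entries.
all (mm.zero == mm.minus)
```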

lqa also takes a family argument, which specifies the distribution from the exponential family to use and the link function that goes with it. By default, each family uses its canonical link.
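As a small illustration of what a family object carries, the sketch below uses only the family functions from base R's stats package. The variable names are arbitrary, and nothing here calls lqa itself.

```r
# A family object bundles the distribution with its link function.
fam <- binomial ()                 # no link given, so the canonical logit is used
fam$family
fam$link

# A non-canonical link has to be requested explicitly.
fam.probit <- binomial (link = "probit")
fam.probit$link

# linkinv maps the linear predictor to the mean, linkfun maps back.
eta <- c (-2, 0, 2)
mu <- fam$linkinv (eta)
all.equal (fam$linkfun (mu), eta)
```

A family built this way can be passed through the `family` argument shown in the usage above.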

lqa returns an object of class lqa which inherits from the classes glm and lm.

The generic accessor functions coefficients, fitted.values and residuals can be used to extract various useful features of the object returned by lqa.

Note that it is highly recommended to include an intercept in the model (i.e. use intercept = TRUE). If you use intercept = FALSE in the classical linear model, make sure that your y argument is already centered. Otherwise the model would not be valid.
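Why centering matters can be seen with an ordinary least-squares fit. This is a hedged sketch using base R's `lm` rather than lqa, with simulated data, and it assumes the regressors are centered as well as the response.

```r
set.seed (1)
n <- 100
x <- cbind (x1 = rnorm (n, mean = 2), x2 = rnorm (n, mean = -1))
y <- 5 + drop (x %*% c (1, -2)) + rnorm (n)

# Reference fit with an intercept; keep only the slopes.
b.int <- coef (lm (y ~ x))[-1]

# No intercept, uncentered data: the slopes have to absorb the constant.
b.raw <- coef (lm (y ~ 0 + x))

# No intercept, centered response and regressors: slopes match the reference.
xc <- scale (x, center = TRUE, scale = FALSE)
yc <- y - mean (y)
b.cen <- coef (lm (yc ~ 0 + xc))

rbind (b.int, b.raw, b.cen)
```

The first and third rows agree, while the second is distorted by the omitted constant.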

An object of class lqa is a list containing at least the following components:

`coefficients`	a named vector of unstandardized coefficients.
`residuals`	the residuals based on the estimated coefficients.
`fitted.values`	the fitted mean values, obtained by transforming the linear predictors by the inverse of the link function.
`family`	the `family` object used.
`penalty`	the `penalty` object used, indicating which penalty has been used.
`linear.predictors`	the linear fit on link scale.
`deviance`	up to a constant, minus twice the maximized (unpenalized) log-likelihood.
`aic`	Akaike's Information Criterion, minus twice the maximized log-likelihood plus twice the trace of the hat matrix (so assuming that the dispersion is known).
`bic`	Bayesian Information Criterion, minus twice the maximized log-likelihood plus log (nobs) times the trace of the hat matrix (so assuming that the dispersion is known).
`null.deviance`	deviance of the null model (that only includes a constant)
`n.iter`	the number of iterations until convergence.
`best.iter`	the number of iterations until AIC reaches its minimum.
`weights`	diagonal elements of the weight matrix in GLMs.
`prior.weights`	the weights as optionally given as argument.
`df.residual`	the residual degrees of freedom.
`df.null`	the residual degrees of freedom for the null model.
`converged`	a logical variable. TRUE if the algorithm indeed converged.
`mean.x`	The vector of means of the regressors.
`norm.x`	The vector of Euclidean norms of the regressors.
`Amat`	The quadratically approximated penalty matrix corresponding to the penalty used.
`method`	The argument indicating the fitting method.
`rank`	The trace of the hat matrix.
`y`	the original response vector used to fit the model.
`x`	the original regressor matrix (including an intercept if given) used to fit the model.
`fit.obj`	the fitted object as returned from the fitting method (e.g. from `lqa.update2`).
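The `aic`, `bic` and `rank` components are all described in terms of the trace of the hat matrix. The sketch below shows one way such quantities can be computed for a Gaussian model with a ridge-type penalty matrix. It is a generic illustration in base R under stated assumptions (dispersion fixed at 1, penalty matrix `A = lambda * I`, hypothetical value of `lambda`) and is not taken from the lqa source code.

```r
set.seed (42)
n <- 50
p <- 4
X <- scale (matrix (rnorm (n * p), ncol = p))
y <- drop (X %*% c (1, 0, -1, 0.5)) + rnorm (n)
y <- y - mean (y)                      # centered, so no intercept is needed

lambda <- 2                            # hypothetical tuning parameter
A <- lambda * diag (p)                 # ridge penalty matrix (assumption)

XtX.A.inv <- solve (crossprod (X) + A)
beta.hat <- drop (XtX.A.inv %*% crossprod (X, y))
H <- X %*% XtX.A.inv %*% t (X)         # hat matrix of the penalized fit
df.eff <- sum (diag (H))               # trace = effective degrees of freedom

# Gaussian log-likelihood with the dispersion treated as known (sigma = 1).
loglik <- sum (dnorm (y, mean = drop (X %*% beta.hat), sd = 1, log = TRUE))
aic <- -2 * loglik + 2 * df.eff
bic <- -2 * loglik + log (n) * df.eff
c (df = df.eff, aic = aic, bic = bic)
```

With a positive `lambda` the trace is smaller than the number of regressors, which is why it serves as an effective degrees-of-freedom measure for penalized fits.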

Jan Ulbricht

cv.lqa, penalty

set.seed (1111)

n <- 200
p <- 5
X <- matrix (rnorm (n * p), ncol = p)
X[,2] <- X[,1] + rnorm (n, sd = 0.1)
X[,3] <- X[,1] + rnorm (n, sd = 0.1)
true.beta <- c (1, 2, 0, 0, -1)
y <- drop (X %*% true.beta) + rnorm (n)

obj1 <- lqa (y ~ X, family = gaussian (), penalty = lasso (1.5), 
             control = lqa.control ())
obj1$coef


set.seed (4321)

n <- 25
p <- 5
X <- matrix (rnorm (n * p), ncol = p)
X[,2] <- X[,1] + rnorm (n, sd = 0.1)
X[,3] <- X[,1] + rnorm (n, sd = 0.1)
true.beta <- c (1, 2, 0, 0, -1)

family1 <- binomial ()
eta.true <- drop (X %*% true.beta)
mu.true <- family1$linkinv (eta.true)
# Draw one Bernoulli response per observation, using its own success probability.
y2 <- rbinom (n, 1, mu.true)

obj2 <- lqa (y2 ~ X, family = binomial (), 
             penalty = fused.lasso (c (0.0001, 0.2)))
obj2$coef