glamlassoS: Penalization in Large Scale Generalized Linear Array Models
In glamlasso: Penalization in Large Scale Generalized Linear Array Models

Description Usage Arguments Details Value Author(s) References Examples

View source: R/glamlassoS.R

Efficient design matrix free procedure for fitting a special case of a generalized linear model with array structured response and partially tensor structured covariates. See Lund and Hansen, 2019 for an application of this special purpose function.

glamlassoS(X, 
           Y,
           V, 
           Z = NULL,
           family = "gaussian",
           penalty = "lasso",
           intercept = FALSE,
           weights = NULL,
           betainit = NULL,
           alphainit = NULL,
           nlambda = 100,
           lambdaminratio = 1e-04,
           lambda = NULL,
           penaltyfactor = NULL,
           penaltyfactoralpha = NULL,
           reltolinner = 1e-07,
           reltolouter = 1e-04,
           maxiter = 15000,
           steps = 1,
           maxiterinner = 3000,
           maxiterouter = 25,
           btinnermax = 100,
           btoutermax = 100,
           iwls = "exact",
           nu = 1)

`X`	A list containing the tensor components (2 or 3) of the tensor design matrix. These are matrices of sizes n_i \times p_i.
`Y`	The response values, an array of size n_1 \times\cdots\times n_d. For option `family = "binomial"` this array must contain the proportion of successes and the number of trials is then specified as `weights` (see below).
`V`	The weight values, an array of size n_1 \times\cdots\times n_d.
`Z`	The non tensor structrured part of the design matrix. A matrix of size n_1 \cdots n_d\times q. Is set to `NULL` as default.
`family`	A string specifying the model family (essentially the response distribution). Possible values are `"gaussian", "binomial", "poisson", "gamma"`.
`penalty`	A string specifying the penalty. Possible values are `"lasso", "scad"`.
`intercept`	Logical variable indicating if the model includes an intercept. When `intercept = TRUE` the first coulmn in the non-tensor design component `Z` is all 1s. Default is `FALSE`.
`weights`	Observation weights, an array of size n_1 \times \cdots \times n_d. For option `family = "binomial"` this array must contain the number of trials and must be provided.
`betainit`	The initial parameter values. Default is NULL in which case all parameters are initialized at zero.
`alphainit`	A q\times 1 vector containing the initial parameter values for the non-tensor parameter. Default is NULL in which case all parameters are initialized at 0.
`nlambda`	The number of `lambda` values.
`lambdaminratio`	The smallest value for `lambda`, given as a fraction of λ_{max}; the (data derived) smallest value for which all coefficients are zero.
`lambda`	The sequence of penalty parameters for the regularization path.
`penaltyfactor`	An array of size p_1 \times \cdots \times p_d. Is multiplied with each element in `lambda` to allow differential shrinkage on the coefficients.
`penaltyfactoralpha`	A q \times 1 vector multiplied with each element in `lambda` to allow differential shrinkage on the non-tensor coefficients.
`reltolinner`	The convergence tolerance for the inner loop
`reltolouter`	The convergence tolerance for the outer loop.
`maxiter`	The maximum number of inner iterations allowed for each `lambda` value, when summing over all outer iterations for said `lambda`.
`steps`	The number of steps used in the multi-step adaptive lasso algorithm for non-convex penalties. Automatically set to 1 when `penalty = "lasso"`.
`maxiterinner`	The maximum number of inner iterations allowed for each outer iteration.
`maxiterouter`	The maximum number of outer iterations allowed for each lambda.
`btinnermax`	Maximum number of backtracking steps allowed in each inner iteration. Default is `btinnermax = 100`.
`btoutermax`	Maximum number of backtracking steps allowed in each outer iteration. Default is `btoutermax = 100`.
`iwls`	A string indicating whether to use the exact iwls weight matrix or use a kronecker structured approximation to it.
`nu`	A number between 0 and 1 that controls the step size δ in the proximal algorithm (inner loop) by scaling the upper bound \hat{L}_h on the Lipschitz constant L_h (see Lund et al., 2017). For `nu = 1` backtracking never occurs and the proximal step size is always δ = 1 / \hat{L}_h. For `nu = 0` backtracking always occurs and the proximal step size is initially δ = 1. For `0 < nu < 1` the proximal step size is initially δ = 1/(ν\hat{L}_h) and backtracking is only employed if the objective function does not decrease. A `nu` close to 0 gives large step sizes and presumably more backtracking in the inner loop. The default is `nu = 1` and the option is only used if `iwls = "exact"`.

Given the setting from glamlasso we consider a model where the tensor design component is only partially tensor structured as

X = [V_1X_2^\top\otimes X_1^\top,…,V_{n_3}X_2^\top\otimes X_1^\top]^\top.

Here X_i is a n_i\times p_i matrix for i=1,2 and V_i is a n_1n_2\times n_1n_2 diagonal matrix for i=1,…,n_3.

Letting Y denote the n_1\times n_2\times n_3 response array and V the n_1\times n_2\times n_3 weight array containing the diagonals of the V_is, the function glamlassoS solves the PMLE problem using Y, V, X_1, X_2 and the non-tensor component Z as input.

An object with S3 Class "glamlasso".

`spec`	A string indicating the model family and the penalty.
`beta`	A p_1\cdots p_d \times `nlambda` matrix containing the estimates of the parameters for the tensor structured part of the model (`beta`) for each `lambda`-value.
`alpha`	A q \times `nlambda` matrix containing the estimates of the parameters for the non tensor structured part of the model (`alpha`) for each `lambda`-value. If `intercept = TRUE` the first row contains the intercept estimate for each `lambda`-value.
`lambda`	A vector containing the sequence of penalty values used in the estimation procedure.
`df`	The number of nonzero coefficients for each value of `lambda`.
`dimcoef`	A vector giving the dimension of the model coefficient array β.
`dimobs`	A vector giving the dimension of the observation (response) array `Y`.
`Iter`	A list with 4 items: `bt_iter_inner` is total number of backtracking steps performed in the inner loop, `bt_enter_inner` is the number of times the backtracking is initiated in the inner loop, `bt_iter_outer` is total number of backtracking steps performed in the outer loop, and `iter_mat` is a `nlambda` \times `maxiterouter` matrix containing the number of inner iterations for each `lambda` value and each outer iteration and `iter` is total number of iterations i.e. `sum(Iter)`.

Adam Lund

Maintainer: Adam Lund, adam.lund@math.ku.dk

Lund, A., M. Vincent, and N. R. Hansen (2017). Penalized estimation in large-scale generalized linear array models. Journal of Computational and Graphical Statistics, 26, 3, 709-724. url = https://doi.org/10.1080/10618600.2017.1279548.

Lund, A. and N. R. Hansen (2019). Sparse Network Estimation for Dynamical Spatio-temporal Array Models. Journal of Multivariate Analysis, 174. url = https://doi.org/10.1016/j.jmva.2019.104532.

##size of example
n1 <- 65; n2 <- 26; n3 <- 13; p1 <- 13; p2 <- 5; 

##marginal design matrices (tensor components)
X1 <- matrix(rnorm(n1 * p1), n1, p1)
X2 <- matrix(rnorm(n2 * p2), n2, p2)
X <- list(X1, X2)
V <- array(rnorm(n3 * n2 * n1), c(n1, n2, n3))

##gaussian example
Beta <- array(rnorm(p1 * p2) * rbinom(p1 * p2, 1, 0.1), c(p1 , p2))
Mu <- V * array(RH(X2, RH(X1, Beta)), c(n1, n2, n3))
Y <- array(rnorm(n1 * n2 * n3, Mu), c(n1, n2, n3))
system.time(fit <- glamlassoS(X, Y, V))

modelno <- length(fit$lambda)
plot(c(Beta), ylim = range(Beta, fit$coef[, modelno]), type = "h")
points(c(Beta))
lines(c(fit$coef[, modelno]), col = "red", type = "h")

###with non tensor design component Z
q <- 5
alpha <- matrix(rnorm(q)) * rbinom(q, 1, 0.5)
Z <- matrix(rnorm(n1 * n2 * n3 * q), n1 * n2 *n3, q) 
Y <- array(rnorm(n1 * n2 * n3, Mu + array(Z %*% alpha, c(n1, n2, n3))), c(n1, n2, n3))
system.time(fit <- glamlassoS(X, Y, V , Z))

modelno <- length(fit$lambda)
oldmfrow <- par()$mfrow
par(mfrow = c(1, 2))
plot(c(Beta), type="h", ylim = range(Beta, fit$coef[, modelno]))
points(c(Beta))
lines(fit$coef[ , modelno], col = "red", type = "h")
plot(c(alpha), type = "h", ylim = range(alpha, fit$alpha[, modelno]))
points(c(alpha))
lines(fit$alpha[ , modelno], col = "red", type = "h")
par(mfrow = oldmfrow)

################ poisson example
Beta <- matrix(rnorm(p1 * p2, 0, 0.1) * rbinom(p1 * p2, 1, 0.1), p1 , p2)
Mu <- V * array(RH(X2, RH(X1, Beta)), c(n1, n2, n3))
Y <- array(rpois(n1 * n2 * n3, exp(Mu)), dim = c(n1, n2, n3))
system.time(fit <- glamlassoS(X, Y, V, family = "poisson", nu = 0.1))

modelno <- length(fit$lambda)
plot(c(Beta), type = "h", ylim = range(Beta, fit$coef[, modelno]))
points(c(Beta))
lines(fit$coef[ , modelno], col = "red", type = "h")