fit: Fit a linear multiple output model using sparse group lasso


View source: R/fit.R

Description

Fit a linear multiple output model with p features (covariates) divided into m groups using sparse group lasso.

Usage

fit(x, y, intercept = TRUE, weights = NULL, grouping = NULL,
  groupWeights = NULL, parameterWeights = NULL, alpha = 1, lambda,
  d = 100, algorithm.config = lsgl.standard.config)

Arguments

x

design matrix, matrix of size N \times p.

y

response matrix, matrix of size N \times K.

intercept

should the model include intercept parameters.

weights

sample weights, a matrix of size N \times K.

grouping

grouping of the features, a factor or vector of length p; each element specifies the group of the corresponding feature.

groupWeights

the group weights, a vector of length m (the number of groups).

parameterWeights

a matrix of size K \times p.

alpha

the α value: 0 gives the group lasso penalty, 1 gives the lasso penalty, and values between 0 and 1 give a sparse group lasso penalty.

lambda

lambda.min relative to lambda.max, or the lambda sequence for the regularization path.

d

length of the lambda sequence (ignored if length(lambda) > 1).

algorithm.config

the algorithm configuration to be used.
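To make the shapes of the grouping-related arguments concrete, here is a small sketch; the group sizes and the square-root weighting used below are illustrative assumptions, not package defaults:

```r
# Sketch: p = 6 features divided into m = 3 groups of 2
p <- 6
grouping <- factor(c(1, 1, 2, 2, 3, 3))  # group of each feature, length p
m <- nlevels(grouping)                   # number of groups

# One possible (assumed) choice: weight each group by the square
# root of its size -- one weight per group, so length m
groupWeights <- sqrt(as.numeric(table(grouping)))
```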

Details

This function computes a sequence of minimizers (one for each lambda given in the lambda argument) of

\frac{1}{N}\|Y - Xβ\|_F^2 + λ \left( (1-α) ∑_{J=1}^m γ_J \|β^{(J)}\|_2 + α ∑_{i=1}^{n} ξ_i |β_i| \right)

where \|\cdot\|_F is the Frobenius norm. The vector β^{(J)} denotes the parameters associated with the J'th group of features. The group weights are denoted by γ \in [0,∞)^m and the parameter weights by ξ \in [0,∞)^n.
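The objective above can be written out directly in R. The function below is a minimal sketch for illustration (it mirrors the names of the fit() arguments but is not part of lsgl); beta is a K \times p matrix as in the Value section:

```r
# Sketch: evaluate the sparse group lasso objective for a given beta
sgl_objective <- function(X, Y, beta, grouping, groupWeights,
                          parameterWeights, lambda, alpha) {
  N <- nrow(X)
  g <- as.factor(grouping)
  loss <- sum((Y - X %*% t(beta))^2) / N            # (1/N) ||Y - X beta||_F^2
  group_pen <- sum(groupWeights * vapply(levels(g), function(J)
    sqrt(sum(beta[, g == J, drop = FALSE]^2)), numeric(1)))
  lasso_pen <- sum(parameterWeights * abs(beta))    # sum_i xi_i |beta_i|
  loss + lambda * ((1 - alpha) * group_pen + alpha * lasso_pen)
}
```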

Value

beta

the fitted parameters: a list \hatβ(λ(1)), …, \hatβ(λ(d)) of length d, one entry per lambda value. Each entry is a matrix of size K \times p (of size K \times (p+1) if intercept = TRUE) holding the fitted parameters.

loss

the values of the loss function.

objective

the values of the objective function (i.e. loss + penalty).

lambda

the lambda values used.
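When a single relative value is supplied for lambda, the path is computed over a sequence of d values running from lambda.max down to lambda.min. A log-spaced sequence is the usual construction in this family of packages; the sketch below illustrates that assumption and is not taken from the lsgl source:

```r
# Assumed construction: d log-spaced lambda values from lambda.max
# down to lambda.min = rel * lambda.max
lambda_sequence <- function(lambda.max, rel, d) {
  exp(seq(log(lambda.max), log(rel * lambda.max), length.out = d))
}

lam <- lambda_sequence(lambda.max = 1, rel = 0.1, d = 100)
# lam starts at lambda.max, ends at 0.1 * lambda.max, and is decreasing
```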

Author(s)

Martin Vincent

Examples

set.seed(100) # This may be removed, ensures consistency of tests

# Simulate from Y = XB + E,
# the dimension of Y is N x K, X is N x p, B is p x K

N <- 50 # number of samples
p <- 50 # number of features
K <- 25 # number of outputs (responses)

B <- matrix(
	sample(c(rep(1, p*K*0.1), rep(0, p*K - as.integer(p*K*0.1)))),
	nrow = p, ncol = K
)

X <- matrix(rnorm(N*p,1,1), nrow=N, ncol=p)
Y <- X %*% B + matrix(rnorm(N*K,0,1), N, K)

fit <- lsgl::fit(X, Y, alpha = 1, lambda = 0.1, intercept = FALSE)

## ||B - \beta||_F^2 along the path
sapply(fit$beta, function(beta) sum((B - beta)^2))

## Plot
par(mfrow = c(3,1))
image(B, main = "True B")
image(
	x = as.matrix(fit$beta[[100]]),
	main = paste("Lasso estimate (lambda =", round(fit$lambda[100], 2), ")")
)
image(solve(t(X) %*% X) %*% t(X) %*% Y, main = "Least squares estimate")

# The training error of the models
Err(fit, X, loss="OVE")
# This is simply the loss function
sqrt(N*fit$loss)

lsgl documentation built on May 29, 2017, 11:43 a.m.