MLGL: Multi-Layer Group-Lasso

View source: R/MLGL.R

MLGL	R Documentation

Multi-Layer Group-Lasso

Description

Run hierarchical clustering followed by a group-lasso on all the different partitions.

Usage

MLGL(X, ...)

## Default S3 method:
MLGL(
  X,
  y,
  hc = NULL,
  lambda = NULL,
  weightLevel = NULL,
  weightSizeGroup = NULL,
  intercept = TRUE,
  loss = c("ls", "logit"),
  sizeMaxGroup = NULL,
  verbose = FALSE,
  ...
)

## S3 method for class 'formula'
MLGL(
  formula,
  data,
  hc = NULL,
  lambda = NULL,
  weightLevel = NULL,
  weightSizeGroup = NULL,
  intercept = TRUE,
  loss = c("ls", "logit"),
  verbose = FALSE,
  ...
)

Arguments

`X`	matrix of size n*p
`...`	Other parameters passed to the `gglasso` function
`y`	vector of size n. If loss = "logit", elements of y must be in {-1, 1}
`hc`	output of `hclust` function. If not provided, `hclust` is run with `ward.D2` method. User can also provide the desired method: "single", "complete", "average", "mcquitty", "ward.D", "ward.D2", "centroid", "median".
`lambda`	lambda values for group lasso. If not provided, the function generates its own values of lambda
`weightLevel`	a vector of size p for each level of the hierarchy. A zero indicates that the level will be ignored. If not provided, use 1/(height between 2 successive levels). Only if `hc` is provided
`weightSizeGroup`	a vector of size 2*p-1 containing the weight for each group. Default is the square root of the size of each group. Only if `hc` is provided
`intercept`	should an intercept be included in the model?
`loss`	a character string specifying the loss function to use, valid options are: "ls" least squares loss (regression) and "logit" logistic loss (classification)
`sizeMaxGroup`	maximum size of selected groups. If NULL, no restriction
`verbose`	print some information
`formula`	an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted.
`data`	an optional data.frame, list or environment (or object coercible by as.data.frame to a data.frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula)
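
The size 2*p-1 expected for `weightSizeGroup` can be understood by counting the groups in a hierarchy over p variables: p single-variable groups plus one new group per merge (p-1 merges). The following sketch uses base R only and made-up data to count these groups and compute square-root-of-size weights. It assumes the variables (columns) are clustered via `dist(t(X))`, and the ordering of groups shown here is illustrative only, not necessarily the ordering MLGL expects.

```r
# Illustration with base R only; the data and dimensions are made up.
set.seed(1)
n <- 30
p <- 6
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Cluster the p columns (variables), hence the transpose (assumption).
hc <- hclust(dist(t(X)), method = "ward.D2")

# hc$merge has p - 1 rows. Negative entries are single variables,
# positive entries refer to the group created at an earlier row.
mergeSize <- integer(p - 1)
for (i in seq_len(p - 1)) {
  size <- 0L
  for (child in hc$merge[i, ]) {
    size <- size + if (child < 0) 1L else mergeSize[child]
  }
  mergeSize[i] <- size
}

# p singletons followed by the p - 1 merged groups.
groupSize <- c(rep(1L, p), mergeSize)
length(groupSize)   # equals 2 * p - 1
sqrt(groupSize)     # one square-root-of-size weight per group
```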

Value

a MLGL object containing:

lambda: lambda values
b0: intercept value for each value of lambda
beta: A list containing the estimated coefficients for each value of lambda
var: A list containing the indices of the selected variables for each value of lambda
group: A list containing the indices of the selected groups for each value of lambda
nVar: A vector containing the number of non-zero coefficients for each value of lambda
nGroup: A vector containing the number of non-zero groups for each value of lambda
structure: A list containing the following vectors. var: all variables used. group: associated groups. weight: weight associated with the different groups. level: for each group, the corresponding level of the hierarchy where it appears and disappears. For example, 3 indicates the level with a partition of 3 groups.
time: computation time
dim: dimension of X
hc: Output of hierarchical clustering
call: Code executed by user
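
As a rough sketch of how the returned components fit together, the code below fits the model on the simulated data from the Examples section and reads the elements listed above at one position of the lambda path. The choice of the middle index `k` is arbitrary and only for illustration.

```r
library(MLGL)

set.seed(42)
X <- simuBlockGaussian(50, 12, 5, 0.7)
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
res <- MLGL(X, y)

# Pick an arbitrary position in the middle of the lambda path.
k <- ceiling(length(res$lambda) / 2)

res$lambda[k]    # regularization value at position k
res$b0[k]        # intercept at this lambda
res$nVar[k]      # number of non-zero coefficients
res$nGroup[k]    # number of non-zero groups
res$var[[k]]     # indices of the selected variables
res$group[[k]]   # indices of the selected groups
res$beta[[k]]    # estimated coefficients
```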

Author(s)

Quentin Grimonprez

Examples

set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply MLGL method
res <- MLGL(X, y)
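
The arguments `hc` and `loss` can be varied in the same way. The sketch below is not part of the original examples: it passes a precomputed `hclust` object and then fits a logistic model on labels coded as -1/1. Clustering the columns with `dist(t(X))` and thresholding `y` at zero to build the labels are assumptions made for illustration.

```r
library(MLGL)

set.seed(42)
X <- simuBlockGaussian(50, 12, 5, 0.7)
y <- drop(X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5))

# Precompute a hierarchy on the variables with average linkage
# (clustering the columns via dist(t(X)) is an assumption here).
hcAvg <- hclust(dist(t(X)), method = "average")
resAvg <- MLGL(X, y, hc = hcAvg)

# Logistic loss: the response must be coded as -1 / 1.
# The labels are obtained by thresholding y at 0 (illustrative choice).
yBin <- ifelse(y > 0, 1, -1)
resLogit <- MLGL(X, yBin, loss = "logit")
```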

MLGL documentation built on March 31, 2023, 9:32 p.m.