Description
Solves the proximal operator of the latent group Lasso appearing in Yan & Bien (2017):
min_β 0.5 * || y - β ||_2^2 + lam * Ω(β; w)
where Ω(β; w) = min_{sum_l v^(l) = β; supp(v^(l)) \subset g_l} sum_l w_l * || v^(l) ||_2 is the latent group lasso penalty defined in Jacob et al. (2009). Here β is a length-p parameter vector whose elements are embedded in a directed acyclic graph (DAG). The desired sparsity pattern is a subgraph of the DAG such that if the parameter β_i embedded in node i is set to zero, all parameters embedded in the descendant nodes of i are zeroed out as well. The problem is solved by breaking the DAG into path graphs, for which the proximal operator corresponding to each path graph has a closed-form solution, and performing block coordinate descent (BCD) across the path graphs. See Section 4.3 of the paper for more details and explanations.
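For intuition, in the special case where the DAG has no edges the groups g_l do not overlap and the proximal operator reduces to group-wise soft-thresholding (this is what the second example below verifies). A minimal sketch, written in Python for illustration only; the function name is hypothetical and this is not part of the package:

```python
import numpy as np

def prox_disjoint_groups(y, lam, groups, w):
    """Prox of 0.5*||y - b||_2^2 + lam * sum_l w[l]*||b[groups[l]]||_2.

    Special case of the latent group lasso penalty: when the DAG has no
    edges the groups are disjoint, so each block of y is soft-thresholded
    independently.
    """
    beta = np.zeros_like(y, dtype=float)
    for g, wl in zip(groups, w):
        nrm = np.linalg.norm(y[g])
        if nrm > lam * wl:  # otherwise the whole block stays at zero
            beta[g] = y[g] * (1.0 - lam * wl / nrm)
    return beta
```

With overlapping groups induced by a general DAG no such per-block formula exists, which is what motivates the path-graph decomposition described below.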
Usage

hsm(y, lam, w, map, var, assign, w.assign,
    get.penalval, tol, maxiter, beta.ma)

Arguments
y: Length-p input vector to which the proximal operator is applied.

lam: Non-negative tuning parameter that controls the sparsity level.

w: Vector of positive group weights, with w_l the weight for the group g_l induced by node l.

map: Matrix with two columns encoding the edges of the DAG: row c(i, j) indicates a directed edge from node i to node j, and row c(i, NA) indicates that node i has no children (see Examples).

var: List with one element per node; element l contains the indices of the parameters embedded in node l.

assign: Matrix encoding the decomposition of the DAG into path graphs (see Details). Computed within the function if not supplied; the value returned by a previous call with the same DAG structure can be passed in to avoid recomputation.

w.assign: List of weights corresponding to the path graphs in assign. Computed within the function if not supplied.

get.penalval: If TRUE, the penalty value lam * Ω(β; w) is computed and returned; otherwise penalval is NULL.

tol: Tolerance level used in BCD. Convergence is assumed when no parameter of interest in any path graph changes by more than tol.

maxiter: Upper bound on the number of iterations of BCD to perform.

beta.ma: Matrix of parameter values over the path graphs maintained by BCD. Computed within the function if not supplied.
Details

See Section 2.2 of the paper for the problem setup and group structure specifications. See Figure 7 in Section 4.3 for an example of decomposing a DAG into path graphs. See Algorithm 4 in the paper for details of the path-based BCD.
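The path-based BCD can be sketched schematically as below. This is an illustrative Python outline, not the package's implementation: the closed-form per-path-graph prox (which the paper derives) is abstracted as a caller-supplied function, and all names are hypothetical.

```python
import numpy as np

def bcd_over_paths(y, paths, path_prox, tol=1e-8, maxiter=1000):
    """Schematic BCD across path graphs (illustrative only).

    `paths` is a list of index arrays, one per path graph from the DAG
    decomposition; `path_prox(r, l)` should return the prox of path graph
    l's penalty applied to the partial residual r on that path's indices.
    """
    v = [np.zeros(len(p)) for p in paths]     # latent vectors v^(l)
    beta = np.zeros_like(y, dtype=float)      # running sum of the v^(l)
    for it in range(maxiter):
        max_change = 0.0
        for l, p in enumerate(paths):
            # partial residual: y minus the other blocks' contributions
            resid = y[p] - (beta[p] - v[l])
            v_new = path_prox(resid, l)
            max_change = max(max_change, np.max(np.abs(v_new - v[l])))
            beta[p] += v_new - v[l]
            v[l] = v_new
        if max_change <= tol:                 # no block moved more than tol
            break
    return beta, it + 1
```

The convergence test mirrors the `tol`/`maxiter` arguments above: the sweep stops once no path graph's latent vector changes by more than `tol`.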
Value

Returns an estimate of the solution to the proximal operator of the latent group Lasso. The returned value is an exact solution if the DAG is a directed path graph.
beta: A length-p vector estimate of the solution.

ite: Number of cycles of BCD performed.

penalval: Value of the penalty lam * Ω(β; w) if get.penalval is TRUE; NULL otherwise.

assign: Value of assign used in the computation (see Arguments).

w.assign: Value of w.assign used in the computation (see Arguments).

beta.ma: Value of beta.ma used in the computation (see Arguments).
References

Yan, X. and Bien, J. (2017). Hierarchical Sparse Modeling: A Choice of Two Group Lasso Formulations. Statist. Sci. 32, no. 4, 531-560. doi:10.1214/17-STS622.
Jacob, L., Obozinski, G. and Vert, J. (2009). Group Lasso with Overlap and Graph Lasso. In Proceedings of the 26th Annual International Conference on Machine Learning. ICML'09 433-440. ACM, New York.
Examples

# The following example appears in Figure 7 of Yan & Bien (2017).
# Generate map defining DAG.
map <- matrix(0, ncol=2, nrow=8)
map[1, ] <- c(1, 2)
map[2, ] <- c(2, 7)
map[3, ] <- c(3, 4)
map[4, ] <- c(4, 6)
map[5, ] <- c(6, 7)
map[6, ] <- c(6, 8)
map[7, ] <- c(3, 5)
map[8, ] <- c(5, 6)
# Assume one parameter per node.
# Let parameter and node share the same index.
var <- as.list(1:8)
set.seed(100)
y <- rnorm(8)
result <- hsm(y=y, lam=0.5, map=map, var=var, get.penalval=TRUE)
# Another example in which the DAG contains two disconnected nodes
map <- matrix(0, ncol=2, nrow=2)
map[1, ] <- c(1, NA)
map[2, ] <- c(2, NA)
# Assume ten parameters per node.
var <- list(1:10, 11:20)
set.seed(100)
y <- rnorm(20)
lam <- 0.5
result <- hsm(y=y, lam=lam, map=map, var=var, get.penalval=TRUE)
# The solution is equivalent to performing group-wise soft-thresholding
beta.st <- c(y[1:10] * max(0, 1 - lam * sqrt(10) / norm(y[1:10], "2")),
y[11:20] * max(0, 1 - lam * sqrt(10) / norm(y[11:20], "2")))
all.equal(result$beta, beta.st)