knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = FALSE )
This vignette serves the purpose to reproduce the results shown in the section "Real data example" of the manuscript describing collaborative graphical lasso. The computations carried out by the following code are heavy and could take a considerable amount of time. On our machine (RAM 48 Gb, CPU Intel(R) Xeon(R) W-2235 CPU @ 3.80GHz) it took almost 3 hours. Hence, we refer the user who wants to get started with the package to the introductory vignette: vignette("coglasso")
. Collaborative graphical lasso is an algorithm to reconstruct multi-omics networks based on graphical lasso (Friedman, Hastie and Tibshirani, 2008) and collaborative regression (Gross and Tibshirani, 2015). The coglasso package implements and R interface to this algorithm. We will use it to reproduce the results obtained in the original manuscript.
Let us first attach coglasso.
library(coglasso)
We will use the multi-omics data set provided by the package, in its largest version: multi_omics_sd
. Let's inspect its description in the manual.
help(multi_omics_sd)
We notice that this data set has 162 transcripts and 76 metabolites. For further information on the data set sources and generation, we refer the user to the script multi_omics_sd.R on the GitHub repository of the package.
For the multi-omics network reconstruction carried out in the manuscript we explored 30 automatically generated $λ_w$ and $λ_b$ values and 9 given $c$ values.
nlambda_w <- 30 nlambda_b <- 30 cs <- c(0.01, 0.05, 0.1, 0.2, 1, 5, 10, 20, 100)
We can now build the networks, one per combination of the chosen hyperparameters, with coglasso()
.
cg <- coglasso(multi_omics_sd, pX = 162, nlambda_w = nlambda_w, nlambda_b = nlambda_b, c = cs )
We proceed now to select the most stable, yet sparse network, using an adaptation to coglasso of StARS (Liu, Roeder and Wasserman, 2010). The function implementing this adapted version is stars_coglasso()
.
set.seed(42) sel_cg <- stars_coglasso(cg)
We display now the selected network using the package igraph
, reproducing the one displayed in Figure 3 of the manuscript. The orientation of the network may vary.
# To create the igraph object from the selected adjacency matrix: sel_graph <- igraph::graph.adjacency(sel_cg$sel_adj, mode = "undirected") # Setting some graphical parameters igraph::V(sel_graph)$label <- NA igraph::V(sel_graph)$color <- c(rep("#00ccff", 162), rep("#ff9999", 76)) igraph::V(sel_graph)$frame.color <- c(rep("#002060", 162), rep("#800000", 76)) igraph::V(sel_graph)$frame.width <- 2 igraph::V(sel_graph)$size <- 4 igraph::E(sel_graph)$width <- 0.3 # Setting a force-based network layout and removing disconnected nodes from the graph lo <- igraph::layout_with_fr(sel_graph) diconnected <- which(igraph::degree(sel_graph) == 0) sel_graph2 <- igraph::delete.vertices(sel_graph, diconnected) lo2 <- lo[-diconnected, ] # Plotting plot(sel_graph2, layout = lo2)
We now plot the subnetwork of the gene Cirbp and of its neighborhood. This reproduces the network displayed in Figure 4A.
igraph::V(sel_graph)$label<-colnames(sel_cg$data) cirbp_index <- which(colnames(sel_cg$data) == "Cirbp") subnetwork <- igraph::subgraph( sel_graph, c(cirbp_index, igraph::neighbors(sel_graph, cirbp_index)) ) plot(subnetwork)
A useful way to extract information from a network is community discovery. This technique allows to simplify a network by breaking it in functional units where nodes share several connections, enough to be recognized as separate communities. In the manuscript we used the greedy algorithm described in Clauset, Newman and Moore, 2004, part of the igraph
toolkit. We focused and inspected the second largest community, shown in Figure 4B. Here is the code to reproduce it.
communities <- igraph::cluster_fast_greedy(sel_graph) community2 <- communities[[2]] community2_graph<-igraph::subgraph(sel_graph, community2) # Focusing on nodes of interest fosjun_erg_AA <- c("Fos", "Fosb", "Junb", "Fosl2", "Egr1", "Egr2", "Egr3", "Ala", "Arg", "Asn","His","Ile","Leu","Lys","Met","Orn","Phe","Ser","Thr","Tyr","Val") igraph::V(community2_igraph)[!(label %in% fosjun_erg_AA)]$color<-c(rep("#3bd8ff", 29), rep("#ffadad", 5)) igraph::V(community2_igraph)[!(label %in% fosjun_erg_AA)]$frame.color<-NA igraph::V(community2_igraph)[!(label %in% fosjun_erg_AA)]$label<-NA lo_community2 <- igraph::layout_with_fr(community2_graph) plot(community2_graph, layout=lo_community2)
Friedman, J., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3), 432--441. https://doi.org/10.1093/biostatistics/kxm045
Gross, S. M., & Tibshirani, R. (2015). Collaborative regression. Biostatistics, 16(2), 326--338. https://doi.org/10.1093/biostatistics/kxu047
Liu, H., Roeder, K., & Wasserman, L. (2010). Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models (arXiv:1006.3316). arXiv. https://doi.org/10.48550/arXiv.1006.3316
Clauset, A., Newman, M. E. J., & Moore, C. (2004). Finding community structure in very large networks. Physical Review E, 70(6), 066111. https://doi.org/10.1103/PhysRevE.70.066111
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.