Compiled: 27-02-2024
Source: vignettes/network_visualizations.Rmd
In this tutorial, we demonstrate several ways of visualizing the co-expression networks made with hdWGCNA. Before starting this tutorial, make sure that you have constructed the co-expression network as in the single-cell tutorial or the spatial transcriptomics. This tutorial covers three main network visualizations functions within hdWGCNA:
ModuleNetworkPlot
, visualizes a separate network plot for each module, showing the top 25 genes by kME.HubGeneNetworkPlot
, visualizes the network comprisng all modules with a given number of hub genes per module.ModuleUMAPPlot
, visualizes all of the genes in the co-expression simultaneously using the UMAP dimensionality reduction algorithm.Finally, we provide guidance for making custom network visualizations using the ggraph and tidygraph packages.
Before we visualize anything, we first need to load the data and the required libraries.
# single-cell analysis package library(Seurat) # plotting and data science packages library(tidyverse) library(cowplot) library(patchwork) library(magrittr) # co-expression network analysis packages: library(WGCNA) library(hdWGCNA) # network analysis & visualization package: library(igraph) # using the cowplot theme for ggplot theme_set(theme_cowplot()) # set random seed for reproducibility set.seed(12345) # load the Zhou et al snRNA-seq dataset seurat_obj <- readRDS('data/Zhou_control.rds')
Here we demonstrate using the ModuleNetworkPlot function to visualize the network underlying the top 25 hub genes for each module. By default, this function creates a new folder called "ModuleNetworks", and generates a .pdf figure for each module. There are a few parameters that you can adjust for this function:
ModuleNetworkPlot( seurat_obj, outdir = 'ModuleNetworks' )
Here we can see what one of these network plots looks like:
In this network, each node represents a gene, and each edge represents the co-expression
relationship between two genes in the network. Each of these module network plots are
colored based on the color
column in the hdWGCNA module assignment table
GetModules(seurat_obj)
. The top 10 hub genes by kME are placed in the center of
the plot, while the remaining 15 genes are placed in the outer circle.
Optionally, certain visualization parameters can be changed in this plot:
edge.alpha
: determines the opacity of the network edgesvertex.size
: determines the size of the nodesvertex.label.cex
: determines the font size of the gene labelHere we run this function again with some of the options changed to show more genes:
ModuleNetworkPlot( seurat_obj, outdir='ModuleNetworks2', # new folder name n_inner = 20, # number of genes in inner ring n_outer = 30, # number of genes in outer ring n_conns = Inf, # show all of the connections plot_size=c(10,10), # larger plotting area vertex.label.cex=1 # font size )
Here we will make a network plot combining all of the modules together using the
HubGeneNetworkPlot
function. This function takes the top n hub genes as specified
by the user, and other randomly selected genes, and constructs a joint network using
the force-directed graph drawing algorithm. For visual clarity, the number of edges in the network
can be downsampled using the edge_prop
parameter. In the following example, we
visualize the top 3 hub genes and 6 other genes per module.
# hubgene network HubGeneNetworkPlot( seurat_obj, n_hubs = 3, n_other=5, edge_prop = 0.75, mods = 'all' )
As in the previous network plot, each node represents a gene and each edge represents a
co-expression relationship. In this network, we color intramodular edges with the
module's color, and intermodular edges gray. The opacity of edges in this network
is scaled by the strength of the co-expression relationship. Additional network
layout settings can be passed to the layout_with_fr
function in igraph. The user can
also specify return_graph = TRUE
to return the igraph object to plot
using their own custom code.
g <- HubGeneNetworkPlot(seurat_obj, return_graph=TRUE)
Here we run HubGeneNetworkPlot
again, this time only selecting 5 specific modules:
# get the list of modules: modules <- GetModules(seurat_obj) mods <- levels(modules$module); mods <- mods[mods != 'grey'] # hubgene network HubGeneNetworkPlot( seurat_obj, n_hubs = 10, n_other=20, edge_prop = 0.75, mods = mods[1:5] # only select 5 modules )
Previously we visualized a subset of the co-expression network with an emphasis on the hub genes. Here, we use an alternative approach to visualize all genes in the co-expression network simultaneously. UMAP is a suitable method for visualizing high-dimensional data in two dimensions, and here we apply UMAP to embed the hdWGCNA network in a low-dimensional manifold.
hdWGCNA includes the function RunModuleUMAP
to run the UMAP algorithm on the
hdWGCNA topological overlap matrix (TOM).
For the UMAP analysis, we subset the columns in the TOM to only contain the top
n hub genes by kME for each module, as specified by the user. Therefore, the
organization of each gene in UMAP space is dependent on that gene's connectivity
with the network's hub genes. This function leverages the UMAP implementation from
the uwot
R package, so additional UMAP
parameters for the uwot::umap
function such as min_dist
or spread
can be included
in RunModuleUMAP
.
The following code demonstrates using the RunModuleUMAP
function with 10
hub genes per module:
seurat_obj <- RunModuleUMAP( seurat_obj, n_hubs = 10, # number of hub genes to include for the UMAP embedding n_neighbors=15, # neighbors parameter for UMAP min_dist=0.1 # min distance between points in UMAP space )
Next we will make a simple visualization of the UMAP using ggplot2:
# get the hub gene UMAP table from the seurat object umap_df <- GetModuleUMAP(seurat_obj) # plot with ggplot ggplot(umap_df, aes(x=UMAP1, y=UMAP2)) + geom_point( color=umap_df$color, # color each point by WGCNA module size=umap_df$kME*2 # size of each point based on intramodular connectivity ) + umap_theme()
In this plot, each point represents a single gene. The size of each dot is scaled
by the gene's kME for it's assigned module. ggplot2 is sufficient to visualize the
genes in the module UMAP, but here we are not visualizing the underlying network.
We can use the function ModuleUMAPPlot
to plot the genes and their co-expression
relationships.
ModuleUMAPPlot( seurat_obj, edge.alpha=0.25, sample_edges=TRUE, edge_prop=0.1, # proportion of edges to sample (20% here) label_hubs=2 ,# how many hub genes to plot per module? keep_grey_edges=FALSE )
This plot is similar to the one that we made with ggplot2, but we are showing
the co-expression network, and labeling 2 hub genes in each module. For visual clarity,
the we downsample to keep only 10% of the edges in this network using the edge_prop
parameter. We also allow the user to also return the igraph object to make their
own custom plots or to perform downstream network analysis:
g <- ModuleUMAPPlot(seurat_obj, return_graph=TRUE)
The number of hub genes that we include in the UMAP calculation influences the
downstream visualization. Here we use gganimate
to
visually compare the UMAPs that are computed with different numbers of hub genes.
Code
# different label weights to test n_hubs <- c(1, 1:10*5) # loop through different weights df <- data.frame() for(cur_hubs in n_hubs){ # make a module UMAP using different label weights seurat_obj <- RunModuleUMAP( seurat_obj, n_hubs = cur_hubs, n_neighbors=15, exclude_grey = TRUE, min_dist=0.1 ) # add to ongoing dataframe cur_df <- GetModuleUMAP(seurat_obj) cur_df$n_hubs <- cur_hubs df <- rbind(df, cur_df) } # ggplot animation library library(gganimate) # plot with ggplot + gganimate p <- ggplot(df, aes(x=UMAP1, y=UMAP2)) + geom_point(color=df$color, size=df$kME*2 ) + ggtitle("N hubs: {closest_state}") + transition_states( n_hubs, transition_length = 2, state_length = 2, wrap = TRUE ) + view_follow() + enter_fade() + umap_theme() animate(p, fps=30, duration=25)
# different label weights to test n_hubs <- c(1, 1:10*5) # loop through different weights df <- data.frame() for(cur_hubs in n_hubs){ # make a module UMAP using different label weights seurat_obj <- RunModuleUMAP( seurat_obj, n_hubs = cur_hubs, n_neighbors=15, exclude_grey = TRUE, min_dist=0.1 ) # add to ongoing dataframe cur_df <- GetModuleUMAP(seurat_obj) cur_df$n_hubs <- cur_hubs df <- rbind(df, cur_df) } save(df, file=paste0(data_dir, 'umap_animate1.rda')) # for both gganimates, we run the animation locally on the laptop library(tidyverse) library(gganimate) umap_theme <- function(){ theme( axis.line=element_blank(), axis.text.x=element_blank(), axis.text.y=element_blank(), axis.ticks=element_blank(), axis.title.x=element_blank(), axis.title.y=element_blank(), panel.background=element_blank(), panel.border=element_blank(), panel.grid.major=element_blank(), panel.grid.minor=element_blank(), plot.background=element_blank(), plot.title = element_text(hjust = 0.5) ) } #df <- read.csv("~/Desktop/animate_hub_umap.csv") load("~/Downloads/umap_animate1.rda") # plot with ggplot + gganimate p <- ggplot(df, aes(x=UMAP1, y=UMAP2)) + geom_point(color=df$color, size=df$kME*2 ) + ggtitle("N hubs: {closest_state}") + transition_states( n_hubs, transition_length = 2, state_length = 2, wrap = TRUE ) + view_follow() + enter_fade() + hdWGCNA::umap_theme() animate(p, fps=30, duration=25) anim_save("~/Downloads/hub_umap.gif")
This animation shows each of the UMAPs generated using different numbers of hub genes.
UMAP is often used as an unsupervised approach to project data points into a dimensionally-reduced spaced, but we can also supply UMAP with known labels to perform a supervised analysis. In principal, UMAP can better distinguish between different groups of data points in the embedding if the algorithm is aware of these groupings. Therefore, we allow the user to run a supervised UMAP using the RunModuleUMAP function, where each gene's module assignment is supplied as the label.
To perform a supervised UMAP analysis, we set supervised=TRUE
, and we can optionally
use the target_weight
parameter to determine how much influnce the labels will
have on the final embedding. A target_weight
closer to 0 weights
based on the structure of the data while a target_weight
closer to 1 weights
based on the labels. The following code shows how to run and visualize the supervised
UMAP:
# run supervised UMAP: seurat_obj <- RunModuleUMAP( seurat_obj, n_hubs = 10, n_neighbors=15, min_dist=0.1, supervised=TRUE, target_weight=0.5 ) # get the hub gene UMAP table from the seurat object umap_df <- GetModuleUMAP(seurat_obj) # plot with ggplot ggplot(umap_df, aes(x=UMAP1, y=UMAP2)) + geom_point( color=umap_df$color, # color each point by WGCNA module size=umap_df$kME*2 # size of each point based on intramodular connectivity ) + umap_theme()
To demonstrate what the supervised UMAP looks like using different
weights for the labels, we can make a different UMAP for several values of target_weight
and compare the outputs using gganimate
.
Code
# different label weights to test weights <- 0:10/10 # loop through different weights df <- data.frame() for(cur_weight in weights){ # make a module UMAP using different label weights seurat_obj <- RunModuleUMAP( seurat_obj, n_hubs = 10, n_neighbors=15, exclude_grey = TRUE, min_dist=0.3, supervised=TRUE, target_weight = cur_weight ) # add to ongoing dataframe cur_df <- GetModuleUMAP(seurat_obj) cur_df$weight <- cur_weight df <- rbind(df, cur_df) } # ggplot animation library library(gganimate) # plot with ggplot + gganimate p <- ggplot(df, aes(x=UMAP1, y=UMAP2)) + geom_point(color=df$color, size=df$kME*2 ) + ggtitle("Supervised weight: {closest_state}") + transition_states( weight, transition_length = 2, state_length = 2, wrap = TRUE ) + view_follow() + enter_fade() + umap_theme() animate(p, fps=30, duration=25)
# different label weights to test weights <- 0:10/10 # loop through different weights df <- data.frame() for(cur_weight in weights){ # make a module UMAP using different label weights seurat_obj <- RunModuleUMAP( seurat_obj, n_hubs = 10, n_neighbors=15, exclude_grey = TRUE, min_dist=0.3, supervised=TRUE, target_weight = cur_weight ) # add to ongoing dataframe cur_df <- GetModuleUMAP(seurat_obj) cur_df$weight <- cur_weight df <- rbind(df, cur_df) } save(df, file=paste0(data_dir, 'umap_animate2.rda')) # for both gganimates, we run the animation locally on the laptop load("~/Downloads/umap_animate2.rda") # plot with ggplot + gganimate p <- ggplot(df, aes(x=UMAP1, y=UMAP2)) + geom_point(color=df$color, size=df$kME*2 ) + ggtitle("Supervised weight: {closest_state}") + transition_states( weight, transition_length = 2, state_length = 2, wrap = TRUE ) + view_follow() + enter_fade() + hdWGCNA::umap_theme() animate(p, fps=30, duration=25) anim_save("~/Downloads/umap_animation.gif")
There are many different packages within R and beyond for making network visualizations aside from the functions that are included in hdWGCNA. Here we provide a guided example of creating a customized network plot using the ggraph and tidygraph R packages. Making a custom network plot to suit your specific needs will likely require writing custom code, so we consider this part an optional advanced topic.
In this example, our goal is to make a custom network plot showing only the genes that are involved in "nervous system development" based on Gene Ontology. We start with a simple example and we gradually different customization options, which is the typical process of making a customized network plot. In principal this approach can be applied to any custom list of genes.
First, we need to install ggraph and tidygraph. We also need to install fgsea in order to load the Gene Ontology lists from the EnrichR website.
# install fgsea: BiocManager::install('fgsea') # install tidygraph and ggraph install.packages(c("ggraph", "tidygraph")) # load the packages: library(ggraph) library(tidygraph)
Next we need to get the genes associated with nervous system development and subset our co-expression network by these genes. Please follow this link to download the GO Biological Process 2021 database from the EnrichR website.
# get modules and TOM from the seurat obj modules <- GetModules(seurat_obj) %>% subset(module != 'grey') %>% mutate(module = droplevels(module)) mods <- levels(modules$module) TOM <- GetTOM(seurat_obj) # get module colors for plotting mod_colors <- dplyr::select(modules, c(module, color)) %>% distinct() mod_cp <- mod_colors$color; names(mod_cp) <- as.character(mod_colors$module) # load the GO Biological Pathways file (donwloaded from EnrichR website) pathways <- fgsea::gmtPathways('GO_Biological_Process_2021.txt') # remove GO Term ID for simplicity: names(pathways) <- stringr::str_replace(names(pathways), " \\s*\\([^\\)]+\\)", "") # selected pathway cur_pathway <- 'nervous system development' # get genes in this pathway cur_genes <- pathways[[cur_pathway]] cur_genes <- cur_genes[cur_genes %in% modules$gene_name] # subset the TOM cur_TOM <- TOM[cur_genes,cur_genes]
We now have a co-expression network which only contains our 136 genes of
interest. Next we create a graph object with igraph
and tidygraph
, and then we make our network plot with ggraph
.
# set up the graph object with igraph & tidygraph graph <- cur_TOM %>% igraph::graph_from_adjacency_matrix(mode='undirected', weighted=TRUE) %>% tidygraph::as_tbl_graph(directed=FALSE) %>% tidygraph::activate(nodes) # make the plot with ggraph p <- ggraph(graph) + geom_edge_link(color='grey', alpha=0.2) + geom_node_point(color='black') + geom_node_label(aes(label=name), repel=TRUE, max.overlaps=Inf, fontface='italic') p
# create layout lay <- create_layout(graph, layout='sparse_stress', pivots=100) # add module info lay$module <- modules[lay$name, 'module'] # make the plot with gggraph p <- ggraph(lay) + geom_edge_link() + geom_node_point(aes(color=module)) + geom_node_label(aes(label=name), repel=TRUE, max.overlaps=Inf, fontface='italic') + scale_colour_manual(values=mod_cp) pathways <- fgsea::gmtPathways('/dfs7/swaruplab/smorabit/resources/Enrichr/GO_Biological_Process_2021.txt') png(paste0(fig_dir, 'custom_net1.png'), width=8, height=7, units='in', res=300) print(p) dev.off()
Here we have our first custom network plot showing 136 genes. Each node (dot) represents a gene, and the edges (lines) represent co-expression between two genes. There are a few problems with this plot that make it less meaningful.
Let's try to fix some of these problems. Next we will set the edges opacity (alpha) by the co-expression strength
using geom_edge_link
and we will color the nodes by each gene's assigned module using geom_node_point
.
With igrah
, we can use V(graph)
to set attributes for the network's nodes, in order to add the module
information to the graph.
# set up the graph object with igraph & tidygraph graph <- cur_TOM %>% igraph::graph_from_adjacency_matrix(mode='undirected', weighted=TRUE) %>% tidygraph::as_tbl_graph(directed=FALSE) %>% tidygraph::activate(nodes) # add the module name to the graph: V(graph)$module <- modules[V(graph)$name,'module'] # make the plot with gggraph p <- ggraph(graph) + geom_edge_link(aes(alpha=weight), color='grey') + geom_node_point(aes(color=module)) + geom_node_label(aes(label=name), repel=TRUE, max.overlaps=Inf, fontface='italic') + scale_colour_manual(values=mod_cp) p
png(paste0(fig_dir, 'custom_net2.png'), width=8, height=7, units='in', res=300) print(p) dev.off()
This plot is a little bit more informative than the last one, however there are still some issues.
In the previous plot, there are so many edges here that it is still hard to interpret the plot. There are also still too many genes labeled so the plot is very crowded. To solve these problems, we can make some simple manipulations to the graph to make the plots much more interpretable. Next we make a new plot where we reduce the number of edges by keeping only the strongest connections. We will also color the edges based on if the genes are in the same module, and we will only label genes if they are in the top 25 hub genes of each module.
# only keep the upper triangular part of the TOM: cur_TOM[upper.tri(cur_TOM)] <- NA # cast the network from wide to long format cur_network <- cur_TOM %>% reshape2::melt() %>% dplyr::rename(gene1 = Var1, gene2 = Var2, weight=value) %>% subset(!is.na(weight)) # get the module & color info for gene1 temp1 <- dplyr::inner_join( cur_network, modules %>% dplyr::select(c(gene_name, module, color)) %>% dplyr::rename(gene1 = gene_name, module1=module, color1=color), by = 'gene1' ) %>% dplyr::select(c(module1, color1)) # get the module & color info for gene2 temp2 <- dplyr::inner_join( cur_network, modules %>% dplyr::select(c(gene_name, module, color)) %>% dplyr::rename(gene2 = gene_name, module2=module, color2=color), by = 'gene2' ) %>% dplyr::select(c(module2, color2)) # add the module & color info cur_network <- cbind(cur_network, temp1, temp2) # set the edge color to the module's color if they are the two genes are in the same module cur_network$edge_color <- ifelse( cur_network$module1 == cur_network$module2, as.character(cur_network$module1), 'grey' ) # keep this network before subsetting cur_network_full <- cur_network # keep the top 10% of edges edge_percent <- 0.1 cur_network <- cur_network_full %>% dplyr::slice_max( order_by = weight, n = round(nrow(cur_network)*edge_percent) ) # make the graph object with tidygraph graph <- cur_network %>% igraph::graph_from_data_frame() %>% tidygraph::as_tbl_graph(directed=FALSE) %>% tidygraph::activate(nodes) # add the module name to the graph: V(graph)$module <- modules[V(graph)$name,'module'] # get the top 25 hub genes for each module hub_genes <- GetHubGenes(seurat_obj, n_hubs=25) %>% .$gene_name V(graph)$hub <- ifelse(V(graph)$name %in% hub_genes, V(graph)$name, "") # make the plot with gggraph p <- ggraph(graph) + geom_edge_link(aes(alpha=weight, color=edge_color)) + geom_node_point(aes(color=module)) + geom_node_label(aes(label=hub), repel=TRUE, max.overlaps=Inf, fontface='italic') + scale_colour_manual(values=mod_cp) + scale_edge_colour_manual(values=mod_cp) p
png(paste0(fig_dir, 'custom_net3.png'), width=8, height=7, units='in', res=300) print(p) dev.off()
We can also make a similar plot where we only keep edges between genes in the same module instead of just keeping the strongest edges.
# subset to only keep edges between genes in the same module cur_network <- cur_network_full %>% subset(module1 == module2) # make the graph object with tidygraph graph <- cur_network %>% igraph::graph_from_data_frame() %>% tidygraph::as_tbl_graph(directed=FALSE) %>% tidygraph::activate(nodes) # add the module name to the graph: V(graph)$module <- modules[V(graph)$name,'module'] # get the top 25 hub genes for each module hub_genes <- GetHubGenes(seurat_obj, n_hubs=25) %>% .$gene_name V(graph)$hub <- ifelse(V(graph)$name %in% hub_genes, V(graph)$name, "") # make the plot with gggraph p <- ggraph(graph) + geom_edge_link(aes(alpha=weight, color=edge_color)) + geom_node_point(aes(color=module)) + geom_node_label(aes(label=hub), repel=TRUE, max.overlaps=Inf, fontface='italic') + scale_colour_manual(values=mod_cp) + scale_edge_colour_manual(values=mod_cp) + NoLegend() p
png(paste0(fig_dir, 'custom_net4.png'), width=12, height=6, units='in', res=300) print(p) dev.off()
With the default settings, it seems that the layout of the nodes in this graph is making a very squished plot, partially because each of the modules are fully disconnected from one another.
Under the hood, ggraph
determines the 2D layout of the points
on the graph. Next we will try some different layouts other than the default.
Please consult this documentation page to learn more
about ggraph
layouts.
# randomly sample 50% of the edges within the same module cur_network1 <- cur_network_full %>% subset(module1 == module2) %>% group_by(module1) %>% sample_frac(0.5) %>% ungroup() # keep the top 10% of other edges edge_percent <- 0.10 cur_network2 <- cur_network_full %>% subset(module1 != module2) %>% dplyr::slice_max( order_by = weight, n = round(nrow(cur_network)*edge_percent) ) cur_network <- rbind(cur_network1, cur_network2) # set factor levels for edges: cur_network$edge_color <- factor( as.character(cur_network$edge_color), levels = c(mods, 'grey') ) # rearrange so grey edges are on the bottom: cur_network %<>% arrange(rev(edge_color)) # make the graph object with tidygraph graph <- cur_network %>% igraph::graph_from_data_frame() %>% tidygraph::as_tbl_graph(directed=FALSE) %>% tidygraph::activate(nodes) # add the module name to the graph: V(graph)$module <- modules[V(graph)$name,'module'] # get the top 25 hub genes for each module hub_genes <- GetHubGenes(seurat_obj, n_hubs=25) %>% .$gene_name V(graph)$hub <- ifelse(V(graph)$name %in% hub_genes, V(graph)$name, "") # 1. default layout p1 <- ggraph(graph) + geom_edge_link(aes(alpha=weight, color=edge_color)) + geom_node_point(aes(color=module)) + scale_colour_manual(values=mod_cp) + scale_edge_colour_manual(values=mod_cp) + ggtitle("layout = 'stress' (auto)") + NoLegend() # 2. Kamada Kawai (kk) layout graph2 <- graph; E(graph)$weight <- E(graph)$weight + 0.0001 p2 <- ggraph(graph, layout='kk', maxiter=100) + geom_edge_link(aes(alpha=weight, color=edge_color)) + geom_node_point(aes(color=module)) + scale_colour_manual(values=mod_cp) + scale_edge_colour_manual(values=mod_cp) + ggtitle("layout = 'kk'") + NoLegend() # 3. igraph layout_with_fr p3 <- ggraph(graph, layout=layout_with_fr(graph)) + geom_edge_link(aes(alpha=weight, color=edge_color)) + geom_node_point(aes(color=module)) + scale_colour_manual(values=mod_cp) + scale_edge_colour_manual(values=mod_cp) + ggtitle("layout_with_fr()") + NoLegend() # 4. igraph layout_as_tree p4 <- ggraph(graph, layout=layout_as_tree(graph)) + geom_edge_link(aes(alpha=weight, color=edge_color)) + geom_node_point(aes(color=module)) + scale_colour_manual(values=mod_cp) + scale_edge_colour_manual(values=mod_cp) + ggtitle("layout_as_tree()") + NoLegend() # 5. igraph layout_nicely p5 <- ggraph(graph, layout=layout_nicely(graph)) + geom_edge_link(aes(alpha=weight, color=edge_color)) + geom_node_point(aes(color=module)) + scale_colour_manual(values=mod_cp) + scale_edge_colour_manual(values=mod_cp) + ggtitle("layout_nicely()") + NoLegend() # 6. igraph layout_in_circle p6 <- ggraph(graph, layout=layout_in_circle(graph)) + geom_edge_link(aes(alpha=weight, color=edge_color)) + geom_node_point(aes(color=module)) + scale_colour_manual(values=mod_cp) + scale_edge_colour_manual(values=mod_cp) + ggtitle("layout_in_circle()") + NoLegend() # make a combined plot (p1 | p2 | p3) / (p4 | p5 | p6)
png(paste0(fig_dir, 'custom_net5.png'), width=12, height=8, units='in', res=300) (p1 | p2 | p3) / (p4 | p5 | p6) dev.off()
Here we show a combined plot of six different layouts of the same exact network. We can appreciate how different the results are when using different layout algorithms, and we encourage trying out different layouts which best suit your particular needs. The ggraph documentation page shows many more options for different graph layouts that you can use.
We can also provide a "custom" layout, which can be any set of 2D coordinates for each of the nodes. Here we show how we can use the co-expression UMAP layout that we computed previously. We also make a few additional modifications.
# get the UMAP df and subset by genes that are in our graph umap_df <- GetModuleUMAP(seurat_obj) umap_layout <- umap_df[names(V(graph)),] %>% dplyr::rename(c(x=UMAP1, y = UMAP2, name=gene)) rownames(umap_layout) <- 1:nrow(umap_layout) # create the layout lay <- ggraph::create_layout(graph, umap_layout) lay$hub <- V(graph)$hub p <- ggraph(lay) + geom_edge_link(aes(alpha=weight, color=edge_color)) + geom_node_point(data=subset(lay, hub == ''), aes(color=module, size=kME)) + geom_node_point(data=subset(lay, hub != ''), aes(fill=module, size=kME), color='black', shape=21) + scale_colour_manual(values=mod_cp) + scale_fill_manual(values=mod_cp) + scale_edge_colour_manual(values=mod_cp) + geom_node_label(aes(label=hub), repel=TRUE, max.overlaps=Inf, fontface='italic') + NoLegend() p
png(paste0(fig_dir, 'custom_net_umap.png'), width=6, height=6, units='in', res=300) p dev.off() png(paste0(fig_dir, 'custom_net_umap_density.png'), width=6, height=6, units='in', res=300) p dev.off()
We can easily modify this plot to add a background layer showing all of the other genes in our co-expression network. We will also change the shape of the selected genes to better differentiate them from the background.
p <- ggraph(lay) + ggrastr::rasterise( geom_point(inherit.aes=FALSE, data=umap_df, aes(x=UMAP1, y=UMAP2), color=umap_df$color, alpha=0.1, size=1), dpi=500 ) + geom_edge_link(aes(alpha=weight, color=edge_color)) + geom_node_point(data=subset(lay, hub == ''), aes(fill=module, size=kME), color='black', shape=21) + geom_node_point(data=subset(lay, hub != ''), aes(fill=module, size=kME), color='black', shape=23) + scale_colour_manual(values=mod_cp) + scale_fill_manual(values=mod_cp) + scale_edge_colour_manual(values=mod_cp) + geom_node_label(aes(label=hub), repel=TRUE, max.overlaps=Inf, fontface='italic') + NoLegend()
png(paste0(fig_dir, 'custom_net_umap2.png'), width=6, height=6, units='in', res=300) p dev.off()
In this plot, we show the hub genes as diamonds, non-hub genes as circles, and all genes not in our pathway of interest in the background.
Since ggraph
is based on ggplot2
, we can easily add further customizations. In this final
example, we change the gene label colors to match the assigned modules,
we change the plot background to black, and we add a title in the upper right corner.
p <- ggraph(lay) + ggrastr::rasterise( geom_point(inherit.aes=FALSE, data=umap_df, aes(x=UMAP1, y=UMAP2), color=umap_df$color, alpha=0.1, size=1), dpi=500 ) + geom_edge_link(aes(alpha=weight, color=edge_color)) + geom_node_point(data=subset(lay, hub == ''), aes(fill=module, size=kME), color='black', shape=21) + geom_node_point(data=subset(lay, hub != ''), aes(fill=module, size=kME), color='black', shape=23) + scale_colour_manual(values=mod_cp) + scale_fill_manual(values=mod_cp) + scale_edge_colour_manual(values=mod_cp) + geom_node_label(aes(label=hub, color=module), repel=TRUE, max.overlaps=Inf, fontface='italic') + NoLegend() p <- p + theme( panel.background = element_rect(fill='black'), plot.title = element_text(hjust=0.5) ) + annotate("text", x=Inf, y=Inf, label="Nervous system development", vjust=2, hjust=1.1, color='white', fontface='bold', size=5) p
png(paste0(fig_dir, 'custom_net_umap3.png'), width=6, height=6, units='in', res=300) p dev.off()
We encourage users to learn from this example, and to leverage these powerful plotting libraries to create custom network visualizations that suit their particular study.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.