knitr::opts_chunk$set( collapse = TRUE, # comment = "#>", warning = FALSE, message = FALSE )
In this vignette, we will extend the basic NicheNet analysis analysis from Perform NicheNet analysis starting from a Seurat object: step-by-step analysis by incorporating gene expression as part of the prioritization This is a generalization of the Differential NicheNet and MultiNicheNet approach. While the original NicheNet only ranks ligands based on the ligand activity analysis, it is now also possible to prioritize ligands based on cell type and condition specificity of the ligand and receptor.
We will again make use of mouse NICHE-seq data to explore intercellular communication in the T cell area in the inguinal lymph node before and 72 hours after lymphocytic choriomeningitis virus (LCMV) infection [@medaglia_spatial_2017]. The used ligand-target matrix and the Seurat object of the processed NICHE-seq single-cell data can be downloaded from Zenodo.
Make sure you understand the different steps in a NicheNet analysis that are described in the basic vignette before proceeding with this vignette.
Load required packages, read in the Seurat object with processed expression data of interacting cells and NicheNet's ligand-target prior model, ligand-receptor network and weighted integrated networks.
library(nichenetr) # Please update to v2.0.6 library(Seurat) library(SeuratObject) library(tidyverse)
# Read Seurat object seuratObj <- readRDS(url("https://zenodo.org/record/3531889/files/seuratObj.rds")) seuratObj <- UpdateSeuratObject(seuratObj) seuratObj <- alias_to_symbol_seurat(seuratObj, "mouse") # Load in networks lr_network <- readRDS(url("https://zenodo.org/record/7074291/files/lr_network_mouse_21122021.rds")) ligand_target_matrix <- readRDS(url("https://zenodo.org/record/7074291/files/ligand_target_matrix_nsga2r_final_mouse.rds")) weighted_networks <- readRDS(url("https://zenodo.org/record/7074291/files/weighted_networks_nsga2r_final_mouse.rds")) lr_network <- lr_network %>% distinct(from, to)
We will use the sender-focused approach here.
# 1. Define set of potential ligands receiver <- "CD8 T" expressed_genes_receiver <- get_expressed_genes(receiver, seuratObj, pct = 0.05) sender_celltypes <- c("CD4 T", "Treg", "Mono", "NK", "B", "DC") list_expressed_genes_sender <- sender_celltypes %>% unique() %>% lapply(get_expressed_genes, seuratObj, 0.05) expressed_genes_sender <- list_expressed_genes_sender %>% unlist() %>% unique() all_ligands <- unique(lr_network$from) all_receptors <- unique(lr_network$to) expressed_ligands <- intersect(all_ligands, expressed_genes_sender) expressed_receptors <- intersect(all_receptors, expressed_genes_receiver) potential_ligands <- lr_network %>% filter(from %in% expressed_ligands & to %in% expressed_receptors) %>% pull(from) %>% unique() # 2. Define the gene set of interest condition_oi <- "LCMV" condition_reference <- "SS" seurat_obj_receiver <- subset(seuratObj, idents = receiver) DE_table_receiver <- FindMarkers(object = seurat_obj_receiver, ident.1 = condition_oi, ident.2 = condition_reference, group.by = "aggregate", min.pct = 0.05) %>% rownames_to_column("gene") geneset_oi <- DE_table_receiver %>% filter(p_val_adj <= 0.05 & abs(avg_log2FC) >= 0.25) %>% pull(gene) geneset_oi <- geneset_oi %>% .[. %in% rownames(ligand_target_matrix)] # 3. Define background genes background_expressed_genes <- expressed_genes_receiver %>% .[. %in% rownames(ligand_target_matrix)] # 4. Perform NicheNet ligand activity analysis ligand_activities <- predict_ligand_activities(geneset = geneset_oi, background_expressed_genes = background_expressed_genes, ligand_target_matrix = ligand_target_matrix, potential_ligands = potential_ligands) ligand_activities <- ligand_activities %>% arrange(-aupr_corrected) %>% mutate(rank = rank(desc(aupr_corrected)))
We will prioritize ligand-receptor pairs based on the following criteria (with their corresponding weight names):
de_ligand
de_receptor
exprs_ligand
exprs_receptor
ligand_condition_specificity
receptor_condition_specificity
This means that we will have to calculate:
We provide a wrapper function generate_info_tables
that will calculate all these values for you. This function returns a list with three dataframes:
sender_receiver_de
: differential expression of the ligand and receptor in the sender-receiver cell type pair. These were first calculated separately (i.e., DE of ligand in sender cell type, DE of receptor in receiver cell type based on FindAllMarkers) and then combined based on possible interactions from the lr_network. sender_receiver_info
: the average expression of the ligand and receptor in sender-receiver cell type pairslr_condition_de
: differential expression of the ligand and receptor between the two conditions across all cell types.Note that cell type specificity (i.e., the first four conditions) is calculated only in the condition of interest.
The "scenario" argument can be either "case_control" or "one_condition". In "case_control" scenario, condition specificity is calculated.
lr_network_filtered <- lr_network %>% filter(from %in% expressed_ligands & to %in% expressed_receptors) info_tables <- generate_info_tables(seuratObj, celltype_colname = "celltype", senders_oi = sender_celltypes, receivers_oi = receiver, lr_network = lr_network_filtered, condition_colname = "aggregate", condition_oi = condition_oi, condition_reference = condition_reference, scenario = "case_control") names(info_tables) info_tables$sender_receiver_de %>% head() info_tables$sender_receiver_info %>% head() info_tables$lr_condition_de %>% head()
Next, we generate the prioritization table. This table contains the rankings of ligand-receptor pairs based on the different criteria. We provide two scenarios: case_control
and one_condition
. In the "case_control" scenario, all weights are set to 1. If "one_condition", the weights are set to 0 for condition specificity and 1 for the remaining criteria. Users can also provide their own weights using the prioritizing_weights
argument.
prior_table <- generate_prioritization_tables(info_tables$sender_receiver_info, info_tables$sender_receiver_de, ligand_activities, info_tables$lr_condition_de, scenario = "case_control") prior_table %>% head
As you can see, the resulting table now show the rankings for ligand-receptor interactions of a sender-receiver cell type pair, instead of just the prioritized ligands. Cxcl10 now went up in the rankings due to both the high expression of its potential receptor Dpp4 and its high celltype specificity (scaled_lfc_ligand
). You can also see this in the visualizations further below.
We included all columns here, but if you just want relevant columns that were used to calculate the ranking:
prior_table %>% select(c('sender', 'receiver', 'ligand', 'receptor', 'scaled_p_val_adapted_ligand', 'scaled_p_val_adapted_receptor', 'scaled_avg_exprs_ligand', 'scaled_avg_exprs_receptor', 'scaled_p_val_adapted_ligand_group', 'scaled_p_val_adapted_receptor_group', 'scaled_activity'))
Note that we appended the suffix '_group' to columns that refer to differential expression between conditions, e.g., lfc_ligand_group
and lfc_receptor_group.
generate_info_tables
is a wrapper function that calculates all the information needed for the prioritization. However, in some cases you may need more flexibility on how these values are calculated (but note that you can pass extra arguments to generate_info_tables
that will get passed on to FindMarkers
, FindAllMarkers
, and AverageExpression
). Below, we show how we use helper functions calculate_de
and get_exprs_avg
to calculate the DE and get the average expression used for cell type specificity. process_table_to_ic
transforms these different dataframes so they are compatible with the generate_prioritization_tables
function.
# Only calculate DE for LCMV condition, with genes that are in the ligand-receptor network DE_table <- FindAllMarkers(subset(seuratObj, subset = aggregate == "LCMV"), min.pct = 0, logfc.threshold = 0, return.thresh = 1, features = unique(unlist(lr_network_filtered))) # Average expression information - only for LCMV condition expression_info <- get_exprs_avg(seuratObj, "celltype", condition_colname = "aggregate", condition_oi = condition_oi, features = unique(unlist(lr_network_filtered))) # Calculate condition specificity - only for datasets with two conditions! condition_markers <- FindMarkers(object = seuratObj, ident.1 = condition_oi, ident.2 = condition_reference, group.by = "aggregate", min.pct = 0, logfc.threshold = 0, features = unique(unlist(lr_network_filtered))) %>% rownames_to_column("gene") # Combine DE of senders and receivers -> used for prioritization processed_DE_table <- process_table_to_ic(DE_table, table_type = "celltype_DE", lr_network_filtered, senders_oi = sender_celltypes, receivers_oi = receiver) processed_expr_table <- process_table_to_ic(expression_info, table_type = "expression", lr_network_filtered) processed_condition_markers <- process_table_to_ic(condition_markers, table_type = "group_DE", lr_network_filtered)
And here is how you can define custom weights:
prioritizing_weights = c("de_ligand" = 1, "de_receptor" = 1, "activity_scaled" = 1, "exprs_ligand" = 1, "exprs_receptor" = 1, "ligand_condition_specificity" = 1, "receptor_condition_specificity" = 1)
prior_table <- generate_prioritization_tables(processed_expr_table, processed_DE_table, ligand_activities, processed_condition_markers, prioritizing_weights) prior_table %>% head
As NicheNet is a receiver-based pipeline, to prioritize ligand-receptor pairs across multiple receivers, we need to perform the NicheNet analysis for each receiver separately. Let's suppose we want to prioritize ligand-receptor pairs across all T cells (CD4, CD8, and Tregs). The CD8 T analysis has already been performed above. We will use the wrapper function to perform a basic NicheNet analysis on the other two:
nichenet_outputs <- lapply(c("CD8 T", "CD4 T", "Treg"), function(receiver_ct){ output <- nichenet_seuratobj_aggregate(receiver = receiver_ct, seurat_obj = seuratObj, condition_colname = "aggregate", condition_oi = condition_oi, condition_reference = condition_reference, sender = sender_celltypes, ligand_target_matrix = ligand_target_matrix, lr_network = lr_network, weighted_networks = weighted_networks, expression_pct = 0.05) # Add receiver cell type in ligand activity table output$ligand_activities$receiver <- receiver_ct return(output) })
To generate the dataframes used for prioritization, we will simply change the lr_network_filtered
argument to only calculate DE and expression values for ligand-receptor pairs of interest.
# Calculate prioritization criteria for each receiver cell type info_tables <- lapply(nichenet_outputs, function(output) { lr_network_filtered <- lr_network %>% select(from, to) %>% filter(from %in% output$ligand_activities$test_ligand & to %in% output$background_expressed_genes) generate_info_tables(seuratObj, celltype_colname = "celltype", senders_oi = sender_celltypes, receivers_oi = unique(output$ligand_activities$receiver), lr_network_filtered = lr_network_filtered, condition_colname = "aggregate", condition_oi = condition_oi, condition_reference = condition_reference, scenario = "case_control") })
We can then combine the results from generate_info_tables
using bind_rows
, which will concatenate the rows together. Note that for the average expression table (sender_receiver_info
) and condition specificity (lr_condition_de
), we need to remove duplicate rows.
# bind rows of each element of info_tables using pmap info_tables_combined <- purrr::pmap(info_tables, bind_rows) ligand_activities_combined <- purrr::map_dfr(nichenet_outputs, "ligand_activities") prior_table_combined <- generate_prioritization_tables( sender_receiver_info = info_tables_combined$sender_receiver_info %>% distinct, sender_receiver_de = info_tables_combined$sender_receiver_de, ligand_activities = ligand_activities_combined, lr_condition_de = info_tables_combined$lr_condition_de %>% distinct, scenario = "case_control") head(prior_table_combined)
In addition to the usual heatmap visualizations, we provide a function make_circos_lr
to visualize the ligand-receptor pairs in a circos plot. This was originally written for the (now deprecated) Differential NicheNet vignettes. The function takes in a prioritization table and a named vector for the color of senders and receivers. We first specify the number of top ligand-receptor pairs to show with n
.
# Get top n ligand-receptor pairs prior_table_oi <- prior_table_combined %>% slice_max(prioritization_score, n = 50) # Define colors for senders and receivers senders_receivers <- prior_table_oi %>% select(sender, receiver) %>% unlist %>% unique %>% sort celltype_colors <- RColorBrewer::brewer.pal(length(senders_receivers), name = 'Set3') %>% magrittr::set_names(senders_receivers) circos_plot <- make_circos_lr(prior_table_oi, colors_sender = celltype_colors, colors_receiver = celltype_colors)
circos_plot
Furthermore, we provide the function make_mushroom_plot
which allows you to display expression of ligand-receptor pairs in a specific receiver. By default, the fill gradient shows the LFC between cell types, while the size of the semicircle corresponds to the scaled mean expression. You can also choose to show the rankings of each ligand-receptor-sender pair with show_rankings
, as well as show all data points for context (show_all_datapoints
). true_color_range = TRUE
will adjust the limits of the color gradient to the min-max of the values, instead of the limit being from 0 to 1. Note that the numbers displayed here are the rankings across all receiver cell types (in case of multiple receivers), and by default the top_n
ligand-receptor pairs are shown despite the absolute ranking. To show only pairs that have an absolute ranking within top_n across all receivers, set use_absolute_rank = TRUE
.
receiver_oi <- "CD8 T" legend_adjust <- c(0.7, 0.7) make_mushroom_plot(prior_table_combined %>% filter(receiver == receiver_oi), top_n = 30, true_color_range = TRUE, show_rankings = TRUE, show_all_datapoints = TRUE) + theme(legend.justification = legend_adjust, axis.title.x = element_text(hjust = 0.25))
Furthermore, you can change the "size" and "fill" values to certain columns from the prioritization table (those with the _ligand
or _receptor
suffix).
print(paste0("Column names that you can use are: ", paste0(prior_table %>% select(ends_with(c("_ligand", "_receptor", "_sender", "_receiver"))) %>% colnames() %>% str_remove("_ligand|_receptor|_sender|_receiver") %>% unique, collapse = ", "))) # Change size and color columns make_mushroom_plot(prior_table, top_n = 30, size = "pct_expressed", color = "scaled_avg_exprs") + theme(legend.justification = legend_adjust, axis.title.x = element_text(hjust = 0.25))
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.