multi_nichenet_analysis: multi_nichenet_analysis

View source: R/pipeline.R

multi_nichenet_analysis    R Documentation

multi_nichenet_analysis

Description

multi_nichenet_analysis performs a MultiNicheNet analysis in an all-vs-all setting.

Usage

multi_nichenet_analysis(
  sce,
  celltype_id,
  sample_id,
  group_id,
  batches,
  covariates,
  lr_network,
  ligand_target_matrix,
  contrasts_oi,
  contrast_tbl,
  senders_oi = NULL,
  receivers_oi = NULL,
  fraction_cutoff = 0.05,
  min_sample_prop = 0.5,
  scenario = "regular",
  ligand_activity_down = FALSE,
  assay_oi_pb = "counts",
  fun_oi_pb = "sum",
  de_method_oi = "edgeR",
  min_cells = 10,
  logFC_threshold = 0.50,
  p_val_threshold = 0.05,
  p_val_adj = FALSE,
  empirical_pval = TRUE,
  top_n_target = 250,
  verbose = FALSE,
  n.cores = 1,
  return_lr_prod_matrix = FALSE,
  findMarkers = FALSE,
  top_n_LR = 2500
)

Arguments

sce

SingleCellExperiment object of the scRNAseq data of interest. Contains both sender and receiver cell types.

celltype_id

Name of the column in the meta data of sce that indicates the cell type of a cell.

sample_id

Name of the meta data column that indicates which sample/patient a cell comes from.

group_id

Name of the meta data column that indicates which group/condition a cell comes from.

batches

NA if no batches should be corrected for. To correct for batches during DE analysis and pseudobulk expression calculation, set this argument to the name(s) of the meta data column(s) that indicate the batch(es). Batch columns should be categorical. Pseudobulk expression values will be corrected for the first element of this vector.

covariates

NA if no covariates should be corrected for. To correct for covariates during DE analysis, set this argument to the name(s) of the meta data column(s) that indicate the covariate(s). Covariates can be either categorical or continuous. Pseudobulk expression values will not be corrected for the first element of this vector.

lr_network

Prior knowledge Ligand-Receptor network (columns: ligand, receptor)

ligand_target_matrix

Prior knowledge model of ligand-target regulatory potential (matrix with ligands in columns and targets in rows). See https://github.com/saeyslab/nichenetr.

contrasts_oi

String indicating the contrasts of interest (= which groups/conditions will be compared) for the differential expression and MultiNicheNet analysis. The examples below show how to write this string. Check the limma package manuals for more information about defining design matrices and contrasts for differential expression analysis.
To compare group A vs B: contrasts_oi = c("'A-B'")
To compare group A vs B & B vs A: contrasts_oi = c("'A-B','B-A'")
To compare group A vs B & A vs C & A vs D: contrasts_oi = c("'A-B','A-C','A-D'")
To compare group A vs B and C: contrasts_oi = c("'A-(B+C)/2'")
To compare group A vs B, C and D: contrasts_oi = c("'A-(B+C+D)/3'")
To compare group A vs B, C and D & B vs A, C and D: contrasts_oi = c("'A-(B+C+D)/3','B-(A+C+D)/3'")
Note that the groups A, B, ... should be present in the meta data column 'group_id'.
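
The string format above (each contrast wrapped in single quotes, all contrasts joined by commas inside one double-quoted string) is easy to mistype by hand. The following is a minimal sketch in base R that builds one-vs-rest contrasts for a set of groups; the helper make_one_vs_rest is hypothetical and not part of multinichenetr, and the group labels are toy values.

```r
# Hypothetical helper (not part of multinichenetr): build one-vs-rest
# contrasts such as "A-(B+C+D)/3" for every group in `groups`.
make_one_vs_rest <- function(groups) {
  vapply(groups, function(g) {
    others <- setdiff(groups, g)
    if (length(others) == 1) {
      # Two groups only: a plain pairwise contrast, e.g. "A-B"
      paste0(g, "-", others)
    } else {
      # Average of the remaining groups, e.g. "A-(B+C+D)/3"
      paste0(g, "-(", paste(others, collapse = "+"), ")/", length(others))
    }
  }, character(1), USE.NAMES = FALSE)
}

groups <- c("A", "B", "C", "D")  # toy group labels
contrasts <- make_one_vs_rest(groups)

# Wrap each contrast in single quotes and join everything into ONE string,
# matching the format shown in the examples above.
contrasts_oi <- paste0("'", contrasts, "'", collapse = ",")
print(contrasts_oi)
```

The intermediate vector contrasts (one element per contrast, without the extra quotes) is the form expected in the 'contrast' column of contrast_tbl.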

contrast_tbl

Data frame providing names for each of the contrasts in contrasts_oi in the 'contrast' column, and the corresponding group of interest in the 'group' column. Entries in the 'group' column should thus be present in the group_id column in the metadata. Example for contrasts_oi = c("'A-(B+C+D)/3','B-(A+C+D)/3'"): contrast_tbl = tibble(contrast = c("A-(B+C+D)/3","B-(A+C+D)/3"), group = c("A","B"))
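
As an illustration of how the 'contrast' and 'group' columns relate, the sketch below derives the table from a vector of contrasts. It assumes that the group of interest is always the term before the first "-" (true for the examples above, but not for group names that themselves contain a hyphen), and it uses a base R data.frame to stay dependency-free; the documented example uses tibble().

```r
# Contrasts written without the extra single quotes used in contrasts_oi
contrasts <- c("A-(B+C+D)/3", "B-(A+C+D)/3")

# Assumption: the group of interest is the term before the first "-"
group <- sub("-.*$", "", contrasts)

contrast_tbl <- data.frame(
  contrast = contrasts,
  group = group,
  stringsAsFactors = FALSE
)

# Sanity check: every group of interest must occur in the group_id column.
# `groups_in_metadata` is a toy stand-in for the unique values of that column.
groups_in_metadata <- c("A", "B", "C", "D")
stopifnot(all(contrast_tbl$group %in% groups_in_metadata))

print(contrast_tbl)
```

If a tibble is preferred, tibble::as_tibble(contrast_tbl) converts the result.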

senders_oi

Default NULL: all cell types will be considered as senders. To restrict the analysis to specific sender cell types, provide their names as a character vector.

receivers_oi

Default NULL: all cell types will be considered as receivers. To restrict the analysis to specific receiver cell types, provide their names as a character vector.

fraction_cutoff

Cutoff indicating the minimum fraction of cells of a cell type in a specific sample that are necessary to consider a gene (e.g. ligand/receptor) as expressed in a sample.

min_sample_prop

Minimum required number of samples in which a gene should be expressed in more than 'fraction_cutoff' of cells in that sample (per cell type). This number of samples is calculated as the 'min_sample_prop' fraction of the number of samples of the smallest group (after considering only samples with n_cells >= 'min_cells'). Default: 'min_sample_prop = 0.50'. Example: if there are 8 samples in the smallest group, there should be min_sample_prop*8 (= 4 in this example) samples with a sufficient fraction of expressing cells.
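
The arithmetic behind fraction_cutoff and min_sample_prop can be sketched with toy numbers. This is not the package's internal code: the per-sample fractions are invented, and the choices to count expressing samples across all groups and to use ">=" against the required number are assumptions made for illustration only.

```r
fraction_cutoff <- 0.05
min_sample_prop <- 0.50

# Toy design: group label per sample (8 samples in A, 10 in B),
# assumed to be the samples that already passed the min_cells filter.
sample_groups <- c(rep("A", 8), rep("B", 10))

# Required number of samples: min_sample_prop * size of the smallest group
n_smallest <- min(table(sample_groups))        # 8
min_n_samples <- min_sample_prop * n_smallest  # 4

# Toy values: fraction of cells (of one cell type) expressing one gene,
# one value per sample, in the same order as sample_groups.
frac_expressing <- c(0.10, 0.00, 0.07, 0.02, 0.20, 0.06, 0.01, 0.00,
                     0.00, 0.03, 0.09, 0.00, 0.01, 0.00, 0.02, 0.00, 0.04, 0.00)

# A sample counts when MORE than fraction_cutoff of its cells express the gene
n_expressing_samples <- sum(frac_expressing > fraction_cutoff)

# Assumption: the gene is kept when at least min_n_samples samples qualify
gene_is_expressed <- n_expressing_samples >= min_n_samples

print(c(required = min_n_samples, observed = n_expressing_samples))
print(gene_is_expressed)
```

With these toy values, 5 samples exceed the cutoff and 4 are required, so the gene would be treated as expressed in this cell type.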

scenario

Character vector indicating which prioritization weights should be used during the MultiNicheNet analysis. Currently 3 settings are implemented: "regular" (default), "lower_DE", and "no_frac_LR_expr". The setting "regular" is strongly recommended and gives each criterion equal weight. The setting "lower_DE" is recommended when your hypothesis is that the differential cell-cell communication (CCC) patterns in your data are less likely to be driven by DE (e.g. in cases of differential migration into a niche). It halves the weight for DE criteria and doubles the weight for ligand activity. The setting "no_frac_LR_expr" excludes the criterion "fraction of samples expressing the LR pair". This may be beneficial when there are few samples per group.

ligand_activity_down

Default: FALSE, downregulatory ligand activity is not considered for prioritization. TRUE: both up- and downregulatory activity are considered for prioritization.

assay_oi_pb

Indicates which information of the assay of interest should be used (counts, scaled data,...). Default: "counts". See 'muscat::aggregateData'.

fun_oi_pb

Indicates way of doing the pseudobulking. Default: "sum". See 'muscat::aggregateData'.

de_method_oi

Indicates the DE method that will be used after pseudobulking. Default: "edgeR". See 'muscat::pbDS'.

min_cells

Indicates the minimal number of cells that a sample should have to be considered in the DE analysis. Default: 10. See 'muscat::pbDS'.

logFC_threshold

For defining the gene set of interest for NicheNet ligand activity: what is the minimum logFC a gene should have to belong to this gene set? Default: 0.50.

p_val_threshold

For defining the gene set of interest for NicheNet ligand activity: what is the maximum p-value a gene should have to belong to this gene set? Default: 0.05.

p_val_adj

For defining the gene set of interest for NicheNet ligand activity: should we look at the p-value corrected for multiple testing? Default: FALSE.
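
The three arguments logFC_threshold, p_val_threshold and p_val_adj jointly define the gene set of interest. A minimal sketch of that filter on a toy DE table is shown below, using the default values from Usage. The column names (gene, logFC, p_val, p_adj) and the mirrored rule for downregulated genes are assumptions for illustration, not the package's internal representation.

```r
logFC_threshold <- 0.50
p_val_threshold <- 0.05
p_val_adj <- FALSE

# Toy DE results for one receiver cell type and one contrast
# (hypothetical column names).
de_table <- data.frame(
  gene  = c("G1", "G2", "G3", "G4"),
  logFC = c(1.20, 0.30, 0.80, -0.90),
  p_val = c(0.001, 0.010, 0.200, 0.002),
  p_adj = c(0.004, 0.040, 0.400, 0.008),
  stringsAsFactors = FALSE
)

# Use adjusted p-values only when p_val_adj = TRUE
p_col <- if (p_val_adj) "p_adj" else "p_val"
significant <- de_table[[p_col]] <= p_val_threshold

# Upregulated gene set: logFC at or above the threshold and significant
geneset_up <- de_table$gene[de_table$logFC >= logFC_threshold & significant]

# Assumption: the downregulated set mirrors the rule with a negative threshold
geneset_down <- de_table$gene[de_table$logFC <= -logFC_threshold & significant]

print(geneset_up)
print(geneset_down)
```

In this toy table G2 fails the logFC threshold and G3 fails the p-value threshold, so only G1 ends up in the upregulated set.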

empirical_pval

For defining the gene set of interest for NicheNet ligand activity, and for ranking DE ligands and receptors: should we use the normal p-values, or the p-values that are corrected by the empirical null procedure? The latter could be beneficial if p-value distribution histograms indicate potential problems in the model definition (e.g. not all relevant batches are selected). Default: TRUE.

top_n_target

For defining NicheNet ligand-target links: the number of top predicted target genes to consider per ligand. Default: 250. See 'nichenetr::get_weighted_ligand_target_links()'.

verbose

Indicates whether to print which steps of the pipeline are running. Default: FALSE.

n.cores

The number of cores used for parallel computation of the ligand activities per receiver cell type. Default: 1 - no parallel computation.

return_lr_prod_matrix

Indicate whether to calculate a senderLigand-receiverReceptor matrix, which could be used for unsupervised analysis of the cell-cell communication. Default FALSE. Setting to FALSE might be beneficial to avoid memory issues.

findMarkers

Indicate whether we should also calculate DE results with the classic scran::findMarkers approach. Default (recommended): FALSE. If TRUE: both pseudobulk-based and cell-level based DE results will be generated.

top_n_LR

Number of top LR pairs for which the correlation with target genes will be calculated. Default: 2500. To calculate the correlation for all expressed LR pairs, set this argument to NA.

Value

List containing information and output of the MultiNicheNet analysis:
celltype_info: contains the average expression value and fraction of expressing cells for each cell type - sample combination.
celltype_de: contains the output of the differential expression analysis.
sender_receiver_info: links the expression information of the ligand in the sender cell types to the expression of the receptor in the receiver cell types.
sender_receiver_de: links the differential expression information of the ligand in the sender cell types to the expression of the receptor in the receiver cell types.
ligand_activities_targets_DEgenes: contains the output of the NicheNet ligand activity analysis and the NicheNet ligand-target inference.
prioritization_tables: contains the tables with the final prioritization scores.
lr_prod_mat: matrix of the ligand-receptor expression product of the expressed senderLigand-receiverReceptor pairs.
grouping_tbl: data frame showing the group per sample.
lr_target_prior_cor: data frame showing the expression correlation between ligand-receptor pairs and DE genes, plus NicheNet regulatory potential scores indicating the amount of prior knowledge supporting an LR-target regulatory link.

Examples

## Not run: 
library(dplyr)
library(multinichenetr)

# Prior knowledge: ligand-receptor network with columns `ligand` and `receptor`
lr_network = readRDS(url("https://zenodo.org/record/3260758/files/lr_network.rds"))
lr_network = lr_network %>% dplyr::rename(ligand = from, receptor = to) %>% dplyr::distinct(ligand, receptor)

# Prior knowledge: ligand-target matrix (ligands in columns, targets in rows)
ligand_target_matrix = readRDS(url("https://zenodo.org/record/3260758/files/ligand_target_matrix.rds"))

# Names of the meta data columns of `sce`
sample_id = "tumor"
group_id = "pEMT"
celltype_id = "celltype"

# No batch or covariate correction
batches = NA
covariates = NA

# Contrasts of interest and the group of interest for each contrast
contrasts_oi = c("'High-Low','Low-High'")
contrast_tbl = tibble(contrast = c("High-Low","Low-High"), group = c("High","Low"))

# `sce` is assumed to be a SingleCellExperiment object available in the session
output = multi_nichenet_analysis(
     sce = sce, 
     celltype_id = celltype_id, 
     sample_id = sample_id, 
     group_id = group_id, 
     batches = batches,
     covariates = covariates,
     lr_network = lr_network, 
     ligand_target_matrix = ligand_target_matrix, 
     contrasts_oi = contrasts_oi, 
     contrast_tbl = contrast_tbl)

## End(Not run)


saeyslab/multinichenetr documentation built on Jan. 15, 2025, 7:55 p.m.