View source: R/compare_nhoods.R
calcNhoodSim | R Documentation |
Runs the neighbourhood comparison pipeline
calcNhoodSim( r_milo, m_milo, orthologs, sim_preprocessing = "gene_spec", sim_measure = "pearson", hvg_join_type = "intersection", r_exclude = NULL, m_exclude = NULL, export_dir = NULL, verbose = TRUE, ... )
r_milo |
The rabbit Milo object |
m_milo |
The mouse Milo object |
orthologs |
A DataFrame of rabbit and mouse one-to-one orthologs. |
sim_preprocessing |
Used to specify whether to compute the gene specificity
before computing similarities between neighbourhoods. Can be either
|
sim_measure |
The similarity measure to compare neighbourhoods. Can be
one of |
hvg_join_type |
Specifies how to combine gene features from the two species. Can
be either |
r_exclude |
A vector of rabbit genes to exclude from feature selection. |
m_exclude |
A vector of mouse genes to exclude from feature selection. |
export_dir |
A string path indicating where to export output files. |
verbose |
A logical scalar indicating whether progress updates should be printed to screen. |
... |
Additional arguments to pass to specific methods.
See |
This function implements the neighbourhood comparison pipeline described in Ton et al. (2022).
The pipeline consists of two main steps - selecting features and computing neighbourhood similarities.
Feature selection
First, to prevent uninformative genes from driving similarities between species,
a subset of genes is chosen independently for each species by calling selectNhoodFeatures.
By default this uses scran::getScranHVGs; however, it is also possible to
specify a set of predefined features by passing a list of genes to the hvg_selection argument.
These genes are then filtered to exclude any that are listed in r_exclude and m_exclude.
Only genes that are one-to-one orthologs, as specified in orthologs, are retained.
The features from each species are then combined according to hvg_join_type. This provides a
common gene set with which to compute similarities between neighbourhoods.
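The filtering and joining steps above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the gene names, the column names of the toy ortholog table and the variable names are all made up, and only the default hvg_join_type = "intersection" is shown.

```r
# Toy per-species feature sets (hypothetical gene names).
r_hvgs <- c("SOX2", "GATA1", "HBB", "TBXT", "XIST")
m_hvgs <- c("Sox2", "Gata1", "Hbb-y", "Nanog")
r_exclude <- c("XIST")
m_exclude <- character(0)

# Toy one-to-one ortholog table; the column names are an assumption.
orthologs <- data.frame(
  rabbit = c("SOX2", "GATA1", "TBXT", "NANOG"),
  mouse  = c("Sox2", "Gata1", "T", "Nanog"),
  stringsAsFactors = FALSE
)

# 1. Drop excluded genes from each species' feature set.
r_feats <- setdiff(r_hvgs, r_exclude)
m_feats <- setdiff(m_hvgs, m_exclude)

# 2. Flag ortholog pairs whose rabbit or mouse member was selected.
#    Genes without a one-to-one ortholog (e.g. "HBB") drop out here.
r_sel <- orthologs$rabbit %in% r_feats
m_sel <- orthologs$mouse %in% m_feats

# 3. "intersection" join: keep pairs selected in both species.
shared <- orthologs[r_sel & m_sel, , drop = FALSE]
print(shared)
```

In this toy example only the SOX2/Sox2 and GATA1/Gata1 pairs survive, because TBXT and Nanog were each selected in one species only.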
Computing neighbourhood similarities
Using these selected features, a mean expression profile is then computed for each neighbourhood
using calcNhoodMean. These expression values are extracted from the logcounts assay
of the rabbit and mouse Milo objects, which must contain normalised log counts.
Prior to computing the similarity between neighbourhoods, an additional normalisation
step can be performed using the sim_preprocessing = "gene_spec" parameter option.
Specifically, the gene specificity (s^i_g) of each gene g is computed for each neighbourhood i, given by
s^i_g = N*g_i / (g_1 + g_2 + ... + g_N)
where g_x represents the mean expression of gene g in neighbourhood x = 1,...,N. This can be used to account for differences in absolute values between datasets.
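As a worked illustration of the formula, the sketch below computes gene specificity for a small toy matrix of neighbourhood means. The function name gene_spec and the input values are hypothetical; the package's own calcGeneSpec may differ in interface and in how it handles edge cases.

```r
# Rows are genes, columns are neighbourhoods (toy mean expression values).
nhood_means <- matrix(
  c(1, 2, 3,
    4, 4, 4,
    0, 0, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("nh1", "nh2", "nh3"))
)

# s^i_g = N * g_i / (g_1 + ... + g_N), applied to every gene (row).
# Dividing a matrix by a vector of length nrow(x) recycles down the
# columns, so each row is divided by its own row sum.
gene_spec <- function(x) {
  n <- ncol(x)
  n * x / rowSums(x)
}

spec <- gene_spec(nhood_means)
print(spec)
```

Each gene's specificity values average to 1 across neighbourhoods, which is what removes differences in absolute expression level between datasets. A gene with zero expression in every neighbourhood would yield NaN in this sketch.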
Following this, the similarity between neighbourhoods is computed using the correlation
measure specified by sim_measure.
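The all-vs-all comparison can be illustrated with stats::cor, which correlates the columns of one matrix against the columns of another. This is a sketch only: the matrices are toy data, the variable names are made up, and it assumes the rows of both matrices are already aligned to the same ortholog pairs. Only the default "pearson" measure is shown.

```r
genes <- paste0("gene", 1:4)

# Toy per-neighbourhood profiles: genes in rows, neighbourhoods in columns.
r_profiles <- matrix(
  c(1, 2, 3, 4,
    4, 3, 2, 1),
  nrow = 4, dimnames = list(genes, c("r_nh1", "r_nh2"))
)
m_profiles <- matrix(
  c(2, 4, 6, 8,
    1, 1, 2, 2,
    8, 6, 4, 2),
  nrow = 4, dimnames = list(genes, c("m_nh1", "m_nh2", "m_nh3"))
)

# Rabbit neighbourhoods end up in rows, mouse neighbourhoods in columns.
nhood_sim <- cor(r_profiles, m_profiles, method = "pearson")
print(nhood_sim)
```

The result has one row per rabbit neighbourhood and one column per mouse neighbourhood, matching the orientation of the nhood_sim matrix described in the return value.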
The matrix of rabbit vs mouse neighbourhood similarity scores, as well as mean expression
values (or gene specificity values if sim_preprocessing = "gene_spec"), are exported
to the directory specified by export_dir.
A named list is returned containing:
r_vals: A DataFrame of mean expression values (or gene specificity values) for each rabbit gene across all rabbit neighbourhoods.
m_vals: A DataFrame of mean expression values (or gene specificity values) for each mouse gene across all mouse neighbourhoods.
nhood_sim: An all vs all matrix of similarities between rabbit (rows) and mouse (columns) neighbourhoods.
Daniel Keitley
Ton M.L., Keitley D. et al. (2022). Rabbit Development as a Model for Single Cell Comparative Genomics. Manuscript in submission.
selectNhoodFeatures, calcNhoodMean, calcGeneSpec and exportNhoodSim for more info on
specific steps of the pipeline.
See an example usage here.