merge_clusters | R Documentation |
Join clusters representing the same expression pattern across cell types (redundant clusters). This function uses a metaclustering system (see details) and user-defined similarity thresholds that control the stringency of the merge process.
merge_clusters(
data,
isoform_col = NULL,
id_table,
cluster_list,
percentile_no = 10,
dynamic = FALSE,
method = c("percentile", "pearson", "spearman", "rho", "zi_kendall"),
height_cutoff = 0.2,
cutree_no = NULL,
...
)
data |
A data.frame or tibble object including isoforms as rows and cells as columns. Isoform IDs can be included as row names (data.frame) or as an additional column (tibble). |
isoform_col |
When a tibble is provided in |
id_table |
A data frame including two columns named |
cluster_list |
A list of character vectors, each containing the identifiers of the isoforms in a cluster. |
percentile_no |
Integer indicating the number of percentiles that will be used to summarize cell type expression via |
dynamic |
A logical. If |
method |
Character indicating a co-expression method to use for merging
similar clusters. Should be one of |
height_cutoff |
When |
cutree_no |
An integer indicating the desired number of groups to
merge clusters into. Supplied to |
... |
Additional arguments passed to |
During the isoform clustering process, it is generally useful to prioritize the reduction of within-cluster variability. This, however, can lead to obtaining a large number of small, redundant clusters. To mitigate this effect, acorde includes a step where clusters with high profile similarity can be merged using the correlation between their metatranscripts. A cluster's metatranscript is calculated as the mean of the percentile-summarized expression of all of the isoforms in that cluster. Then, co-expression values between metatranscripts are calculated and used to generate a distance matrix to group cluster profiles by similarity, a process that can be referred to as metaclustering.
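The metatranscript and distance steps described above can be sketched in base R. This is a minimal illustration and not the package's internal code: the toy matrix `expr`, the isoform and cluster names, and the use of `1 - correlation` as the distance are all assumptions made for the example.

```r
# Toy percentile-summarized expression (hypothetical values):
# 6 isoforms as rows, 5 summary columns
set.seed(1)
expr <- matrix(runif(30), nrow = 6,
               dimnames = list(paste0("iso", 1:6), paste0("p", 1:5)))

# Hypothetical clusters, in the same shape as the cluster_list argument
cluster_list <- list(c1 = c("iso1", "iso2"),
                     c2 = c("iso3", "iso4"),
                     c3 = c("iso5", "iso6"))

# Metatranscript: mean profile of all isoforms in each cluster
# (sapply returns columns per cluster, so transpose to one row per cluster)
metatranscripts <- t(sapply(cluster_list, function(ids) {
  colMeans(expr[ids, , drop = FALSE])
}))

# Co-expression between metatranscripts, converted to a distance
# (1 - correlation is an assumption; the package may use another transform)
cor_mat <- cor(t(metatranscripts), method = "pearson")
dist_mat <- as.dist(1 - cor_mat)
```

Here `metatranscripts` has one row per cluster, and `dist_mat` is the object that a hierarchical clustering function such as `hclust` would take as input.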
By default, the metaclustering process is done using traditional hierarchical clustering via hclust, which requires the definition of either a height cutoff (height_cutoff parameter) or a number of clusters to obtain (cutree_no).
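To illustrate the two ways of cutting the tree, the sketch below runs `hclust` and `cutree` on three hypothetical metatranscripts. The profiles, the `1 - correlation` distance, and the default linkage are assumptions chosen for the example; the package may configure these differently.

```r
# Hypothetical metatranscripts: c1 and c2 are nearly identical, c3 is reversed
metatranscripts <- rbind(c1 = c(1.0, 2.0, 3.0, 4.0, 5.0),
                         c2 = c(1.1, 2.0, 3.2, 3.9, 5.1),
                         c3 = c(5.0, 4.0, 3.0, 2.0, 1.0))

# Distance between cluster profiles (assumed here to be 1 - Pearson correlation)
dist_mat <- as.dist(1 - cor(t(metatranscripts)))

# Hierarchical clustering of the cluster profiles (default linkage)
tree <- hclust(dist_mat)

# Option 1: cut at a height, analogous to height_cutoff = 0.2
groups_by_height <- cutree(tree, h = 0.2)

# Option 2: request a fixed number of groups, analogous to cutree_no = 2
groups_by_number <- cutree(tree, k = 2)
```

In this toy case both options place c1 and c2 in the same group and leave c3 on its own, so c1 and c2 would be candidates for merging.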
Available co-expression metrics (selected via the method argument) include:
percentile: percentile correlations computed using percentile_cor.
pearson: Pearson correlation computed using cor.
spearman: Spearman correlation computed using cor.
zi_kendall: zero-inflated Kendall correlation computed using the dismay function.
rho: rho proportionality metric computed using the dismay function.
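The choice of metric can change which clusters appear similar. As a small illustration using only the base `cor` function (the `percentile_cor` and `dismay` options are not shown), the two hypothetical profiles below are related monotonically but not linearly, so Spearman and Pearson correlations differ.

```r
# Two hypothetical profiles with a monotonic, non-linear relationship
profile_a <- c(1, 2, 3, 4, 5)
profile_b <- profile_a^3

# Pearson measures linear association; Spearman measures rank agreement
pearson_r  <- cor(profile_a, profile_b, method = "pearson")
spearman_r <- cor(profile_a, profile_b, method = "spearman")
```

Here `spearman_r` equals 1 because the ranks agree perfectly, while `pearson_r` is lower, so a fixed similarity threshold can be more or less permissive depending on the metric.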
Alternatively, users may choose to perform metatranscript clustering dynamically using the dynamicTreeCut package by setting dynamic = TRUE. In this case, additional parameters will need to be supplied to the cutreeHybrid function via the ... argument. Note that minClusterSize = 1 is set internally to allow clusters to remain unmerged if no redundancies with the profiles of other clusters are found.
A named list containing two elements:
merged_groups: a list detailing merge decisions, in which each element contains the identifiers of the clusters that were merged together.
clusters: a list of character vectors, containing the identifiers of isoforms included in each of the resulting clusters.
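The shape of the returned list can be illustrated with a small base R sketch. It is not the package's implementation: the cluster names, isoform names, and the `groups` vector (standing in for the group labels produced by the metaclustering step) are hypothetical.

```r
# Hypothetical input clusters and hypothetical group labels from metaclustering
cluster_list <- list(c1 = c("iso1", "iso2"),
                     c2 = c("iso3"),
                     c3 = c("iso4", "iso5"))
groups <- c(c1 = 1, c2 = 1, c3 = 2)

# merged_groups: identifiers of the clusters that fall in each group
merged_groups <- split(names(groups), groups)

# clusters: isoform identifiers pooled across the clusters in each group
clusters <- lapply(merged_groups, function(ids) {
  unlist(cluster_list[ids], use.names = FALSE)
})

result <- list(merged_groups = merged_groups, clusters = clusters)
```

In this toy result, the first element of `merged_groups` lists c1 and c2, and the matching element of `clusters` contains the three isoforms pooled from both.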