expand_clusters: Expand clusters with unclustered isoforms

View source: R/cluster_refinement.R

expand_clusters R Documentation

Description

Assign unclustered isoforms to previously generated clusters based on their correlation with each cluster's average expression profile.

Usage

expand_clusters(
  data,
  isoform_col = NULL,
  id_table,
  cluster_list,
  unclustered,
  percentile_no = 10,
  force_expand = TRUE,
  expand_threshold = NULL,
  allow_negative_cors = TRUE,
  method = c("percentile", "pearson", "spearman", "rho", "zi_kendall")
)

Arguments

data

A data.frame or tibble object including isoforms as rows and cells as columns. Isoform IDs can be included as row names (data.frame) or as an additional column (tibble).

isoform_col

When a tibble is provided in data, a character value indicating the name of the column in which isoform IDs are specified.

id_table

A data frame including two columns named cell and cell_type, in which correspondence between cell ID and cell type should be provided. The number of rows should be equal to the total number of cell columns in data, and the order of the cell column should match column (i.e. cell) order in data.

cluster_list

A list of character vectors, each containing the identifiers of the isoforms in a cluster.

unclustered

A character vector containing the identifiers of unclustered isoforms.
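The input requirements above can be illustrated with a small toy setup. This is only a sketch in base R: all object contents (cell names, isoform names, counts, cluster memberships) are made up for illustration, and the final checks simply mirror the constraints described for data, id_table, cluster_list and unclustered.

```r
# Toy inputs; every name and value below is illustrative, not from acorde.
set.seed(1)
cells <- paste0("cell", 1:6)
isoforms <- paste0("iso", 1:5)

# Isoforms as rows (IDs stored as row names), cells as columns.
data <- as.data.frame(matrix(rpois(30, lambda = 5), nrow = 5,
                             dimnames = list(isoforms, cells)))

# One row per cell column, in the same order as the columns of `data`.
id_table <- data.frame(cell = cells,
                       cell_type = rep(c("typeA", "typeB"), each = 3),
                       stringsAsFactors = FALSE)

# Previously generated clusters and the isoforms left out of them.
cluster_list <- list(c("iso1", "iso2"), c("iso3"))
unclustered <- c("iso4", "iso5")

# Sanity checks mirroring the documented requirements.
stopifnot(nrow(id_table) == ncol(data),
          identical(id_table$cell, colnames(data)),
          all(c(unlist(cluster_list), unclustered) %in% rownames(data)))
```

Running checks of this kind before calling expand_clusters may help catch a mismatch between the cell order in id_table and the column order in data.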

percentile_no

Integer indicating the number of percentiles that will be used to summarize cell type expression via percentile_expr. Should always be higher than 4 (quartiles) and lower than 100 (percentiles). Defaults to 10.

force_expand

Logical. When TRUE, forced expansion is enabled, and unclustered isoforms are assigned to clusters based on the maximum reported co-expression. If FALSE, a co-expression threshold needs to be specified via expand_threshold.

expand_threshold

A numeric value defining the minimum correlation required to assign an unclustered isoform to a cluster. Needs to be specified when force_expand = FALSE.

method

Character indicating the co-expression method used to compare unclustered isoforms with cluster metatranscripts. Should be one of percentile, pearson, spearman, zi_kendall, rho (see details). Percentile correlation is used by default.

allow_negative_cors

Logical. If set to FALSE, negative correlations will not be considered. Defaults to TRUE.

Details

Correlation-based cluster expansion first requires cluster metatranscripts to be calculated. A cluster's metatranscript is calculated as the mean of the percentile-summarized expression of all of the isoforms in that cluster. Next, co-expression values between the metatranscripts and the unclustered isoforms are computed using the similarity metric specified in method.
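As a rough illustration of the metatranscript calculation, the sketch below summarizes each isoform by a fixed number of quantiles per cell type and then averages the summarized profiles of the isoforms in one cluster. The helper summarize_percentiles is hypothetical and only stands in for acorde's percentile_expr, whose exact behaviour is not described here; the evenly spaced quantile grid is likewise an assumption.

```r
# Hypothetical helper (NOT acorde's percentile_expr): for one isoform,
# compute `n` evenly spaced quantiles of expression within each cell type
# and concatenate them into a single profile.
summarize_percentiles <- function(expr, cell_type, n = 10) {
  probs <- seq(0, 1, length.out = n)
  unlist(lapply(split(expr, cell_type), quantile,
                probs = probs, names = FALSE))
}

# Toy expression matrix: 3 isoforms (rows) by 40 cells (columns).
set.seed(1)
cell_type <- rep(c("typeA", "typeB"), each = 20)
expr <- matrix(rpois(3 * 40, lambda = 5), nrow = 3,
               dimnames = list(c("iso1", "iso2", "iso3"), NULL))

# One summarized profile per isoform: rows are isoforms,
# columns are n quantiles for each of the two cell types.
profiles <- t(apply(expr, 1, summarize_percentiles,
                    cell_type = cell_type, n = 10))

# Metatranscript of a cluster: mean profile across its member isoforms.
cluster <- c("iso1", "iso2")
metatranscript <- colMeans(profiles[cluster, , drop = FALSE])
```

Each metatranscript then has the same length as a single summarized isoform profile, so it can be compared directly with the profiles of unclustered isoforms.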

Available co-expression metrics (selected via the method argument) include:

  1. percentile: percentile correlations computed using percentile_cor.

  2. pearson: Pearson correlation computed using cor.

  3. spearman: Spearman correlation computed using cor.

  4. zi_kendall: zero-inflated Kendall correlation computed using the dismay function.

  5. rho: rho proportionality metric computed using the dismay function.
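For the two metrics that rely on cor, the comparison between unclustered isoforms and metatranscripts might look like the sketch below. The profile matrices are random placeholders rather than real summarized expression, and the percentile, zi_kendall and rho metrics are not shown because they depend on functions outside base R.

```r
set.seed(2)
# Placeholder profiles: rows are profile positions, columns are
# metatranscripts or unclustered isoforms (names are illustrative).
metatranscripts <- matrix(rnorm(40), nrow = 20,
                          dimnames = list(NULL, c("cluster1", "cluster2")))
iso_profiles <- matrix(rnorm(60), nrow = 20,
                       dimnames = list(NULL, c("iso4", "iso5", "iso6")))

# cor() correlates the columns of its first argument with the columns
# of its second, giving an isoform-by-cluster matrix.
pearson  <- cor(iso_profiles, metatranscripts, method = "pearson")
spearman <- cor(iso_profiles, metatranscripts, method = "spearman")
```

Both results are 3 x 2 matrices with unclustered isoforms as rows and clusters as columns, which is a convenient shape for picking the best-matching cluster per isoform.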

Cluster expansion can be forced or threshold-based, depending on the value provided to force_expand (TRUE or FALSE, respectively). If forced, each unclustered isoform is assigned to the cluster whose metatranscript yields the highest co-expression with the isoform's expression, regardless of the exact co-expression value. Conversely, when threshold-based, only isoforms showing co-expression above the user-defined threshold (expand_threshold) with at least one metatranscript will be assigned to clusters. In this case, if several candidate clusters have above-threshold co-expression values, the maximally correlated cluster is selected as the best match.
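The difference between the two expansion modes can be sketched on a small, made-up co-expression matrix. This is not acorde's implementation: the values are invented, and the strict comparison (greater than, rather than greater than or equal to, the threshold) is an assumption based on the wording "above".

```r
# Made-up co-expression values: unclustered isoforms (rows) versus
# cluster metatranscripts (columns).
cors <- matrix(c( 0.9, 0.2,
                  0.4, 0.5,
                 -0.3, 0.1),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("iso4", "iso5", "iso6"),
                               c("cluster1", "cluster2")))

# Best-matching cluster and its co-expression value for each isoform.
best_cluster <- colnames(cors)[max.col(cors, ties.method = "first")]
best_value <- apply(cors, 1, max)

# Forced expansion: every isoform joins its best-matching cluster.
forced <- split(rownames(cors), best_cluster)

# Threshold-based expansion: assign only isoforms whose best match
# exceeds the threshold; the rest remain unassigned.
expand_threshold <- 0.45
keep <- best_value > expand_threshold
thresholded <- split(rownames(cors)[keep], best_cluster[keep])
unassigned <- rownames(cors)[!keep]
```

In this toy case forced expansion places iso6 in cluster2 despite a co-expression of only 0.1, whereas the threshold-based mode leaves it unassigned.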

Value

A list containing the expanded clusters, where each element includes the identifiers of the isoforms assigned to that cluster. If force_expand = FALSE, the first element of the list will contain the identifiers of isoforms that remained unassigned.
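When force_expand = FALSE, the unassigned isoforms can be separated from the expanded clusters by position. The list below is a hand-written stand-in for a return value, with illustrative isoform IDs; whether the real list elements are named is not stated here, so the sketch relies only on element order.

```r
# Hand-written stand-in for a result obtained with force_expand = FALSE:
# first element = unassigned isoforms, remaining elements = clusters.
result <- list(c("iso6"),
               c("iso1", "iso2", "iso4"),
               c("iso3", "iso5"))

unassigned <- result[[1]]   # isoforms that remained unassigned
expanded   <- result[-1]    # expanded clusters only
```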

References

\insertRef{Skinnider2019}{acorde}

\insertRef{Venables2002}{acorde}


ConesaLab/acorde documentation built on Feb. 25, 2024, 4:16 a.m.