cluster_sim_spectrum: Calculate Cluster Similarity Spectrum

View source: R/generics.r

cluster_sim_spectrum.default    R Documentation


Description

Calculate the Cluster Similarity Spectrum (CSS) from the expression data and the cell labels that distinguish samples. The cells of each sample are clustered separately; the similarities of every cell to those clusters are then calculated and normalized.
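
The idea in the description can be illustrated with a minimal base-R sketch. It is not the package's implementation: the simulated data, the helper name css_sketch, and the use of k-means as a stand-in for the kNN-graph clustering are all assumptions made for illustration. The sketch clusters each sample on its own, correlates every cell with the average profile of each cluster, and z-transforms the correlations within each sample (analogous to spectrum_type = "corr_ztransform").

```r
set.seed(1)

# Simulated data (hypothetical): 40 genes x 120 cells from two samples.
n_genes <- 40
n_cells <- 120
expr <- matrix(rnorm(n_genes * n_cells), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("cell", seq_len(n_cells))))
labels <- rep(c("sampleA", "sampleB"), each = n_cells / 2)

# Three cell groups per sample, two of them with a shifted gene block,
# so that each sample contains clusters worth finding.
groups <- rep(1:3, length.out = n_cells)
expr[1:13, groups == 1] <- expr[1:13, groups == 1] + 3
expr[14:26, groups == 2] <- expr[14:26, groups == 2] + 3

css_sketch <- function(expr, labels, n_clusters = 3) {
  spectra <- lapply(unique(labels), function(s) {
    idx <- which(labels == s)
    # Stand-in clustering (assumption): k-means on the cells of this sample only.
    cl <- kmeans(t(expr[, idx]), centers = n_clusters, nstart = 5)$cluster
    # Average expression profile of each cluster (genes x clusters).
    profiles <- sapply(sort(unique(cl)), function(i) {
      rowMeans(expr[, idx[cl == i], drop = FALSE])
    })
    # Spearman correlation of every cell (from all samples) to each profile.
    sim <- cor(expr, profiles, method = "spearman")
    # z-transform per cell across the clusters of this sample.
    t(scale(t(sim)))
  })
  # Concatenate the per-sample spectra: cells x (total number of clusters).
  do.call(cbind, spectra)
}

css <- css_sketch(expr, labels)
dim(css)
```

Each row of css describes one cell by its normalized similarity to every cluster of every sample, which is the representation the description refers to.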

Usage

## Default S3 method:
cluster_sim_spectrum(
  object,
  labels,
  cluster_labels = NULL,
  dr = NULL,
  dr_input = NULL,
  num_pcs_compute = 50,
  num_pcs_use = 20,
  redo_pca = FALSE,
  k = 20,
  min_batch_size = k * 2,
  ...,
  cluster_method = c("Seurat", "walktrap"),
  cluster_resolution = 0.6,
  min_cluster_num = 3,
  spectrum_type = c("corr_ztransform", "corr_kernel", "corr_raw", "nnet", "lasso"),
  corr_method = c("spearman", "pearson"),
  use_fast_rank = TRUE,
  lambda = 50,
  threads = 1,
  train_on = c("raw", "pseudo", "rand"),
  downsample_ratio = 1/10,
  k_pseudo = 10,
  logscale_likelihood = F,
  merge_spectrums = FALSE,
  merge_height_prop = 1/10,
  spectrum_dist_type = c("pearson", "euclidean"),
  spectrum_cl_method = "complete",
  return_css_only = T,
  verbose = T
)

## S3 method for class 'Seurat'
cluster_sim_spectrum(
  object,
  label_tag,
  cluster_col = NULL,
  var_genes = NULL,
  use_scale = F,
  use_dr = "pca",
  dims_use = 1:20,
  redo_pca = FALSE,
  redo_pca_with_data = FALSE,
  k = 20,
  min_batch_size = k * 2,
  ...,
  cluster_resolution = 0.6,
  spectrum_type = c("corr_ztransform", "corr_kernel", "corr_raw", "nnet", "lasso"),
  corr_method = c("spearman", "pearson"),
  lambda = 50,
  threads = 1,
  train_on = c("raw", "pseudo", "rand"),
  downsample_ratio = 1/10,
  k_pseudo = 10,
  logscale_likelihood = F,
  merge_spectrums = FALSE,
  merge_height_prop = 1/10,
  spectrum_dist_type = c("pearson", "euclidean"),
  spectrum_cl_method = "complete",
  reduction.name = "css",
  reduction.key = "CSS_",
  return_seuratObj = T,
  verbose = T
)

cluster_sim_spectrum(object, ...)
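
Two conventions in the signatures above may be worth a short illustration. A default such as min_batch_size = k * 2 is evaluated lazily, so it follows whatever k the caller supplies. A default written as a character vector of choices usually means that the first element is used unless another is given. The toy function below is hypothetical and only mimics the style of the signatures; that cluster_sim_spectrum resolves its choice vectors with match.arg() is an assumption, not something stated on this page.

```r
# Hypothetical toy function that mimics the style of the signatures above.
toy_signature <- function(k = 20,
                          min_batch_size = k * 2,
                          spectrum_type = c("corr_ztransform", "corr_kernel",
                                            "corr_raw", "nnet", "lasso")) {
  # match.arg() selects the first choice when the argument is left at its
  # default and raises an error for values that are not among the choices.
  spectrum_type <- match.arg(spectrum_type)
  list(k = k, min_batch_size = min_batch_size, spectrum_type = spectrum_type)
}

str(toy_signature())                              # all defaults
str(toy_signature(k = 15))                        # min_batch_size follows k
str(toy_signature(spectrum_type = "corr_kernel")) # explicit choice
```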

Arguments

object

An object: an expression matrix for the default method, or a Seurat object for the Seurat method

labels

Labels specifying different samples

cluster_labels

Cluster labels to use instead of clustering each sample separately. If NULL, clustering is done per sample

dr

Dimension reduction matrix used for clustering. If NULL, truncated PCA is run on the expression matrix to obtain one

dr_input

Alternative expression matrix used for dimension reduction. Ignored if dr is specified

num_pcs_compute

Number of PCs to calculate. Ignored if dr is specified

num_pcs_use

Number of PCs used for clustering

redo_pca

If TRUE, PCA is rerun for each sample separately for clustering

k

Number of nearest neighbors in the kNN network used for clustering

min_batch_size

Minimum number of cells a batch must contain to be clustered and used to generate references

...

Other parameters passed to build_knn_graph

cluster_method

Method used to cluster the kNN network. The default, "Seurat", calls FindClusters in Seurat with the Louvain method. The alternative, "walktrap", uses the walktrap community identification algorithm in igraph

cluster_resolution

Resolution of clustering. Ignored if cluster_method is not "Seurat"

min_cluster_num

Minimum number of clusters a sample must have to be included in the reference profile (default=3)

spectrum_type

Method used to normalize similarities. "corr_ztransform" applies a z-transform; "corr_kernel" applies a correlation kernel to convert similarities to likelihoods; "corr_raw" applies no normalization; "nnet" and "lasso" build a probabilistic prediction model on the data and estimate likelihoods

corr_method

Type of correlation. Ignored if spectrum_type is "nnet" or "lasso"

use_fast_rank

If TRUE and the presto package is available, its rank_matrix function is used to rank the sparse matrix

lambda

Lambda in the correlation kernel

threads

Number of threads to use. Only useful if spectrum_type is "lasso"

train_on

Type of data used to train the likelihood model. Only useful if spectrum_type is "nnet" or "lasso"

downsample_ratio

Downsample rate. Only useful if train_on is "pseudo" or "rand"

k_pseudo

Number of nearest neighbors used to construct pseudocells. Only useful if train_on is "pseudo"

logscale_likelihood

If TRUE, estimated likelihoods are log-transformed. Ignored if spectrum_type is "corr_ztransform" or "corr_raw"

merge_height_prop

Height at which the dendrogram is cut, given as a proportion. Ignored if merge_spectrums is FALSE

spectrum_dist_type

Type of distance used to construct the dendrogram of spectrums. Ignored if merge_spectrums is FALSE

spectrum_cl_method

Method of hierarchical clustering used to construct the dendrogram of spectrums. Ignored if merge_spectrums is FALSE

return_css_only

If TRUE, only the calculated CSS matrix is returned. If FALSE, the additional information needed to recalculate the spectrum is returned as well

verbose

If TRUE, progress messages are printed

label_tag

Name of the column in the meta.data slot that holds the sample labels

cluster_col

Name of the column in the meta.data slot that holds the cluster labels

var_genes

Genes used for similarity calculation. If NULL, predefined variable features are used

use_scale

If TRUE, the scale.data slot rather than the data slot is used for similarity calculation

use_dr

Name of reduction used for clustering

dims_use

Dimensions in the reduction used for clustering

redo_pca_with_data

If TRUE, the data slot is used to redo PCA for each sample. Ignored if redo_pca is FALSE

reduction.name

Reduction name of the CSS representation in the returned Seurat object

reduction.key

Reduction key of the CSS representation in the returned Seurat object

return_seuratObj

If TRUE, a Seurat object with CSS added as a dimension reduction representation is returned. Otherwise, a list with the CSS matrix and the calculation model is returned

merge_spectrums

If TRUE, similarity spectrums that closely resemble each other are averaged
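
The arguments that control spectrum merging (merge_height_prop, spectrum_dist_type, spectrum_cl_method) can be read together as a hierarchical clustering of the spectrum columns followed by averaging within each group. The base-R sketch below shows one way this could work. The helper merge_sketch and the simulated matrix are hypothetical, and taking the cut height as a proportion of the tallest merge in the dendrogram is an assumption rather than documented behavior.

```r
set.seed(2)

# Hypothetical CSS-like matrix: 100 cells x 6 spectrum columns, in which
# columns 1-2 and columns 3-4 are near-duplicates of each other.
base <- matrix(rnorm(100 * 4), nrow = 100)
css <- cbind(base[, 1], base[, 1] + rnorm(100, sd = 0.05),
             base[, 2], base[, 2] + rnorm(100, sd = 0.05),
             base[, 3], base[, 4])

merge_sketch <- function(css, merge_height_prop = 1/10,
                         spectrum_dist_type = c("pearson", "euclidean"),
                         spectrum_cl_method = "complete") {
  spectrum_dist_type <- match.arg(spectrum_dist_type)
  d <- if (spectrum_dist_type == "pearson") {
    as.dist(1 - cor(css))   # correlation distance between spectrum columns
  } else {
    dist(t(css))            # Euclidean distance between spectrum columns
  }
  tree <- hclust(d, method = spectrum_cl_method)
  # Assumption: the cut height is a proportion of the tallest merge.
  groups <- cutree(tree, h = max(tree$height) * merge_height_prop)
  # Average the columns that fall into the same group.
  sapply(sort(unique(groups)), function(g) {
    rowMeans(css[, groups == g, drop = FALSE])
  })
}

merged <- merge_sketch(css)
ncol(merged)
```

In this toy case the two near-duplicate pairs are each collapsed into a single averaged column, while the unrelated columns are kept as they are.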


quadbiolab/simspec documentation built on March 8, 2024, 11:59 p.m.