findClusters: Find Clusters


Description

Search for clusters in the scCNA data.


Usage

findClusters(
  scCNA,
  embedding = "umap",
  ncomponents = 2,
  method = c("hdbscan", "leiden", "louvain"),
  k_superclones = NULL,
  k_subclones = NULL,
  seed = 17
)



Arguments

scCNA: scCNA object.


embedding: String with the name of the reducedDim to pull data from.


ncomponents: An integer with the number of components (dimensions) to use from the embedding.


method: A string with the method used for clustering.


k_superclones: A numeric k-nearest-neighbor value. Used to find the superclones.


k_subclones: A numeric k-nearest-neighbor value. Used to find the subclones.


seed: A numeric passed on to pseudo-random dependent functions.
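As an illustration of how the arguments fit together, the call below spells out every argument instead of relying on the defaults. It is a minimal sketch: the values chosen for k_superclones and k_subclones are arbitrary placeholders, not recommendations, and it assumes the copykit package and its example dataset are installed.

```r
library(copykit)

# Filtered example dataset shipped with CopyKit.
copykit_obj <- copykit_example_filtered()

copykit_obj <- findClusters(
  copykit_obj,
  embedding = "umap",    # name of the reducedDim to cluster on
  ncomponents = 2,       # number of embedding dimensions to use
  method = "hdbscan",    # one of "hdbscan", "leiden", "louvain"
  k_superclones = 25,    # illustrative value only (assumption)
  k_subclones = 10,      # illustrative value only (assumption)
  seed = 17
)
```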


Details

findClusters uses the reduced-dimension embedding produced by runUmap to cluster cells at two levels, referred to here as superclones and subclones.

To find superclones, findClusters builds a graph representation of the embedding with a shared nearest neighbor (SNN) algorithm, makeSNNGraph. The connected components of this graph are then extracted; they generally represent high-level structures that share large, lineage-defining copy number events.

At a finer resolution, CopyKit can also detect subclones, i.e. groups of cells that contain a unique copy number event per cluster. The UMAP embedding is again used as the pre-processing step, this time for density-based clustering with hdbscan. Alternatively, network clustering algorithms such as the Leiden algorithm (leiden_find_partition) can be applied on top of the SNN graph.
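The two clustering levels can be sketched outside of CopyKit on a toy embedding. This is not the findClusters implementation: it assumes the bluster, igraph and dbscan packages are installed, uses simulated points in place of a real UMAP embedding, and the mapping of k to `k` in makeSNNGraph and to `minPts` in hdbscan is an assumption made for illustration.

```r
set.seed(17)

# Toy stand-in for a two-component UMAP embedding:
# two well-separated groups of 50 cells each.
emb <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 10, sd = 0.3), ncol = 2)
)

# Superclone-like level: build an SNN graph and take its connected components.
snn_graph <- bluster::makeSNNGraph(emb, k = 10)
superclones <- paste0("s", igraph::components(snn_graph)$membership)

# Subclone-like level: density-based clustering on the same embedding.
# hdbscan labels noise points with cluster 0.
hdb <- dbscan::hdbscan(emb, minPts = 10)
subclones <- paste0("c", hdb$cluster)

table(superclones)
table(subclones)
```

Because cells in different groups share no nearest neighbors, no graph edge links the two groups and they end up in different connected components.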

  • hdbscan: hdbscan is an outlier-aware clustering algorithm. Because extensive filtering of the dataset can be applied before clustering with findOutliers, any cell that hdbscan classifies as an outlier is assigned to the same cluster as its nearest non-outlier neighbor according to Euclidean distance.
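The outlier handling described above can be illustrated in base R. The helper `reassign_outliers` is hypothetical and written only for this sketch; it is not part of CopyKit, and it assumes the dbscan convention in which outliers carry the label 0.

```r
# Toy 2D embedding: rows are cells, columns are embedding components.
emb <- rbind(
  c(0.0, 0.0),
  c(0.1, 0.0),
  c(5.0, 5.0),
  c(5.1, 5.0),
  c(0.4, 0.3),  # outlier close to the first group
  c(4.6, 4.8)   # outlier close to the second group
)

# Hypothetical cluster labels; 0 marks an outlier.
labels <- c(1L, 1L, 2L, 2L, 0L, 0L)

# Hypothetical helper: give each outlier the label of its closest
# non-outlier cell, measured by Euclidean distance in the embedding.
reassign_outliers <- function(emb, labels) {
  is_outlier <- labels == 0L
  if (!any(is_outlier) || all(is_outlier)) {
    return(labels)
  }
  d <- as.matrix(dist(emb, method = "euclidean"))
  core <- which(!is_outlier)
  for (i in which(is_outlier)) {
    nearest <- core[which.min(d[i, core])]
    labels[i] <- labels[nearest]
  }
  labels
}

new_labels <- reassign_outliers(emb, labels)
print(new_labels)
```

In this toy case the fifth cell joins cluster 1 and the sixth joins cluster 2, so no cell keeps the outlier label.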


Value

Cluster information is added to colData in the columns superclones or subclones. Superclones are prefixed by 's', whereas subclones are prefixed by 'c'.
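A short sketch of how the returned cluster labels might be inspected follows. It assumes the copykit and SummarizedExperiment packages are installed and that the scCNA object exposes its cell metadata through colData, as stated above.

```r
library(copykit)

copykit_obj <- copykit_example_filtered()
copykit_obj <- findClusters(copykit_obj)

# Cell-level metadata, including the cluster assignments.
cell_info <- SummarizedExperiment::colData(copykit_obj)

# Number of cells per subclone.
table(cell_info$subclones)
```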


Author(s)

Darlan Conterno Minussi


References

Laks, E., McPherson, A., Zahn, H., et al. (2019). Clonal Decomposition and DNA Replication States Defined by Scaled Single-Cell Genome Sequencing. Cell, 179(5), 1207–1221.e22.

McInnes, L., Healy, J., & Melville, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv:1802.03426.

Lun, A. T. L., McCarthy, D. J., & Marioni, J. C. (2016). A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor. F1000Research, 5, 2122. doi: 10.12688/f1000research.9501.2.

See Also


hdbscan, for hdbscan clustering.


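Examples

# Load the filtered example dataset shipped with CopyKit.
copykit_obj <- copykit_example_filtered()
# Cluster with the default arguments; results are stored in colData.
copykit_obj <- findClusters(copykit_obj)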

navinlabcode/copykit documentation built on Sept. 22, 2023, 9:16 a.m.