findSuggestedK: findSuggestedK

View source: R/findSuggestedK.R

findSuggestedK R Documentation

findSuggestedK

Description

Performs a grid search over a range of k values to assess cluster stability.

Usage

findSuggestedK(
  scCNA,
  embedding = "umap",
  ncomponents = 2,
  k_range = NULL,
  method = c("hdbscan", "leiden", "louvain"),
  metric = c("median", "mean"),
  seed = 17,
  B = 200,
  BPPARAM = bpparam()
)

hdbscanCBI(data, minPts, diss = inherits(data, "dist"), ...)

leidenCBI(data, k, seed_leid, diss = inherits(data, "dist"), ...)

louvainCBI(data, k, seed_leid, diss = inherits(data, "dist"), ...)

Arguments

scCNA

The CopyKit object.

embedding

String with the name of the reducedDim embedding.

ncomponents

An integer with the number of component dimensions to use from the embedding.

k_range

A numeric range of k values to be tested. If NULL, the default range described in Details is used.

method

A string with the method of clustering to be tested.

metric

A string with the function used to summarize the Jaccard similarity values across all clusters.

seed

A numerical scalar with a seed value to be passed on to umap.

B

A numeric with the number of bootstrapping iterations passed on to clusterboot. Higher values yield more reliable stability estimates at the cost of longer run time.

BPPARAM

A BiocParallelParam specifying how the function should be parallelized.

Details

Performs a grid search over a range of k values and returns the value that maximizes the Jaccard similarity. Importantly, while this approach does not guarantee optimal clustering, it provides a guide that maximizes cluster stability.
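
As a rough illustration of the selection step, the sketch below computes the Jaccard similarity between two sets of cells, summarizes per-cluster values for each k, and picks the k with the highest summary. The helper names (jaccard, suggest_k) and the numbers are hypothetical and are not part of CopyKit; in the package the bootstrapping itself is delegated to clusterboot.

```r
# Jaccard similarity between two sets of cell identifiers:
# size of the intersection divided by size of the union.
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Hypothetical per-cluster Jaccard values for each tested k
# (illustrative numbers only, not CopyKit output).
jaccard_by_k <- list(
  "7" = c(0.91, 0.62, 0.88),
  "8" = c(0.93, 0.90, 0.87),
  "9" = c(0.70, 0.95, 0.55, 0.81)
)

# Summarize each k with the chosen metric and return the k
# whose summary value is highest.
suggest_k <- function(jaccard_by_k, metric = c("median", "mean")) {
  metric <- match.arg(metric)
  summary_fun <- switch(metric, median = stats::median, mean = base::mean)
  scores <- vapply(jaccard_by_k, summary_fun, numeric(1))
  as.numeric(names(scores)[which.max(scores)])
}

# Two of four distinct cells are shared between the sets.
jaccard(c("c1", "c2", "c3"), c("c2", "c3", "c4"))

# k with the highest median Jaccard value in the toy table.
suggest_k(jaccard_by_k, metric = "median")
```

With these toy values both the median and the mean favor the same k, but the two summaries can disagree when one cluster is much less stable than the rest, which is why the metric argument is exposed.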

The default tested range is from 7 to the square root of the number of cells in the scCNA object. If sqrt(n_cells) is smaller than 7, a range of 5 to 15 is tested instead.
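
The default range rule can be sketched as follows. This is a minimal illustration, not the package code: the function name default_k_range is hypothetical, and rounding the square root down with floor() and stepping by 1 are assumptions, since the text above does not state how a non-integer square root is handled.

```r
# Sketch of the default k range described above.
# Assumptions: the square root is rounded down and k is tested in steps of 1.
default_k_range <- function(n_cells) {
  root <- sqrt(n_cells)
  if (root < 7) {
    # Too few cells for the 7-to-sqrt(n) rule: fall back to 5 to 15.
    return(5:15)
  }
  7:floor(root)
}

default_k_range(300)  # sqrt(300) is about 17.3, so k runs from 7 to 17
default_k_range(36)   # sqrt(36) is 6, below 7, so k runs from 5 to 15
```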

Value

Adds to the metadata of the scCNA object a table with the mean Jaccard coefficient of clusters for each tested k, together with the suggested k value to be used for clustering.

References

Hennig, C. (2007) Cluster-wise assessment of cluster stability. Computational Statistics and Data Analysis, 52, 258-271.

Hennig, C. (2008) Dissolution point and isolation robustness: robustness criteria for general cluster analysis methods. Journal of Multivariate Analysis, 99, 1154-1176.

See Also

clusterboot

plotSuggestedK

Examples

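# Fix the random seed so the shuffled cell order is reproducible
set.seed(1000)
# Keep columns 1 to 300 of the example dataset, in shuffled order
copykit_obj <- copykit_example_filtered()[, sample(300)]
# Run the grid search over the default k range
copykit_obj <- findSuggestedK(copykit_obj)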

navinlabcode/copykit documentation built on Sept. 22, 2023, 9:16 a.m.