cluster_cells: Cluster cells using Louvain/Leiden community detection

View source: R/cluster_cells.R

cluster_cellsR Documentation

Cluster cells using Louvain/Leiden community detection


Unsupervised clustering of cells is a common step in many single-cell expression workflows. In an experiment containing a mixture of cell types, each cluster might correspond to a different cell type. This function takes a cell_data_set as input, clusters the cells using Louvain/Leiden community detection, and returns a cell_data_set with internally stored cluster assignments. In addition to clusters this function calculates partitions, which represent superclusters of the Louvain/Leiden communities that are found using a kNN pruning method. Cluster assignments can be accessed using the clusters function and partition assignments can be accessed using the partitions function.


  reduction_method = c("UMAP", "tSNE", "PCA", "LSI", "Aligned"),
  k = 20,
  cluster_method = c("leiden", "louvain"),
  num_iter = 2,
  partition_qval = 0.05,
  weight = FALSE,
  resolution = NULL,
  random_seed = 42,
  verbose = FALSE,
  nn_control = list(),



The cell_data_set upon which to perform clustering.


The dimensionality reduction method upon which to base clustering. Options are "UMAP", "tSNE", "PCA" and "LSI".


Integer number of nearest neighbors to use when creating the k nearest neighbor graph for Louvain/Leiden clustering. k is related to the resolution of the clustering result, a bigger k will result in lower resolution and vice versa. Default is 20.


String indicating the clustering method to use. Options are "louvain" or "leiden". Default is "leiden". Resolution parameter is ignored if set to "louvain".


Integer number of iterations used for Louvain/Leiden clustering. The clustering result giving the largest modularity score will be used as the final clustering result. Default is 1. Note that if num_iter is greater than 1, the random_seed argument will be ignored for the louvain method.


Numeric, the q-value cutoff to determine when to partition. Default is 0.05.


A logical argument to determine whether or not to use Jaccard coefficients for two nearest neighbors (based on the overlapping of their kNN) as the weight used for Louvain clustering. Default is FALSE.


Parameter that controls the resolution of clustering. If NULL (Default), the parameter is determined automatically.


The seed used by the random number generator in louvain-igraph package. This argument will be ignored if num_iter is larger than 1.


A logic flag to determine whether or not we should print the run details.


An optional list of parameters used to make the nearest neighbor index. See the set_nn_control help for detailed information. The default metric is cosine for reduction_methods PCA, LSI, and Aligned, and is euclidean for reduction_methods tSNE and UMAP.


Additional arguments passed to the leidenbase package.


an updated cell_data_set object, with cluster and partition information stored internally and accessible using clusters and partitions


Rodriguez, A., & Laio, A. (2014). Clustering by fast search and find of density peaks. Science, 344(6191), 1492-1496. doi:10.1126/science.1242072

Vincent D. Blondel, Jean-Loup Guillaume, Renaud Lambiotte, Etienne Lefebvre: Fast unfolding of communities in large networks. J. Stat. Mech. (2008) P10008

V. A. Traag and L. Waltman and N. J. van Eck: From Louvain to Leiden: guaranteeing well-connected communities. Scientific Reports, 9(1) (2019). doi: 10.1038/s41598-019-41695-z.

Jacob H. Levine and et. al. Data-Driven Phenotypic Dissection of AML Reveals Progenitor-like Cells that Correlate with Prognosis. Cell, 2015.


    cell_metadata <- readRDS(system.file('extdata',
    gene_metadata <- readRDS(system.file('extdata',
    expression_matrix <- readRDS(system.file('extdata',
    cds <- new_cell_data_set(expression_data=expression_matrix,

    cds <- preprocess_cds(cds)
    cds <- reduce_dimension(cds)
    cds <- cluster_cells(cds)

cole-trapnell-lab/monocle3 documentation built on May 24, 2022, 5:25 p.m.