SCimplify: Detection of metacells with the SuperCell approach

SCimplify    R Documentation

Detection of metacells with the SuperCell approach

Description

This function detects metacells (formerly called super-cells) from a single-cell gene expression matrix.

Usage

SCimplify(
  X,
  genes.use = NULL,
  genes.exclude = NULL,
  cell.annotation = NULL,
  cell.split.condition = NULL,
  n.var.genes = min(1000, nrow(X)),
  gamma = 10,
  k.knn = 5,
  do.scale = TRUE,
  n.pc = 10,
  fast.pca = TRUE,
  do.approx = FALSE,
  approx.N = 20000,
  block.size = 10000,
  seed = 12345,
  igraph.clustering = c("walktrap", "louvain"),
  return.singlecell.NW = TRUE,
  return.hierarchical.structure = TRUE,
  ...
)

Arguments

X

log-normalized gene expression matrix with genes as rows and cells as columns

genes.use

a vector of genes used to compute PCA

genes.exclude

a vector of genes to be excluded when computing PCA

cell.annotation

a vector of cell type annotations. If provided, metacells that contain single cells with different cell type annotations are split into multiple pure metacells (this may result in a slightly larger number of metacells than expected for a given gamma)

cell.split.condition

a vector of cell conditions that must not be mixed within one metacell. If provided, metacells are split into condition-pure metacells (this may result in a significantly larger number of metacells than expected)

n.var.genes

if "genes.use" is not provided, "n.var.genes" genes with the largest variation are used

gamma

graining level of the data (ratio of the number of single cells in the initial dataset to the number of metacells in the final dataset)

k.knn

number of nearest neighbors (k) used to build the single-cell kNN network

do.scale

whether to scale the gene expression matrix when computing PCA

n.pc

number of principal components to use for constructing the single-cell kNN network

fast.pca

whether to use irlba as a faster alternative to prcomp (the approach used in the Seurat package)

do.approx

whether to compute an approximate kNN network, intended for large datasets (more than 50,000 cells)

approx.N

number of cells to subsample for the approximate approach

block.size

number of cells to map to the nearest metacell at a time (for approximate coarse-graining)

seed

random seed used when subsampling cells for the approximate approach

igraph.clustering

clustering method used to identify metacells: "walktrap" (default) or "louvain" (not recommended, because gamma is ignored)

return.singlecell.NW

whether to return the single-cell network (which consists of approx.N cells if "do.approx" is TRUE, or of all cells otherwise)

return.hierarchical.structure

whether to return the hierarchical structure of metacells

...

other parameters passed to the build_knn_graph function
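
The relationship between gamma and the number of metacells, and the effect of splitting by cell.annotation, can be illustrated with a small base-R sketch. This is not the package's implementation: the membership and annotation vectors are made-up toy data, and interaction() is used here only as one plausible way to separate mixed metacells.

```r
# Toy data (hypothetical): 12 single cells, requested graining level gamma = 4
gamma <- 4
membership <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)   # metacell id per cell
annotation <- c("A", "A", "A", "A", "A", "A",
                "B", "B", "B", "B", "B", "B")          # cell type per cell

# Number of metacells implied by gamma: N cells / gamma
n.expected <- round(length(membership) / gamma)

# Assumed splitting rule: cells stay together only if they share both
# the metacell id and the annotation, so mixed metacell 2 becomes two
split.id <- interaction(membership, annotation, drop = TRUE)
membership.pure <- as.integer(split.id)
n.after.split <- length(unique(membership.pure))

# Effective graining level after splitting is lower than requested
gamma.effective <- length(membership) / n.after.split

print(c(expected = n.expected, after.split = n.after.split))
print(gamma.effective)
```

In this toy case only one metacell mixes annotations, so the count grows by one; with a condition vector that cuts across many metacells, the increase would be correspondingly larger.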

Value

a list with components

  • graph.supercells - igraph object of a simplified network (number of nodes corresponds to number of metacells)

  • membership - assignment of each single cell to a particular metacell

  • graph.singlecells - igraph object (kNN network) of single-cell data

  • supercell_size - size of metacells (former super-cells)

  • gamma - requested graining level

  • N.SC - number of obtained metacells

  • genes.use - used genes

  • do.approx - whether approximate coarse-graining was performed

  • n.pc - number of principal components used for metacells construction

  • k.knn - number of neighbors to build single-cell graph

  • sc.cell.annotation. - single-cell cell type annotation (if provided)

  • sc.cell.split.condition. - single-cell split condition (if provided)

  • SC.cell.annotation. - super-cell cell type annotation (if provided for single cells)

  • SC.cell.split.condition. - super-cell split condition (if provided for single cells)
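
To make the relationship between some of these components concrete, the following base-R sketch uses a hypothetical membership vector. It assumes that supercell_size counts the single cells assigned to each metacell and that N.SC is the number of distinct metacells; the variable names mirror the list components but are computed here from toy data, not returned by SCimplify.

```r
# Hypothetical assignment of 10 single cells to 3 metacells
membership <- c(1, 1, 2, 2, 2, 3, 3, 1, 2, 3)

# Assumed meaning of supercell_size: number of cells per metacell
supercell_size <- as.vector(table(membership))

# Assumed meaning of N.SC: number of distinct metacells
N.SC <- length(unique(membership))

# Every single cell belongs to exactly one metacell,
# so the sizes add up to the number of single cells
stopifnot(sum(supercell_size) == length(membership))

print(supercell_size)
print(N.SC)
```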

Examples


library(SuperCell)

data(cell_lines) # list with GE - gene expression matrix (logcounts), meta - cell metadata
GE <- cell_lines$GE

SC <- SCimplify(GE,  # log-normalized gene expression matrix
                gamma = 20, # graining level
                n.var.genes = 1000, # number of most variable genes used for PCA
                k.knn = 5, # k for kNN algorithm
                n.pc = 10, # number of principal components to use
                do.approx = TRUE) # use approximate coarse-graining
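
The call above relies on n.var.genes to pick the genes used for PCA. A minimal base-R sketch of variance-based gene selection is shown below, assuming that "largest variation" means the largest per-gene variance across cells. The toy matrix X and the name top.genes are illustrative only, and the package may rank genes differently.

```r
set.seed(1)

# Toy expression matrix (hypothetical): 200 genes (rows) x 60 cells (columns)
X <- matrix(rexp(200 * 60), nrow = 200,
            dimnames = list(paste0("gene", 1:200), paste0("cell", 1:60)))

# Number of genes to keep, capped by the number of rows as in the Usage default
n.var.genes <- min(50, nrow(X))

# Per-gene variance across cells, then the names of the top-ranked genes
gene.var <- apply(X, 1, var)
top.genes <- names(sort(gene.var, decreasing = TRUE))[seq_len(n.var.genes)]

print(head(top.genes))
```

A vector such as top.genes could then be passed as genes.use to fix the gene set explicitly instead of relying on n.var.genes.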



SuperCell documentation built on Oct. 25, 2024, 5:07 p.m.