CellDEEP.Kmean: K-means Based Cell Pooling for Seurat Objects

View source: R/01_Collection_of_poolingFunctions.R

CellDEEP.Kmean    R Documentation

K-means Based Cell Pooling for Seurat Objects

Description

Pools cells into "pseudocells" by applying k-means clustering to the PCA embeddings of a Seurat object and aggregating the counts of the cells in each k-means cluster. Pooling reduces data sparsity while preserving the grouping of cells by sample, cluster, and condition.
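The pooling idea can be sketched with base R alone. The snippet below is a minimal illustration, not the package's implementation: the toy embedding matrix stands in for real PCA embeddings, and the rule k = ceiling(number of cells / n_cells) for choosing the number of pools is an assumption, since the documentation does not say how n_cells is converted into a number of clusters.

```r
# Toy stand-in for PCA embeddings: 60 cells x 5 principal components.
set.seed(1)
embeddings <- matrix(
  rnorm(60 * 5),
  nrow = 60,
  dimnames = list(paste0("cell", 1:60), paste0("PC_", 1:5))
)

n_cells <- 10                              # target cells per pseudocell
k <- ceiling(nrow(embeddings) / n_cells)   # ASSUMED rule for the number of pools

# k-means on the embeddings; nstart random restarts reduce the risk of
# ending in a poor local optimum.
km <- kmeans(embeddings, centers = k, nstart = 100)

# Each k-means cluster label identifies one pseudocell.
pool_id <- km$cluster
table(pool_id)
```

k-means does not produce equally sized clusters, so in a sketch like this n_cells acts as a target average rather than an exact pool size.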

Usage

CellDEEP.Kmean(
  dataset,
  n_cells = 10,
  nstart = 100,
  assay_name = "RNA",
  readcounts = "mean",
  min_cells_per_subgroup = 25
)

Arguments

dataset

A Seurat object with a PCA reduction (named "pca") already computed.

n_cells

Integer. Target number of cells to pool into each pseudocell (default 10).

nstart

Integer. Number of random starting sets of cluster centers tried by kmeans(), passed as its nstart argument (default 100).

assay_name

Character. The assay to pull counts from (default "RNA").

readcounts

Character. How counts are aggregated within each pseudocell: "mean" (average, rounded), "sum" (total), or "10X" (mean multiplied by 10). Default "mean".

min_cells_per_subgroup

Integer. Minimum cells required in each sample-cluster subgroup to perform pooling (default 25).
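To make the three readcounts options concrete, here is a small base R sketch on a toy count matrix. The helper pool_counts() is hypothetical and written only for illustration; it is not part of CellDEEP. It assumes that "mean" rounds with round() and that "10X" is left unrounded, neither of which is confirmed by the documentation.

```r
# Toy counts: 3 genes x 4 cells (matrix() fills column by column).
counts <- matrix(
  c(1, 0, 4,   # cell c1
    2, 0, 1,   # cell c2
    0, 3, 5,   # cell c3
    0, 2, 5),  # cell c4
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), paste0("c", 1:4))
)

# Pool membership: c1 and c2 form pseudocell A, c3 and c4 form pseudocell B.
pool <- c("A", "A", "B", "B")

# Hypothetical helper: aggregate the columns belonging to each pool.
pool_counts <- function(counts, pool, readcounts = c("mean", "sum", "10X")) {
  readcounts <- match.arg(readcounts)
  aggregate_pool <- function(m) {
    switch(readcounts,
      mean  = round(rowMeans(m)),   # ASSUMED: base R round()
      sum   = rowSums(m),
      "10X" = rowMeans(m) * 10      # ASSUMED: no rounding applied
    )
  }
  # One column per pool, one row per gene.
  sapply(
    split(seq_along(pool), pool),
    function(idx) aggregate_pool(counts[, idx, drop = FALSE])
  )
}

pool_counts(counts, pool, "sum")
pool_counts(counts, pool, "10X")
```

With "sum", pseudocell A gets the counts 3, 0, 5 for g1, g2, g3; with "10X" the same pseudocell gets 15, 0, 25, that is, the per-gene means 1.5, 0, 2.5 scaled by 10.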

Value

A new Seurat object where each "cell" is a pooled group of original cells.

Note

This function requires that PCA has already been run on the input dataset, as it uses the "pca" reduction for clustering.
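One way to satisfy this requirement is to check for the reduction and, if it is missing, run a standard Seurat preprocessing workflow first. The helper ensure_pca() and its npcs argument are hypothetical names used for illustration, and the default preprocessing steps shown here are an assumption; the appropriate normalization and feature selection depend on the dataset.

```r
# Hypothetical helper: return the object with a "pca" reduction present.
# Requires the Seurat and SeuratObject packages when it is called.
ensure_pca <- function(obj, npcs = 30) {
  if (!"pca" %in% SeuratObject::Reductions(obj)) {
    # ASSUMED default workflow; adjust to the dataset at hand.
    obj <- Seurat::NormalizeData(obj, verbose = FALSE)
    obj <- Seurat::FindVariableFeatures(obj, verbose = FALSE)
    obj <- Seurat::ScaleData(obj, verbose = FALSE)
    # RunPCA() stores its result under the reduction name "pca" by default.
    obj <- Seurat::RunPCA(obj, npcs = npcs, verbose = FALSE)
  }
  obj
}
```

A call such as dataset <- ensure_pca(dataset) before pooling would then leave objects that already carry a "pca" reduction unchanged.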

Examples


data("sim")
pool_input <- prepare_data(
  sim,
  sample_id = "DonorID",
  group_id = "Status",
  cluster_id = "cluster_id"
)

pooled_kmean <- CellDEEP.Kmean(
  pool_input,
  readcounts = "sum",
  n_cells = 3,
  min_cells_per_subgroup = 1,
  assay_name = "RNA"
)
pooled_kmean


CellDEEP documentation built on March 29, 2026, 5:08 p.m.