ondisc-package R Documentation

ondisc: Algorithms and Data Structures for Large Single-Cell Expression Matrices

Description

Single-cell datasets are growing in size, posing challenges as well as opportunities for genomics researchers. 'ondisc' is an R package that facilitates analysis of large-scale single-cell data, either out-of-core on a laptop or distributed across tens to hundreds of processors on a cluster or cloud. In both settings, 'ondisc' requires only a few gigabytes of memory, even if the input data are tens of gigabytes in size. 'ondisc' is oriented mainly toward single-cell CRISPR screen analysis, but it can also be used for single-cell differential expression and single-cell co-expression analyses. 'ondisc' is powered by several new, efficient algorithms for manipulating and querying large, sparse expression matrices. See Barry et al. (2024) <doi:10.1186/s13059-024-03254-2>.

Author(s)

Maintainer: Timothy Barry tbarry@hsph.harvard.edu (ORCID)

Other contributors:

  • Songcheng Dai [contributor]

  • Yixuan Qiu [contributor]

Examples

# initialize odm objects from Cell Ranger output; also, compute the cellwise covariates
directories_to_load <- file.path(
  system.file("extdata", "highmoi_example", package = "ondisc"),
  paste0("gem_group_", c(1, 2))
)
directory_to_write <- tempdir()
# Set data.table threads to 1 to pass CRAN example timing checks.
old_threads <- data.table::setDTthreads(1L)
out_list <- create_odm_from_cellranger(
  directories_to_load = directories_to_load,
  directory_to_write = directory_to_write
)
data.table::setDTthreads(old_threads)

# extract the odm corresponding to the gene modality
gene_odm <- out_list$gene
gene_odm

# obtain dimension information
dim(gene_odm)
nrow(gene_odm)
ncol(gene_odm)

# obtain rownames (i.e., the feature IDs)
rownames(gene_odm) |> head()

# extract row into memory, first by integer and then by string
expression_vector_1 <- gene_odm[10,]
expression_vector_2 <- gene_odm[rownames(gene_odm)[10],]
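
# sanity check (an illustrative addition, not part of the original example):
# both indexing styles address the same feature, so the two extracted rows
# are expected to match
identical(expression_vector_1, expression_vector_2)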

# delete the gene_odm object
rm(gene_odm)

# reinitialize the gene_odm object
gene_odm <- initialize_odm_from_backing_file(
  file.path(directory_to_write, "gene.odm")
)
gene_odm
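
# A minimal sketch of an out-of-core computation (an illustrative addition,
# not taken from the package documentation): count the nonzero entries of the
# first few features while holding only one row in memory at a time. It
# assumes, as in the row-extraction step above, that `gene_odm[i,]` returns a
# numeric vector with one entry per cell. `n_features_to_scan` and `n_nonzero`
# are hypothetical names chosen for this sketch.
n_features_to_scan <- min(5L, nrow(gene_odm))
n_nonzero <- vapply(
  seq_len(n_features_to_scan),
  function(i) sum(gene_odm[i,] != 0),
  FUN.VALUE = numeric(1)
)
# label each count with its feature ID
names(n_nonzero) <- rownames(gene_odm)[seq_len(n_features_to_scan)]
n_nonzero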

ondisc documentation built on June 17, 2026, 5:06 p.m.