selectBatchHVG: [Experimental] Batch-aware highly variable gene selection
In rliger: Linked Inference of Genomic Experimental Relationships

selectBatchHVG

R Documentation

Batch-aware highly variable gene selection

Description

Selects highly variable genes (HVGs) based on the mean dispersions of genes that are highly variable in all batches. It uses the top target genes (nGenes) per batch, ranked by average normalized dispersion. If the target number of genes has not been reached, HVGs in all but one batch are used to fill up. This continues until HVGs in a single batch are considered.

This is an rliger implementation of the method originally published in SCIB. We have found that it may improve integration under some circumstances, and we are currently testing it.

This function currently only works for shared features across all datasets. For selection from only part of the datasets and selection for dataset-specific unshared features, please use selectGenes().
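
The fill-up rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the rliger implementation: `selectByBatchCount()` is a hypothetical helper, the per-batch normalized dispersions are taken as already computed, and ranking within each tier by the mean across all batches is an assumption.

```r
# Hypothetical helper, not part of rliger: illustrates the fill-up rule only.
# normDisp: genes x batches matrix of normalized dispersions (assumed given).
# isHVG:    genes x batches logical matrix, TRUE where a gene is an HVG in a batch.
selectByBatchCount <- function(normDisp, isHVG, nGenes) {
  stopifnot(identical(dim(normDisp), dim(isHVG)), !is.null(rownames(normDisp)))
  genes <- rownames(normDisp)
  nBatchHVG <- rowSums(isHVG)     # number of batches in which each gene is an HVG
  meanDisp <- rowMeans(normDisp)  # mean normalized dispersion (assumed ranking score)
  selected <- character(0)
  # Start with genes that are HVGs in all batches, then all but one, and so on.
  for (k in seq(ncol(isHVG), 1)) {
    remaining <- nGenes - length(selected)
    if (remaining <= 0) break
    candidates <- genes[nBatchHVG == k]
    # Within a tier, rank by mean normalized dispersion, highest first.
    candidates <- candidates[order(meanDisp[candidates], decreasing = TRUE)]
    selected <- c(selected, head(candidates, remaining))
  }
  selected
}

# Toy data: 5 genes, 3 batches.
normDisp <- matrix(
  c(2.0, 1.8, 1.9,   # g1
    1.5, 1.7, 1.6,   # g2
    3.0, 0.1, 0.2,   # g3
    0.5, 2.5, 2.4,   # g4
    0.1, 0.2, 0.3),  # g5
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("g", 1:5), c("batchA", "batchB", "batchC"))
)
# Flag the top 3 genes per batch as that batch's HVGs.
isHVG <- apply(normDisp, 2, function(x) rank(-x) <= 3)

hvg <- selectByBatchCount(normDisp, isHVG, nGenes = 3)
print(hvg)
# [1] "g1" "g2" "g4"
```

In this toy example g3 has the highest dispersion in batchA but is an HVG in only one batch, so it is passed over in favor of g4, which is an HVG in two batches.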

Usage

selectBatchHVG(object, ...)

## S3 method for class 'liger'
selectBatchHVG(
  object,
  nGenes = 2000,
  verbose = getOption("ligerVerbose", TRUE),
  ...
)

## S3 method for class 'ligerDataset'
selectBatchHVG(
  object,
  nGenes = 2000,
  features = NULL,
  scaleFactor = NULL,
  verbose = getOption("ligerVerbose", TRUE),
  ...
)

## S3 method for class 'dgCMatrix'
selectBatchHVG(
  object,
  nGenes = 2000,
  returnStats = FALSE,
  scaleFactor = NULL,
  verbose = getOption("ligerVerbose", TRUE),
  ...
)

## S3 method for class 'DelayedArray'
selectBatchHVG(
  object,
  nGenes = 2000,
  means = NULL,
  scaleFactor = NULL,
  returnStats = FALSE,
  chunk = getOption("ligerChunkSize", 20000),
  verbose = getOption("ligerVerbose", TRUE),
  ...
)

Arguments

`object`	A `liger` object, a `ligerDataset` object or a sparse/dense matrix. The liger objects must have raw counts available. A direct matrix input should preferably be log1p-transformed CPM-normalized counts, with one cell per column.
`...`	Arguments passed to S3 methods.
`nGenes`	Integer number of target genes to select. Default `2000`.
`verbose`	Logical. Whether to show a progress bar. Default `getOption("ligerVerbose")`, or `TRUE` if the user has not set it.
`features`	For ligerDataset method, the feature subset to limit the selection to, mainly for limiting the selection to happen within the shared genes of all datasets. Default `NULL` selects from all features in the ligerDataset object.
`scaleFactor`	Numeric vector of scaling factors to normalize the raw counts to unit sum. This is pre-calculated at liger object creation (stored as `object$nUMI`) and internally specified in S3 method chains, so users generally do not need to specify it.
`returnStats`	Logical, for dgCMatrix-method, whether to return a data frame of statistics for all features, or by default `FALSE` just return a character vector of selected features.
`means`	Numeric vector of pre-calculated means per gene, derived from log1p CPM normalized expression.
`chunk`	Integer. Maximum number of cells in each chunk when working on HDF5Array. Default `20000`.
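
For direct matrix input, the `object` argument asks for log1p-transformed CPM-normalized counts with cells as columns. A minimal sketch of preparing such a matrix with the Matrix package follows; the toy counts and the variable names (`counts`, `libSize`, `logCPM`) are made up for illustration, and the final call is left commented out because it requires rliger and realistic data.

```r
library(Matrix)

# Toy raw counts: 4 genes (rows) x 3 cells (columns), stored as a dgCMatrix.
counts <- sparseMatrix(
  i = c(1, 2, 2, 3, 1, 4),
  j = c(1, 1, 2, 2, 3, 3),
  x = c(1, 3, 2, 2, 5, 5),
  dims = c(4, 3),
  dimnames = list(paste0("g", 1:4), paste0("cell", 1:3))
)

# Counts per million: scale every cell (column) to a total of 1e6.
libSize <- colSums(counts)
cpm <- counts
# diff(cpm@p) is the number of stored non-zero values in each column.
cpm@x <- cpm@x / rep(libSize, diff(cpm@p)) * 1e6

# log1p keeps zeros at zero, so the matrix stays sparse.
logCPM <- log1p(cpm)

# Not run: with rliger installed and a realistic matrix, something like
# selectBatchHVG(logCPM, nGenes = 2000, returnStats = FALSE)
```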

Value

liger-method: Returns the input liger object with the selected genes updated in the varFeatures slot, which can be accessed with varFeatures(object). Additionally, the statistics are updated in the featureMeta slot of each ligerDataset object within the datasets slot of the object.
ligerDataset-method: Returns the input ligerDataset object with the statistics updated in the featureMeta slot.
dgCMatrix-method: By default returns a character vector of selected variable features. If returnStats = TRUE, returns a data.frame of the statistics.

References

Luecken, M.D., Büttner, M., Chaichoompu, K. et al. (2022), Benchmarking atlas-level data integration in single-cell genomics. Nat Methods, 19, 41–50. https://doi.org/10.1038/s41592-021-01336-8.

Examples

pbmc <- selectBatchHVG(pbmc, nGenes = 10)
varFeatures(pbmc)

rliger documentation built on Aug. 27, 2025, 1:08 a.m.
