StoeckiusHashingData: Obtain the Stoeckius cell hashing data

Description

Obtain the (mostly human) cell hashing single-cell RNA-seq data from Stoeckius et al. (2018).

Usage

StoeckiusHashingData(
  type = c("pbmc", "mixed"),
  mode = NULL,
  ensembl = FALSE,
  location = TRUE,
  strip.metrics = TRUE,
  legacy = FALSE
)

Arguments

type

String specifying the dataset to obtain.

mode

String specifying the data modalities to obtain, see Details.

ensembl

Logical scalar indicating whether the output row names should contain Ensembl identifiers.

location

Logical scalar indicating whether genomic coordinates should be returned.

strip.metrics

Logical scalar indicating whether quality control metrics should be removed from the HTO/ADT counts.

legacy

Logical scalar indicating whether to pull data from ExperimentHub. By default, we use data from the gypsum backend.

Details

When type="pbmc", the mode can be one or more of:

  • "human", the RNA counts for human genes.

  • "mouse", the RNA counts for mouse genes. These are present because the PBMC dataset is actually a mixture of human PBMCs and unlabelled mouse cells.

  • "hto", the HTO counts.

  • "adt1", counts for the first set of ADTs (immunoglobulin controls).

  • "adt2", counts for the second set of ADTs (cell type-specific markers).

If mode=NULL, the default is to use "human", "mouse" and "hto".

When type="mixed", the mode can be one or more of:

  • "rna", the RNA counts for the genes.

  • "hto", the HTO counts.

If mode=NULL, the default is to use "rna" and "hto".
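
The mode rules above can be summarised as a small lookup. The following is a minimal base R sketch, assuming a hypothetical helper named resolve_modes() that is not part of scRNAseq; it only mirrors the documented choices and defaults and is not the package's own implementation.

```r
# Hypothetical helper (not part of scRNAseq): resolves 'mode' for a given
# 'type' following the choices and defaults documented above.
resolve_modes <- function(type = c("pbmc", "mixed"), mode = NULL) {
    type <- match.arg(type)

    # Modes documented as valid for each dataset.
    allowed <- switch(type,
        pbmc = c("human", "mouse", "hto", "adt1", "adt2"),
        mixed = c("rna", "hto")
    )

    if (is.null(mode)) {
        # Documented defaults when mode=NULL.
        mode <- switch(type,
            pbmc = c("human", "mouse", "hto"),
            mixed = c("rna", "hto")
        )
    }

    # Reject anything that is not documented for this dataset.
    bad <- setdiff(mode, allowed)
    if (length(bad) > 0L) {
        stop("unsupported mode(s) for type='", type, "': ",
            paste(bad, collapse = ", "))
    }
    mode
}

resolve_modes("pbmc")
resolve_modes("mixed", mode = "hto")
```

The first element of the resolved vector corresponds to the mode that ends up as the main counts matrix, as described under Value.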

If ensembl=TRUE, gene symbols for the RNA counts are converted to Ensembl IDs in the row names of the output object. Rows with missing Ensembl IDs are discarded, and only the first occurrence of duplicated IDs is retained.
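
The effect of the conversion on the row names can be illustrated with a toy matrix. This is a base R sketch under stated assumptions: the gene symbols, counts and the symbol2ens lookup are all invented placeholders, and the code is not taken from the package.

```r
# Toy RNA count matrix with gene symbols as row names (values are made up).
counts <- matrix(1:10, nrow = 5,
    dimnames = list(
        c("GeneA", "GeneB", "GeneC", "GeneD", "GeneE"),
        c("cell1", "cell2")
    ))

# Toy symbol-to-Ensembl lookup; the IDs are placeholders, not real annotation.
# GeneB has no ID, and GeneA and GeneD map to the same ID.
symbol2ens <- c(GeneA = "ENS0001", GeneB = NA, GeneC = "ENS0002",
    GeneD = "ENS0001", GeneE = "ENS0003")

ens <- unname(symbol2ens[rownames(counts)])

# Discard rows with missing IDs and keep only the first occurrence of each ID.
keep <- !is.na(ens) & !duplicated(ens)
converted <- counts[keep, , drop = FALSE]
rownames(converted) <- ens[keep]
converted
```

In this toy case, GeneB is dropped for lacking an ID and GeneD is dropped as a later duplicate of GeneA, leaving three rows.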

If location=TRUE, the coordinates of the Ensembl gene models are stored in the rowRanges of the output. Note that this is only performed if ensembl=TRUE and only for the RNA counts.

For the HTO and ADT matrices, some rows correspond to quality control metrics. If strip.metrics=TRUE, these rows are removed so that only data for actual HTOs or ADTs are present.
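
As a rough illustration of what stripping involves, the base R sketch below filters metric rows out of a toy HTO matrix. The metric row names used here (no_match, bad_struct, total_reads) and the helper strip_metrics() are assumptions made for the example; the actual labels in the dataset may differ.

```r
# Toy HTO matrix: two real HTOs followed by three rows that stand in for
# quality control metrics. The metric names are assumed for illustration.
hto <- matrix(seq_len(10), nrow = 5,
    dimnames = list(
        c("HTO_A", "HTO_B", "no_match", "bad_struct", "total_reads"),
        c("cell1", "cell2")
    ))

metric.rows <- c("no_match", "bad_struct", "total_reads")  # assumed names

# Hypothetical helper: drop any row whose name is a known metric label.
strip_metrics <- function(mat, metrics = metric.rows) {
    mat[!rownames(mat) %in% metrics, , drop = FALSE]
}

stripped <- strip_metrics(hto)
rownames(stripped)
```

Setting strip.metrics=FALSE in the real function would be the way to keep such rows for inspection.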

All data are downloaded from ExperimentHub and cached for local re-use. Specific resources can be retrieved by searching for scRNAseq/stoeckius-hashing.

Value

A SingleCellExperiment object with a matrix of UMI counts corresponding to the first mode, plus any number of alternative Experiments containing the remaining modes. If multiple modes are specified, the output object only contains the intersection of their column names.
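
The column intersection can be pictured with two toy matrices. This base R sketch uses invented barcodes and is only meant to show the idea of restricting every mode to the shared cells; it is not the package's code.

```r
# Toy matrices for two modes; the barcodes and values are made up.
rna <- matrix(0, nrow = 2, ncol = 3,
    dimnames = list(c("g1", "g2"), c("AAA", "CCC", "GGG")))
hto <- matrix(0, nrow = 1, ncol = 3,
    dimnames = list("HTO_A", c("GGG", "AAA", "TTT")))

mats <- list(rna = rna, hto = hto)

# Barcodes present in every requested mode, in the order of the first mode.
common <- Reduce(intersect, lapply(mats, colnames))

# Subset and reorder each matrix so that columns line up across modes.
aligned <- lapply(mats, function(m) m[, common, drop = FALSE])
common
```

On the object returned by StoeckiusHashingData(), the additional modes can be listed with altExpNames() and extracted with altExp() from the SingleCellExperiment package.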

Author(s)

Aaron Lun

References

Stoeckius et al. (2018). Cell Hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics. Genome Biol. 19, 224.

Examples

sce.pbmc <- StoeckiusHashingData()
sce.pbmc

sce.mixed <- StoeckiusHashingData(type="mixed")
sce.mixed


LTLA/scRNAseq documentation built on June 28, 2024, 7:31 p.m.