countCells: Count cells in high-dimensional space

countCells R Documentation

Description

Count the number of cells from each sample lying inside hyperspheres in high-dimensional space.

Usage

countCells(
  prepared,
  tol = 0.5,
  num.threads = 1,
  BPPARAM = SerialParam(),
  downsample = 10,
  filter = 10
)

Arguments

prepared

A List object produced by prepareCellData.

tol

A numeric scalar to be used as the scaling factor for the hypersphere radius.

num.threads

Integer scalar specifying the number of threads to use.

BPPARAM

A BiocParallelParam object specifying how parallelization is to be performed in findNeighbors.

downsample

An integer scalar specifying the frequency with which cells are sampled to form hyperspheres.

filter

An integer scalar specifying the minimum count sum required to report a hypersphere.

Details

Consider that each cell defines a point in M-dimensional space (where M is the number of markers), based on its marker intensities. This function constructs hyperspheres and counts the number of cells from each sample lying within each hypersphere. In this manner, the distribution of cells across the space can be quantified. For each hypersphere, cell counts for all samples are reported along with the median intensity across the counted cells for each marker.
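
The counting step described above can be sketched in base R as follows. This is only an illustration on toy data, not the package's implementation: the helper countInSphere, the coordinates and the sample labels are all hypothetical.

```r
# Toy data: 6 cells, M = 2 markers, from two samples (all values hypothetical).
coords <- rbind(
  c(1.0, 1.0), c(1.2, 0.9), c(3.0, 3.0),
  c(1.1, 1.3), c(0.8, 1.0), c(3.1, 2.9)
)
sample.id <- factor(c("A", "A", "A", "B", "B", "B"))

# Hypothetical helper: count the cells from each sample lying within
# 'radius' (Euclidean distance) of the cell in row 'centre'.
countInSphere <- function(coords, sample.id, centre, radius) {
  diffs <- sweep(coords, 2, coords[centre, ])  # subtract the centre's coordinates
  d <- sqrt(rowSums(diffs^2))                  # distance of every cell to the centre
  table(sample.id[d <= radius])                # per-sample counts inside the sphere
}

radius <- 0.5 * sqrt(ncol(coords))  # tol * sqrt(M) with tol = 0.5
counts <- countInSphere(coords, sample.id, centre = 1, radius = radius)
counts
```

Repeating this for every chosen centre cell would give one row of counts per hypersphere and one column per sample.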

Each hypersphere is centered on a cell to ensure that only occupied spaces are counted. However, for high-density spaces, this can result in many redundant hyperspheres. To reduce computational work, only a subset of cells is used to define hyperspheres. The downsampling frequency is specified by downsample; with the default of 10, only every 10th cell is used to make a hypersphere.
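
As a rough illustration of the downsampling frequency, the following base R sketch picks every 10th cell from a hypothetical set of 35 cells. Which cells the function actually selects (for example, the starting position or the ordering of cells) is not stated here, so starting at the first cell is an assumption.

```r
n.cells <- 35L      # hypothetical number of cells
downsample <- 10L   # default downsampling frequency

# Assumed scheme: take the 1st cell, then every 10th cell after it.
centres <- seq(1L, n.cells, by = downsample)
centres
```

Only the cells indexed by centres would define hyperspheres, but all cells are still eligible to be counted inside them.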

Each hypersphere has a radius of tol*sqrt(M); scaling by sqrt(M) avoids loss of counts as M increases. tol can be interpreted as the acceptable amount of deviation in the intensity of a single marker for a given subpopulation. With the default value of 0.5, cells whose intensity for any one marker differs by up to 0.5 in either direction are counted into the same subpopulation. This value is sensible as intensities are usually on a log-10 scale, such that a total of 10-fold variability in marker intensities is tolerated.
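
The reasoning behind the radius can be checked with a short base R calculation. A cell that deviates from the centre by exactly tol in every one of M markers lies at Euclidean distance tol*sqrt(M), so a fixed radius would exclude such cells as M grows. The marker numbers below are arbitrary examples.

```r
tol <- 0.5

# Radius for a few (arbitrary) numbers of markers.
M <- c(1, 4, 16, 30)
radius <- tol * sqrt(M)
radius

# A cell offset by 'tol' in each of 30 markers sits exactly on the boundary.
offset <- rep(tol, 30)
boundary.dist <- sqrt(sum(offset^2))

# On a log-10 scale, a range of +tol to -tol spans this fold change.
fold.range <- 10^(2 * tol)
```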

The coordinates are reported as (weighted) medians across all cells in each hypersphere. Compared to the centre, the median better reflects the location of the hypersphere when its cells are not evenly distributed around the centre. Each cell is weighted in inverse proportion to the total number of cells in its sample. This ensures that large samples do not dominate the median calculation.
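
The effect of the weighting can be seen in a small base R sketch. The weightedMedian helper below uses one common definition (the smallest value at which the cumulative weight reaches half of the total); the exact definition used by the package is not given here, so treat this as an assumption. All data values are hypothetical.

```r
# Hypothetical helper: smallest value whose cumulative weight reaches 50%.
weightedMedian <- function(x, w) {
  o <- order(x)
  x <- x[o]
  w <- w[o]
  cum <- cumsum(w) / sum(w)
  x[which(cum >= 0.5)[1]]
}

# One marker, one hypersphere: 4 cells from a large sample, 2 from a small one.
intensity <- c(1.0, 1.1, 1.2, 1.3, 2.0, 2.1)
sample.id <- c("A", "A", "A", "A", "B", "B")
totals <- c(A = 1000, B = 100)  # total number of cells in each sample

# Each cell is weighted by the inverse of its sample's total.
w <- unname(1 / totals[sample.id])

unweighted <- median(intensity)
weighted <- weightedMedian(intensity, w)
```

Here the unweighted median is driven by the four cells from the large sample, while the weighted median moves towards the cells from the small sample, which make up a larger share of their own sample.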

All hyperspheres with count sums below filter are removed by default. Such hyperspheres do not have enough counts (and thus, information) for downstream analyses. Removing them reduces the amount of memory required to form the output matrix.
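
The filtering rule amounts to a threshold on the row sums of the count matrix, as in this base R sketch on a hypothetical matrix of three hyperspheres and two samples.

```r
# Hypothetical counts: rows are hyperspheres, columns are samples.
counts <- rbind(
  c(5L, 7L),
  c(2L, 3L),
  c(0L, 10L)
)
filter <- 10L

# Hyperspheres with count sums below 'filter' are removed.
keep <- rowSums(counts) >= filter
filtered <- counts[keep, , drop = FALSE]
filtered
```

The second hypersphere has a count sum of 5 and is dropped; the third has a count sum of exactly 10 and is retained because only sums below filter are removed.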

Value

A CyData object containing the following information:

counts:

An integer matrix of counts for each hypersphere (row) and sample (column) in the assays slot.

intensities:

A numeric matrix of median intensities for each hypersphere (row) and marker (column), accessible with the intensities function.

cellAssignments:

A list of integer vectors specifying the cells contained within each hypersphere, accessible with the cellAssignments function.

totals:

An integer vector specifying the total number of cells in each sample, stored as a field in the colData.

center.cell:

An integer vector specifying the cell that is used as the centre of each hypersphere, accessible with the getCenterCell function.

Contents of prepared are also stored in the int_metadata of the output object.

Author(s)

Aaron Lun

References

Lun ATL, Richard AC, Marioni JC (2017). Testing for differential abundance in mass cytometry data. Nat. Methods 14(7):707-709.

Samusik N, Good Z, Spitzer MH et al. (2016). Automated mapping of phenotype space with single-cell data. Nat. Methods 13:493-496.

See Also

prepareCellData, to generate the input object prepared.

Examples

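# Run the prepareCellData example to create the input object 'cd'.
example(prepareCellData, echo=FALSE)
downsample <- 10L
tol <- 0.5

# Count cells in each hypersphere; filter=1 retains hyperspheres with low counts.
cnt <- countCells(cd, filter=1, downsample=downsample, tol=tol)
cnt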


MarioniLab/cydar documentation built on Sept. 7, 2024, 6:24 a.m.