medIntensities: Compute median marker intensities
In MarioniLab/cydar: Using Mass Cytometry for Differential Abundance Analyses

medIntensities

R Documentation

Compute median marker intensities

Description

Calcalute the median intensity across cells in each group and sample for the specified markers.

Usage

medIntensities(x, markers)

Arguments

`x`	A CyData object where each row corresponds to a group of cells, such as that produced by `countCells`.
`markers`	A vector specifying the markers for which median intensities should be calculated.

Details

For each group of cells, the median intensity across all assigned cells in each sample is computed. This is returned as a matrix of median intensities, with one value per sample (column) and hypersphere (row). If a sample has no cells in a group, the corresponding entry of the matrix will be set to NA.

The groups in x should be defined using a different set of markers than in markers. Specifically, the markers used in prepareCellData should not be the same as the markers in this function. If the same markers are used for both functions, then a shift is unlikely to be observed. This is because, by definition, the groups will contain cells with similar intensities for those markers.

The idea is to use the median intensities for weighted linear regression to identify a shift in intensity within each hypersphere. The weight for each group/sample is defined as the number of cells, i.e., the "counts" assay in x. This accounts for the precision with which the median is estimated, under certain assumptions. See the Examples for how this data can be prepared for entry into analysis packages like limma.

The median intensity is used rather than the mean to ensure that shifts are interpreted correctly. For example, mean shifts can be driven by strong changes in a subset of cells that are not representative of the majority of cells in the group. This could lead to misinterpretation of the nature of the shift with respect to the group's overall identity.

Value

A CyData object is returned equivalent to x, but with numeric matrices of sample-specific median intensities as additional elements of the assays slot.

Choosing between counting strategies

In situations where markers can be separated into two sets (e.g., cell type and signalling markers), there are two options for analysis. The first is to define groups based on the “primary” set of markers, then use medIntensities to identify shifts in each group for each of the “secondary” markers. This is the best approach for detecting increases or decreases in marker intensity that affect a majority of cells in each group.

The second approach is to use all markers in prepareCellData and count cells accordingly in countCells. This provides more power to detect changes in marker intensity that only affect a subset of cells in each group. It is also more useful if one is interested in identifying cells with concomitant changes in multiple secondary markers. However, this tends to be less effective for studying changes in a specific marker, due to the loss of precision with increased dimensionality.

Author(s)

Aaron Lun

Examples

### Mocking up some data: ###
nmarkers <- 21
marker.names <- paste0("X", seq_len(nmarkers))
nsamples <- 5
sample.names <- paste0("Y", seq_len(nsamples))

x <- list()
for (i in sample.names) {
    ex <- matrix(rgamma(nmarkers*1000, 2, 2), ncol=nmarkers, nrow=1000)
    colnames(ex) <- marker.names
    x[[i]] <- ex
}

### Processing it beforehand with one set of markers: ###
cd <- prepareCellData(x, markers=marker.names[1:10])
cnt <- countCells(cd, filter=5)

## Computing the median intensity for one marker: ###
cnt2 <- medIntensities(cnt, markers=marker.names[21])
library(limma)
median.int.21 <- assay(cnt2, "med.X21") 
cell.count <- assay(cnt2, "counts") 
el <- new("EList", list(E=median.int.21, weights=cell.count))

MarioniLab/cydar documentation built on Sept. 7, 2024, 6:24 a.m.