markerEnrichment: Find enriched markers per identified cluster and calculate...


View source: R/markerEnrichment.R

Description

Find enriched markers per identified cluster and calculate cluster abundances across these for samples and metadata variables.

Usage

markerEnrichment(
  indata,
  meta = NULL,
  assay = "scaled",
  sampleAbundances = TRUE,
  sampleID = "sample",
  studyvarID = NULL,
  clusterAssign = metadata(indata)[["Cluster"]],
  funcSummarise = function(x) mean(x, na.rm = TRUE),
  method = "Z",
  prob = 0.1,
  limits = c(-1.96, 1.96),
  verbose = TRUE
)

Arguments

indata

A data-frame, matrix, or SingleCellExperiment object. If a data-frame or matrix, it should contain expression data (cells as columns; genes as rows). If a SingleCellExperiment object, data will be extracted from the assay component named by assay.

meta

If 'indata' is not a SingleCellExperiment object, meta must be supplied. It should be a data-frame of metadata whose rows align with the columns of indata, and it must contain a column with the name specified by studyvarID.

assay

Name of the assay slot in indata from which data will be taken, assuming indata is a SingleCellExperiment object.

sampleAbundances

Logical, indicating whether or not to calculate cluster abundances across study samples.

sampleID

If sampleAbundances == TRUE, the name of the column in the provided metadata that identifies the samples over which cluster abundances will be calculated.

studyvarID

A column name from the provided metadata representing a condition or trait over which cluster abundances will be calculated.
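How markerEnrichment defines cluster abundance internally is not stated on this page. As a rough illustration only, one common reading is the percentage of each cluster's cells that come from each level of the study variable. The sketch below uses hypothetical toy values for clus and meta, and base R only:

```r
# Hypothetical toy inputs: 6 cells with cluster labels and per-cell metadata
clus <- c(1, 1, 2, 2, 2, 3)
meta <- data.frame(
  sample = c('s1', 's1', 's1', 's2', 's2', 's2'),
  group = c('PB1', 'PB1', 'PB1', 'PB2', 'PB2', 'PB2'))

# cross-tabulate cells by cluster and study variable
counts <- table(cluster = clus, group = meta$group)

# assumption: abundance = row-wise percentage of each cluster's cells per group
abund <- round(100 * prop.table(counts, margin = 1), 1)
abund
```

Swapping meta$group for meta$sample gives the equivalent per-sample table under the same assumption.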

clusterAssign

A vector of cell-to-cluster assignments. This can be from any source but must align with your cells / variables. There is no check to ensure this when 'indata' is not a SingleCellExperiment object.
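Because no alignment check is made for data-frame or matrix input, it may be worth verifying alignment yourself before calling the function. The following is a minimal sketch using hypothetical toy objects (expr, meta, clus); it is not part of the package:

```r
# Hypothetical toy inputs: 3 markers (rows) x 6 cells (columns)
expr <- matrix(seq_len(18), nrow = 3,
  dimnames = list(paste0('CD', 1:3), paste0('cell', 1:6)))
meta <- data.frame(
  group = rep(c('PB1', 'PB2'), each = 3),
  row.names = colnames(expr))
clus <- c(1, 1, 2, 2, 3, 3)

# one cluster label per cell (column), with no missing assignments
aligned_clusters <- length(clus) == ncol(expr) && !anyNA(clus)

# metadata rows in the same order as the expression columns
aligned_meta <- identical(rownames(meta), colnames(expr))

stopifnot(aligned_clusters, aligned_meta)
```

A length check cannot detect cells that are present but in a different order, so it is safest to derive clusterAssign and meta from the same object that supplies the expression columns.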

funcSummarise

A mathematical function used to summarise expression per marker per cluster.
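As an illustration, the documented default and a median-based alternative are compared below on a hypothetical vector. The assumption here is that any function taking a numeric vector and returning a single number is suitable; summarise_median is not something the package provides:

```r
# hypothetical expression values for one marker in one cluster
x <- c(1, 2, 3, 100, NA)

# the documented default: mean, ignoring missing values
summarise_mean <- function(x) mean(x, na.rm = TRUE)

# hypothetical alternative that is less sensitive to outliers
summarise_median <- function(x) median(x, na.rm = TRUE)

summarise_mean(x)    # 26.5
summarise_median(x)  # 2.5
```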

method

Type of summarisation to apply to the data for final marker selection. Supported values include Z and quantile. If Z, limits gives the lower and upper Z-score cut-offs for low|high markers; the defaults of -1.96 and +1.96 correspond to p<0.05 on a two-tailed standard normal distribution. If quantile, prob defines the lower (prob) and upper (1 - prob) quantiles used for selecting low|high markers.
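The two kinds of cut-off can be checked with base R. This sketch confirms the tail probability implied by the default limits and shows which quantile cut-offs prob = 0.1 yields on a hypothetical vector of summarised values; it does not call markerEnrichment itself:

```r
limits <- c(-1.96, 1.96)

# two-tailed probability of a standard normal value falling outside the limits
p_two_tailed <- pnorm(limits[1]) + pnorm(limits[2], lower.tail = FALSE)
round(p_two_tailed, 3)  # 0.05

# quantile cut-offs for prob = 0.1 on hypothetical summarised values
prob <- 0.1
values <- c(0.2, 1.5, 3.1, 4.8, 7.9, 9.4, 12.0, 15.3, 18.6, 25.0, 40.2)
cutoffs <- quantile(values, probs = c(prob, 1 - prob))
cutoffs  # 10% = 1.5, 90% = 25
```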

prob

See details for method.

limits

See details for method.

verbose

Logical (TRUE / FALSE), indicating whether or not to print messages to the console.

Details

markerEnrichment first collapses the expression profiles in your input data from the level of cells to the level of clusters, using the mathematical function specified by funcSummarise. It then either selects low|high markers per cluster via quantiles, or transforms the collapsed data to global Z-scores and selects low|high markers based on Z-score cut-offs.
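The two steps described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the toy data are hypothetical, "global" is taken to mean the mean and standard deviation of all collapsed values, and the limits are loosened to suit such a small example.

```r
# Hypothetical toy data: 2 markers (rows) x 6 cells (columns)
expr <- rbind(
  CD1 = c(10, 12, 1, 1, 5, 6),
  CD2 = c(1, 1, 9, 11, 5, 5))
colnames(expr) <- paste0('cell', 1:6)
clus <- c(1, 1, 2, 2, 3, 3)
funcSummarise <- function(x) mean(x, na.rm = TRUE)

# step 1: collapse cells to clusters (one value per marker per cluster)
collapsed <- sapply(split(seq_len(ncol(expr)), clus),
  function(idx) apply(expr[, idx, drop = FALSE], 1, funcSummarise))

# step 2 (Z-score route): scale by the mean and SD of all collapsed values
z <- (collapsed - mean(collapsed)) / sd(collapsed)
limits <- c(-1, 1)  # looser than the documented default of c(-1.96, 1.96)
high <- z > limits[2]
low <- z < limits[1]

# step 2 (quantile route): per-cluster cut-offs at prob and 1 - prob
prob <- 0.1
q_cut <- apply(collapsed, 2, quantile, probs = c(prob, 1 - prob))
```

In this toy case CD1 is flagged high in cluster 1 and low in cluster 2, CD2 shows the reverse, and cluster 3 has no flagged markers.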

Value

A data.frame object.

Author(s)

Kevin Blighe <kevin@clinicalbioinformatics.co.uk>

Examples

# create random data that follows a negative binomial distribution
# (50 cells as rows x 20 markers as columns)
mat <- jitter(matrix(
  MASS::rnegbin(rexp(1000, rate = 0.1), theta = 4.5),
  ncol = 20))
colnames(mat) <- paste0('CD', seq_len(ncol(mat)))
rownames(mat) <- paste0('cell', seq_len(nrow(mat)))

# embed the cells in 2 dimensions with UMAP, then assign clusters with clusKNN
u <- umap::umap(mat)$layout
colnames(u) <- c('UMAP1','UMAP2')
rownames(u) <- rownames(mat)
clus <- clusKNN(u)

# per-cell metadata, one row per cell
metadata <- data.frame(
  group = c(rep('PB1', 25), rep('PB2', 25)),
  row.names = rownames(u))

# transpose so that cells are columns and markers are rows
markerEnrichment(t(mat), meta = metadata,
  sampleAbundances = FALSE,
  studyvarID = 'group', clusterAssign = clus)

kevinblighe/scToolkit documentation built on Sept. 25, 2021, 11:29 p.m.