StabMap: Stabilised mosaic single cell data integration using unshared...

stabMapR Documentation

Stabilised mosaic single cell data integration using unshared features

Description

stabMap performs mosaic data integration by first building a mosaic data topology, and for each reference dataset, traverses the topology to project and predict data onto a common principal component (PC) or linear discriminant (LD) embedding.

Usage

stabMap(
  assay_list,
  labels_list = NULL,
  reference_list = sapply(names(assay_list), function(x) TRUE, simplify = FALSE),
  reference_features_list = lapply(assay_list, rownames),
  reference_scores_list = NULL,
  ncomponentsReference = 50,
  ncomponentsSubset = 50,
  suppressMessages = TRUE,
  projectAll = FALSE,
  restrictFeatures = FALSE,
  maxFeatures = 1000,
  plot = TRUE,
  scale.center = TRUE,
  scale.scale = TRUE
)

Arguments

assay_list

A list of data matrices with rownames (features) specified.

labels_list

(optional) named list containing cell labels

reference_list

Named list containing logical values whether the data matrix should be considered as a reference dataset, alternatively a character vector containing the names of the reference data matrices.

reference_features_list

List of features to consider as reference data (default is all available features).

reference_scores_list

Named list of reference scores (default NULL). If provided, matrix of cells (rows with rownames given) and dimensions (columns with colnames given) are used as the reference low-dimensional embedding to target, as opposed to performing PCA or LDA on the input reference data.

ncomponentsReference

Number of principal components for embedding reference data, given either as an integer or a named list for each reference dataset.

ncomponentsSubset

Number of principal components for embedding query data prior to projecting to the reference, given either as an integer or a named list for each reference dataset.

suppressMessages

Logical whether to suppress messages (default TRUE).

projectAll

Logical whether to re-project reference data along with query (default FALSE).

restrictFeatures

logical whether to restrict to features used in dimensionality reduction of reference data (default FALSE). Overall it's recommended that this be FALSE for single-hop integrations and TRUE for multi-hop integrations.

maxFeatures

Maximum number of features to consider for predicting principal component scores (default 1000).

plot

Logical whether to plot mosaic data UpSet plot and mosaic data topology networks (default TRUE).

scale.center

Logical whether to re-center data to a mean of 0 (default FALSE).

scale.scale

Logical whether to re-scale data to standard deviation of 1 (default FALSE).

Value

matrix containing common embedding with rows corresponding to cells, and columns corresponding to PCs or LDs for reference dataset(s).

Examples

set.seed(2021)
assay_list = mockMosaicData()
lapply(assay_list, dim)

# specify which datasets to use as reference coordinates
reference_list = c("D1", "D3")

# specify some sample labels to distinguish using linear discriminant
# analysis (LDA)
labels_list = list(
D1 = rep(letters[1:5], length.out = ncol(assay_list[["D1"]]))
)

# examine the topology of this mosaic data integration
mosaicDataUpSet(assay_list)
plot(mosaicDataTopology(assay_list))

# stabMap
out = stabMap(assay_list,
              reference_list = reference_list,
              labels_list = labels_list,
              ncomponentsReference = 20,
              ncomponentsSubset = 20)

head(out)


MarioniLab/StabMap documentation built on Sept. 28, 2022, 2:28 a.m.