multiModalMNN: Multi-modal batch correction with MNNs
In LTLA/mumosa: Multi-Modal Single-Cell Analysis Methods

View source: R/multiModalMNN.R

multiModalMNN

R Documentation

Multi-modal batch correction with MNNs

Description

Perform MNN correction on multi-modal data, based on a generalization of fastMNN to multiple feature sets.

Usage

multiModalMNN(
  ...,
  batch = NULL,
  which = NULL,
  rescale.k = 50,
  common.args = list(),
  main.args = list(),
  alt.args = list(),
  mnn.args = list(),
  BNPARAM = KmknnParam(),
  BPPARAM = SerialParam()
)

Arguments

`...`	One or more SingleCellExperiment objects, containing a shared set of alternative Experiments corresponding to different data modalities. Alternatively, one or more lists of such objects.
`batch`	Factor specifying the batch to which each cell belongs, when `...` contains only one SingleCellExperiment object. Otherwise, each object in `...` is assumed to contain cells from a single batch.
`which`	Character vector containing the names of the alternative Experiments to use for correction. Defaults to the names of all alternative Experiments that are present in every object of `...`.
`rescale.k`	Integer scalar specifying the number of neighbors to use in `rescaleByNeighbors`.
`common.args`	Named list of further arguments to control the PCA for all modalities.
`main.args`	Named list of further arguments to control the PCA for the main Experiments. Overrides any arguments of the same name in `common.args`.
`alt.args`	Named list of named lists of further arguments to control the PCA for each alternative Experiment specified in `which`. This should be a list where each entry is named after any alternative Experiment and contains an internal list of named arguments; these override any settings in `common.args` in the PCA for the corresponding modality.
`mnn.args`	Further arguments to pass to `reducedMNN`, controlling the MNN correction.
`BNPARAM`	A BiocNeighborParam object specifying how the nearest neighbor searches should be performed.
`BPPARAM`	A BiocParallelParam object specifying how parallelization should be performed.

Details

This function implements a multi-modal MNN correction for SingleCellExperiment inputs where each main and alternative Experiment corresponds to one modality. We perform a PCA within each modality with multiBatchPCA, rescale the PCs to be of a comparable scale with rescaleByNeighbors, and finally correct in low-dimensional space with reducedMNN. Corrected expression values for each modality are then recovered in the same manner as described for fastMNN.

Modality-specific arguments can be passed to the PCA step via the common.args, main.args and alt.args arguments. These mirror the corresponding arguments in applyMultiSCE - see the documentation for that function for more details. Additional arguments for the MNN step can be passed via mnn.args. Note that batch is used across all steps and must be specified as its own argument in the multiModalMNN function signature.

Most arguments in multiBatchPCA can be specified in common.args, main.args or each entry of alt.args. This includes passing d=NA to turn off the PCA or subset.row to only use a subset of features for the PCA. Additionally, the following arguments are supported:

By default, a cosine-normalization is performed prior to the PCA for each modality. This can be disabled by passing cos.norm=FALSE to common.args, main.args or each entry of alt.args.
Setting correct.all will reported corrected expression values for all features even when subset.row is specified. This can be used in common.args, main.args or each entry of alt.args.

Note that the function will look for assay.type="logcounts" by default in each entry of .... Users should perform log-normalization prior to calling multiModalMNN, most typically with multiBatchNorm - see Examples.

Value

A SingleCellExperiment of the same structure as that returned by fastMNN, i.e., with a corrected entry of corrected low-dimensional coordinates and a reconstructed assay of corrected expression values. In addition, the altExps entries contain corrected values for each data modality used in the correction.

Author(s)

Aaron Lun

Examples

# Mocking up a gene expression + ADT dataset:
library(scater)
exprs_sce <- mockSCE()
adt_sce <- mockSCE(ngenes=20) 
altExp(exprs_sce, "ADT") <- adt_sce

# Pretend we have three batches for the sake of demonstration:
batch <- sample(1:3, ncol(exprs_sce), replace=TRUE)

# Normalizing first with batchelor::multiBatchNorm:
library(batchelor)
exprs_sce <- applyMultiSCE(exprs_sce, batch=batch, FUN=multiBatchNorm)

# and perform batch correction:
corrected <- multiModalMNN(exprs_sce, batch=batch, which="ADT")

# Pass arguments to, e.g., use a subset of features for the RNA,
# turn off the PCA for the ADTs:
corrected2 <- multiModalMNN(exprs_sce, batch=batch, which="ADT",
    main.args=list(subset.row=1:500), 
    alt.args=list(ADT=list(d=NA)))

LTLA/mumosa documentation built on Oct. 1, 2024, 8:47 a.m.