align_cds: Align cells from different groups within a cds

View source: R/alignment.R

align_cdsR Documentation

Align cells from different groups within a cds

Description

Data sets that contain cells from different groups often benefit from alignment to subtract differences between them. Alignment can be used to remove batch effects, subtract the effects of treatments, or even potentially compare across species. align_cds executes alignment and stores these adjusted coordinates.

This function can be used to subtract both continuous and discrete batch effects. For continuous effects, align_cds fits a linear model to the cells' PCA or LSI coordinates and subtracts them using Limma. For discrete effects, you must provide a grouping of the cells, and then these groups are aligned using Batchelor, a "mutual nearest neighbor" algorithm described in:

Haghverdi L, Lun ATL, Morgan MD, Marioni JC (2018). "Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors." Nat. Biotechnol., 36(5), 421-427. doi: 10.1038/nbt.4091

Usage

align_cds(
  cds,
  preprocess_method = c("PCA", "LSI"),
  alignment_group = NULL,
  alignment_k = 20,
  residual_model_formula_str = NULL,
  verbose = FALSE,
  build_nn_index = FALSE,
  nn_control = list(),
  ...
)

Arguments

cds

the cell_data_set upon which to perform this operation

preprocess_method

a string specifying the low-dimensional space in which to perform alignment, currently either PCA or LSI. Default is "PCA".

alignment_group

String specifying a column of colData to use for aligning groups of cells. The column specified must be a factor. Alignment can be used to subtract batch effects in a non-linear way. For correcting continuous effects, use residual_model_formula_str. Default is NULL.

alignment_k

The value of k used in mutual nearest neighbor alignment

residual_model_formula_str

NULL or a string model formula specifying any effects to subtract from the data before dimensionality reduction. Uses a linear model to subtract effects. For non-linear effects, use alignment_group. Default is NULL.

verbose

Whether to emit verbose output during dimensionality reduction

build_nn_index

logical When this argument is set to TRUE, align_cds builds the nearest neighbor index from the aligned reduced matrix for later use. Default is FALSE.

nn_control

An optional list of parameters used to make the nearest neighbor index. See the set_nn_control help for detailed information.

...

additional arguments to pass to limma::lmFit if residual_model_formula is not NULL

Value

an updated cell_data_set object

Examples

  
    cell_metadata <- readRDS(system.file('extdata',
                                         'worm_embryo/worm_embryo_coldata.rds',
                                         package='monocle3'))
    gene_metadata <- readRDS(system.file('extdata',
                                         'worm_embryo/worm_embryo_rowdata.rds',
                                         package='monocle3'))
    expression_matrix <- readRDS(system.file('extdata',
                                             'worm_embryo/worm_embryo_expression_matrix.rds',
                                             package='monocle3'))
   
    cds <- new_cell_data_set(expression_data=expression_matrix,
                             cell_metadata=cell_metadata,
                             gene_metadata=gene_metadata)

    cds <- preprocess_cds(cds)
    cds <- align_cds(cds, alignment_group =
                     "batch", residual_model_formula_str = "~ bg.300.loading +
                      bg.400.loading + bg.500.1.loading + bg.500.2.loading +
                      bg.r17.loading + bg.b01.loading + bg.b02.loading")
  

cole-trapnell-lab/monocle3 documentation built on April 7, 2024, 9:24 p.m.