merge_sce_list: Merge a list of SCEs into one SCE object

View source: R/merge_sce_list.R

merge_sce_list R Documentation

Merge a list of SCEs into one SCE object

Description

This function takes an optionally-named (if named, ideally by a form of library ID) list of SingleCellExperiment (SCE) objects and merges them into one SCE object. At least some genes must be present in all SCEs in order to merge them. By default, alternative experiments (altExps) are retained in the final merged object, but each altExp of a given name is required to have identical features.
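
The shared-gene requirement can be checked before attempting a merge. The following is a minimal base R sketch, assuming the gene identifiers of each SCE have already been extracted (for example with rownames()) into a hypothetical list called `gene_ids`. It illustrates the requirement only and is not the function's internal code.

```r
# Hypothetical gene identifiers, one character vector per SCE
gene_ids <- list(
  sce1 = c("ENSG01", "ENSG02", "ENSG03"),
  sce2 = c("ENSG02", "ENSG03", "ENSG04")
)

# Genes that are present in every object in the list
shared_genes <- Reduce(intersect, gene_ids)

# A merge is only possible when at least one gene is shared
if (length(shared_genes) == 0) {
  stop("No genes are shared across all SCEs; they cannot be merged.")
}
shared_genes
```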

Usage

merge_sce_list(
  sce_list = list(),
  batch_column = "library_id",
  cell_id_column = "cell_id",
  retain_coldata_cols = c("sum", "detected", "total", "subsets_mito_sum",
    "subsets_mito_detected", "subsets_mito_percent", "miQC_pass", "prob_compromised",
    "barcodes"),
  include_altexp = TRUE,
  preserve_rowdata_cols = NULL,
  retain_altexp_coldata_cols = NULL,
  preserve_altexp_rowdata_cols = NULL
)

Arguments

sce_list

A list of SingleCellExperiment objects. The list may optionally be named with batch information. If no names are provided, names will be generated based on the SCE's index.
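
As a rough illustration of index-based naming, the base R sketch below names an unnamed list by position. The exact names that merge_sce_list generates are an assumption here; plain strings stand in for SCE objects.

```r
# Plain strings stand in for SCE objects in this sketch
sce_list <- list("first object", "second object")

# Assumption: unnamed lists are named by position ("1", "2", ...)
if (is.null(names(sce_list))) {
  names(sce_list) <- seq_along(sce_list)
}
names(sce_list)
```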

batch_column

A character value giving the resulting colData column name to differentiate originating SingleCellExperiment objects. Often these values are unique library IDs. Default value is '"library_id"'.

cell_id_column

A character value giving the resulting colData column name to hold unique cell IDs formatted as their original row name. Default value is '"cell_id"'.

retain_coldata_cols

A vector of colData columns which should be retained in the final merged SCE object. If columns are missing from any SCE to be merged, they will be created and populated with 'NA' values. A vector of default columns to retain is given in the function definition.
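
The NA-filling behavior described above can be sketched in base R. A plain data frame stands in for an SCE's colData, and the names `coldata` and `retain_cols` are hypothetical; this is an illustration of the documented outcome, not the package's implementation.

```r
# Hypothetical colData stand-in that lacks the "miQC_pass" column
coldata <- data.frame(
  sum = c(1200, 950),
  detected = c(300, 280),
  extra = c("a", "b"),
  row.names = c("AAAC", "AAAG")
)
retain_cols <- c("sum", "detected", "miQC_pass")

# Create any requested columns that are missing and fill them with NA
missing_cols <- setdiff(retain_cols, colnames(coldata))
coldata[missing_cols] <- NA

# Keep only the requested columns, in the requested order
coldata <- coldata[, retain_cols, drop = FALSE]
coldata
```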

include_altexp

Boolean for whether altExps, if present, should be included in the final merged object. Default is 'TRUE'.

preserve_rowdata_cols

A vector of column names that appear in originating SCE objects' rowData slots which should not be renamed. These are generally columns which are not specific to the given library's preparation or statistics. For example, such a vector might contain items like "Gene", "ensembl_ids", etc. Default value is 'NULL'.

retain_altexp_coldata_cols

Named list containing vectors of column names that should be retained in altExp colData. Elements are named by the altExp for which the columns should be retained. If any given altExp name is not present in any SCE, it will be ignored. If columns are missing from any given altExp to be merged, they will be created and populated with 'NA' values. Default value is 'NULL'.
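
To illustrate how entries for absent altExps are ignored, the base R sketch below filters the named list down to altExps that exist. The vector `altexp_names_present` is a hypothetical stand-in for the altExp names found in the input SCEs, and the filtering shown is an assumption about the effect, not the package's code.

```r
retain_altexp_coldata_cols <- list(
  adt = c("discard", "high.controls"),
  other_altexp = c("first_column", "second_column")
)

# Hypothetical: altExp names actually found in the SCEs to be merged
altexp_names_present <- c("adt")

# Entries named for altExps that are not present are dropped
used_cols <- retain_altexp_coldata_cols[
  names(retain_altexp_coldata_cols) %in% altexp_names_present
]
names(used_cols)
```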

preserve_altexp_rowdata_cols

Named list containing vectors of column names that should not be renamed in altExp rowData. These are generally columns which are not specific to the given library's preparation or statistics. Default value is 'NULL'.

Details

Original SCE contents are modified or retained as follows:

- The resulting colData slot will include a new column, named by 'batch_column' (default "library_id"), that holds either the originating SCE object's name (referred to as 'sce_name' here) or, if the list is unnamed, its index in the provided 'sce_list'.
- The resulting colData slot will include another new column, named by 'cell_id_column' (default "cell_id"), that contains the SCE's original column names (i.e. original colData row names). Often, but not always, these hold unique cell barcodes.
- Of the original colData columns, only those named in the argument 'retain_coldata_cols' will be retained.
- The resulting colData row names will be prefixed with '{sce_name}-'.
- The resulting rowData column names will be prefixed with the given SCE's name, as '{sce_name}-{column_name}', except for columns listed in the 'preserve_rowdata_cols' argument.
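
The main-experiment naming rules can be sketched on plain data frames in base R. The data frames `coldata` and `rowdata` are hypothetical stand-ins for one SCE's colData and rowData, and the column names and values are made up for illustration; the code mirrors the documented result rather than the package's internals.

```r
sce_name <- "sce1"
batch_column <- "library_id"
cell_id_column <- "cell_id"
preserve_rowdata_cols <- "gene_symbol"

# Stand-in for colData: one row per cell, row names are barcodes
coldata <- data.frame(sum = c(1200, 950), row.names = c("AAAC", "AAAG"))
coldata[[batch_column]] <- sce_name             # originating SCE name
coldata[[cell_id_column]] <- rownames(coldata)  # original cell IDs
rownames(coldata) <- paste(sce_name, rownames(coldata), sep = "-")

# Stand-in for rowData: one row per gene
rowdata <- data.frame(
  gene_symbol = c("GeneA", "GeneB"),
  mean = c(0.2, 1.5),
  row.names = c("ENSG01", "ENSG02")
)

# Prefix every column name except those marked for preservation
rename <- !(colnames(rowdata) %in% preserve_rowdata_cols)
colnames(rowdata)[rename] <- paste(
  sce_name, colnames(rowdata)[rename],
  sep = "-"
)
```

After this runs, the colData row names read "sce1-AAAC" and "sce1-AAAG", and the rowData columns are "gene_symbol" and "sce1-mean".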

SCE altExp contents are modified or retained as follows:

- As with the main experiment, the additional columns 'batch_column' and 'cell_id_column' will be added to the colData slot.
- Of the original altExp colData columns, only those named in the argument 'retain_altexp_coldata_cols', as specified for each named altExp, will be retained.
- The resulting rowData column names will be prefixed with the given SCE's name, as '{sce_name}-{column_name}', except for columns listed in the 'preserve_altexp_rowdata_cols' argument.

Value

A SingleCellExperiment object containing all SingleCellExperiment objects present in the input list.

Examples

## Not run: 
# Merge list of SCEs, specifying a different batch column name
merge_sce_list(
  sce_list = list("sce1" = sce1, "sce2" = sce2),
  batch_column = "batch"
)


# Merge list of SCEs but do not include any alternative experiments in the merged object
merge_sce_list(
  sce_list = list("sce1" = sce1, "sce2" = sce2),
  include_altexp = FALSE
)


# Merge list of SCEs and include alternative experiments in the merged object.
# The provided list of SCEs may contain alternative experiments named `"adt"`
#  and/or `"other_altexp"` which, if present, are expected to respectively
#  have the colData columns given in the `retain_altexp_coldata_cols` and the
#  rowData columns given in `preserve_altexp_rowdata_cols`
merge_sce_list(
  sce_list = list("sce1" = sce1, "sce2" = sce2),
  # columns to retain in the given alternative experiment, if it is present
  retain_altexp_coldata_cols = list(
    "adt" = c("discard", "high.controls"),
    "other_altexp" = c("first_column", "second_column")
  ),
  preserve_altexp_rowdata_cols = list(
    "adt" = "target_type"
  )
)

## End(Not run)


AlexsLemonade/scpcaTools documentation built on July 12, 2024, 8:34 a.m.