se_rbind: Combine SummarizedExperiment objects by row
In jmw86069/jamses: Jam SummarizedExperiment Stats

Description

Combine SummarizedExperiment objects by row, using rbind() logic.

Usage

se_rbind(
  se_list,
  colnames_from = "_(n|p|neg|pos)_",
  colnames_to = "_X_",
  colnames_keep = NULL,
  colData_action = c("identical", "all"),
  colData_sep = ";",
  verbose = FALSE,
  ...
)

Arguments

`se_list`	`list` of `SummarizedExperiment` objects.
`colnames_from`	`character` vector of patterns used with `gsub()` to convert `colnames()` for each object in `se_list` to an identifier that will be shared across all objects in `se_list`.
`colnames_to`	`character` vector of replacements used with `gsub()` alongside each entry in `colnames_from` to convert `colnames()` for each object in `se_list` to an identifier that will be shared across all objects in `se_list`.
`colData_action`	`character` string indicating the action used to combine `colData()` across `se_list`: `"identical"`: retain only those columns in `colData()` which are identical in all `se_list` objects. `"all"`: retain all columns, but convert columns with mismatched values to `character` and store the differing values delimited by `colData_sep`.
`colData_sep`	`character` string used as delimiter when `colData_action="all"` and when values in a column in `colData()` differ across objects in `se_list`. Only values that differ are delimited, to minimize redundancy.
`...`	additional arguments are ignored.
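
As an illustration of how `colnames_from` and `colnames_to` behave as `gsub()` pattern and replacement pairs, the following is a minimal base R sketch. The helper `rename_shared()` is hypothetical and not part of jamses; it assumes each pattern is applied in order and that a shorter `to` vector is recycled.

```r
# Hypothetical helper (not part of jamses): apply each pattern/replacement
# pair in order using gsub(). Recycling of 'to' is an assumption.
rename_shared <- function(x, from="_(n|p|neg|pos)_", to="_X_") {
   to <- rep(to, length.out=length(from))
   for (i in seq_along(from)) {
      x <- gsub(from[[i]], to[[i]], x)
   }
   x
}

old_names <- c("sample_p_12", "sample_n_12", "sample_neg_3", "sample_pos_3")
new_names <- rename_shared(old_names)
new_names
# "sample_X_12" "sample_X_12" "sample_X_3" "sample_X_3"
```

With the default pattern, positive and negative mode names collapse to the same identifier, which is what allows columns to be matched across objects.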

Details

This function is intended to simplify calling SummarizedExperiment::rbind() on objects whose colnames() or colData() do not match exactly.

The process:

1. Convert colnames() for each entry in se_list using colnames_from and colnames_to. This step is useful when objects in se_list use different sets of colnames(). For example "sample_p_12" and "sample_n_12" might be equivalent, so renaming them with colnames_from=c("_[np]_") and colnames_to=c("_X_") would convert both values to "sample_X_12".
2. Subset each object in se_list to the colnames() shared by all objects.
3. Determine how to handle colData() columns that are not identical:
   - colData_action="identical": keep only columns whose values are identical across all objects in se_list.
   - colData_action="all": keep all columns in colData(); non-identical columns are converted to character, and differing values are joined using colData_sep (default ";").
4. Perform rbind().
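
The renaming, subsetting, and row-binding steps above can be sketched with plain matrices in base R. This is only an illustration of the logic under simplified assumptions (matrices stand in for SummarizedExperiment objects, and colData() handling is omitted); it is not the jamses implementation.

```r
# Two toy matrices whose column names differ only by the "_p_"/"_n_" tag.
m1 <- matrix(1:6, ncol=3,
   dimnames=list(c("row_1", "row_2"), paste0("sample_p_", 1:3)))
m2 <- matrix(7:12, ncol=3,
   dimnames=list(c("row_3", "row_4"), paste0("sample_n_", 2:4)))
mat_list <- list(m1, m2)

# Convert colnames to a shared identifier.
mat_list <- lapply(mat_list, function(m) {
   colnames(m) <- gsub("_[np]_", "_X_", colnames(m))
   m
})

# Keep only the colnames present in every matrix, in a consistent order.
shared <- Reduce(intersect, lapply(mat_list, colnames))
mat_list <- lapply(mat_list, function(m) m[, shared, drop=FALSE])

# Row-bind the aligned matrices.
combined <- do.call(rbind, mat_list)
colnames(combined)
# "sample_X_2" "sample_X_3"
```

Here only "sample_X_2" and "sample_X_3" occur in both inputs, so the other columns are dropped before binding.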

TODO:

Write equivalent se_cbind() - it will wait until there is a driving use case.
Consider retaining only shared assayNames() across se_list.
Consider optionally retaining user-defined assayNames(). (Alternatively, the user can subset the assayNames upfront, though it might be tedious). The recommended pattern in that case:

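# assay_names: user-defined character vector of assayNames() to retain
se <- se_rbind(
   se_list=lapply(se_list, function(se){
      SummarizedExperiment::assays(se) <- SummarizedExperiment::assays(se)[assay_names]
      return(se)
   })
)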

Value

SummarizedExperiment object whose colData() has been processed according to colData_action - either keeping only columns with identical values, or keeping all values delimited as a character string when values differ.
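
To make the two colData_action outcomes concrete, here is a minimal base R sketch using ordinary data.frames in place of colData(). The function `merge_coldata()` is hypothetical and not part of jamses; it assumes rows are already aligned across inputs and that differing values are reduced to unique entries before joining with the separator.

```r
# Hypothetical helper (not part of jamses), illustrating both actions.
merge_coldata <- function(cd_list, action=c("identical", "all"), sep=";") {
   action <- match.arg(action)
   cols <- Reduce(intersect, lapply(cd_list, colnames))
   out <- lapply(cols, function(col) {
      vals <- lapply(cd_list, function(cd) as.character(cd[[col]]))
      same <- all(vapply(vals[-1], identical, logical(1), vals[[1]]))
      if (same) return(cd_list[[1]][[col]])
      if (action == "identical") return(NULL)
      # Join values row by row; identical values within a row are kept once.
      vapply(seq_along(vals[[1]]), function(i) {
         paste(unique(vapply(vals, `[[`, character(1), i)), collapse=sep)
      }, character(1))
   })
   names(out) <- cols
   out <- out[!vapply(out, is.null, logical(1))]
   data.frame(out, row.names=rownames(cd_list[[1]]), stringsAsFactors=FALSE)
}

ids <- c("sample_X_1", "sample_X_2")
cd1 <- data.frame(sample=c("sample_p_1", "sample_p_2"),
   sample_id=ids, batch=c("A", "A"), row.names=ids)
cd2 <- data.frame(sample=c("sample_n_1", "sample_n_2"),
   sample_id=ids, batch=c("A", "B"), row.names=ids)

cd_identical <- merge_coldata(list(cd1, cd2), action="identical")
cd_all <- merge_coldata(list(cd1, cd2), action="all")
colnames(cd_identical)
# "sample_id"
cd_all$batch
# "A" "A;B"
```

With action="identical" only sample_id survives, while action="all" keeps every column and delimits only the rows whose values actually differ.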

Examples

m1 <- matrix(rnorm(100), ncol=10);
colnames(m1) <- paste0("sample_p_", 1:10);
rownames(m1) <- paste0("row_", 1:10);
m2 <- matrix(rnorm(100), ncol=10);
colnames(m2) <- paste0("sample_n_", 1:10);
rownames(m2) <- paste0("row_", 11:20);
sample_id <- gsub("_[np]_", "_X_", colnames(m1));
m1
m2
se1 <- SummarizedExperiment::SummarizedExperiment(
   assays=list(counts=m1),
   rowData=data.frame(measurement=rownames(m1)),
   colData=data.frame(sample=colnames(m1),
      sample_id=sample_id))
se2 <- SummarizedExperiment::SummarizedExperiment(
   assays=list(counts=m2),
   rowData=data.frame(measurement=rownames(m2)),
   colData=data.frame(sample=colnames(m2),
      sample_id=sample_id))
# this step fails because colnames are not shared
# do.call(SummarizedExperiment::rbind, list(se1, se2))

# keep only identical colData columns
se12 <- se_rbind(list(se1, se2))
SummarizedExperiment::colData(se12)

# keep all colData columns
se12all <- se_rbind(list(se1, se2),
   colData_action="all")
SummarizedExperiment::colData(se12all)

jmw86069/jamses documentation built on Nov. 4, 2024, 9:25 p.m.