dataset_merge: Merge multiple SummarizedExperiment datasets into one
In omnideconv/SimBu: Simulate Bulk RNA-seq Datasets from Single-Cell Datasets

dataset_merge

R Documentation

Merge multiple SummarizedExperiment datasets into one

Description

Merges a list of SummarizedExperiment datasets into a single dataset. All objects must have the same number of assays for the merge to work.
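The assay-count requirement can be checked before merging. The following is a minimal sketch, assuming the SummarizedExperiment package is installed; `se_a` and `se_b` are hypothetical toy objects standing in for real datasets, and the check itself is an illustration, not part of SimBu.

```r
library(SummarizedExperiment)

# Two hypothetical toy objects, each with two assays (counts and tpm)
m_a <- matrix(1:4, nrow = 2)
m_b <- matrix(1:6, nrow = 2)
se_a <- SummarizedExperiment(assays = list(counts = m_a, tpm = m_a))
se_b <- SummarizedExperiment(assays = list(counts = m_b, tpm = m_b))

# Count the assays in each object and make sure all counts agree
n_assays <- vapply(list(se_a, se_b), function(se) length(assays(se)), integer(1))
if (length(unique(n_assays)) != 1) {
  stop("datasets differ in their number of assays: ",
       paste(n_assays, collapse = ", "))
}
```

If the counts differ, the check stops with a message that lists the number of assays found in each object.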

Usage

dataset_merge(
  dataset_list,
  name = "SimBu_dataset",
  spike_in_col = NULL,
  additional_cols = NULL,
  filter_genes = TRUE,
  variance_cutoff = 0,
  type_abundance_cutoff = 0,
  scale_tpm = TRUE
)

Arguments

`dataset_list`	(mandatory) list of SummarizedExperiment objects
`name`	name of the new dataset
`spike_in_col`	name of the annotation column that contains spike-in counts, which can be used to re-scale counts; mandatory if the spike-in scaling factor is used in the simulation
`additional_cols`	list of annotation column names that should also be stored in the dataset object
`filter_genes`	boolean; if TRUE, removes all genes with zero expression across all samples and all genes with variance below `variance_cutoff`
`variance_cutoff`	numeric; only applied if `filter_genes` is TRUE: removes all genes with variance below the chosen cutoff
`type_abundance_cutoff`	numeric; removes all cells whose cell type appears fewer than the given number of times, which removes low-abundance cell types
`scale_tpm`	boolean; if TRUE (default), the cells in tpm_matrix are scaled to sum to 1e6
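The effect of `type_abundance_cutoff`, `filter_genes`, `variance_cutoff` and `scale_tpm` can be illustrated in base R. This is only a sketch of the behaviour described above, not SimBu's internal implementation: the toy matrix `expr`, the vector `cell_type`, the strict "below" comparisons and the order of the steps are all assumptions made for illustration.

```r
# Hypothetical toy data: 5 genes x 6 cells
expr <- matrix(c(
   0, 0, 0, 0, 0, 0,   # gene_1: zero expression in every cell
   3, 3, 3, 3, 3, 3,   # gene_2: constant, so its variance is 0
   1, 5, 2, 8, 4, 6,
  10, 0, 7, 3, 9, 1,
   2, 2, 4, 4, 6, 6
), nrow = 5, byrow = TRUE,
dimnames = list(paste0("gene_", 1:5), paste0("cell_", 1:6)))
cell_type <- c("T", "T", "T", "B", "B", "NK")

# type_abundance_cutoff: drop cells whose cell type occurs fewer than 2 times
type_abundance_cutoff <- 2
type_counts <- table(cell_type)
keep_cells <- as.vector(type_counts[cell_type] >= type_abundance_cutoff)
expr <- expr[, keep_cells, drop = FALSE]

# filter_genes: drop all-zero genes and genes with variance below the cutoff
variance_cutoff <- 0.5
gene_var <- apply(expr, 1, stats::var)
keep_genes <- rowSums(expr) > 0 & gene_var >= variance_cutoff
expr <- expr[keep_genes, , drop = FALSE]

# scale_tpm: rescale every cell (column) so that it sums to 1e6
scaled <- t(1e6 * t(expr) / colSums(expr))
```

In this sketch the single "NK" cell is dropped by the abundance cutoff, `gene_1` and `gene_2` are dropped by the gene filter, and every remaining column of `scaled` sums to 1e6. With the default `variance_cutoff = 0`, no gene would be removed on variance alone under this reading, because a variance cannot be below 0.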

Value

SummarizedExperiment object

Examples


# Simulate a sparse 1000-gene x 300-cell count matrix and a TPM-like matrix
counts <- Matrix::Matrix(matrix(stats::rpois(3e5, 5), ncol = 300), sparse = TRUE)
tpm <- Matrix::Matrix(matrix(stats::rpois(3e5, 5), ncol = 300), sparse = TRUE)
# Scale each cell (column) so that it sums to 1e6
tpm <- Matrix::t(1e6 * Matrix::t(tpm) / Matrix::colSums(tpm))

# Name the cells (columns) and genes (rows)
colnames(counts) <- paste0("cell_", 1:300)
colnames(tpm) <- paste0("cell_", 1:300)
rownames(counts) <- paste0("gene_", 1:1000)
rownames(tpm) <- paste0("gene_", 1:1000)

# Cell annotation: one row per cell, all cells of the same cell type
annotation <- data.frame(
  "ID" = paste0("cell_", 1:300),
  "cell_type" = rep("T cells CD4", 300)
)

# Build two datasets from the same data and merge them into one
ds1 <- SimBu::dataset(annotation = annotation, count_matrix = counts, tpm_matrix = tpm, name = "test_dataset1")
ds2 <- SimBu::dataset(annotation = annotation, count_matrix = counts, tpm_matrix = tpm, name = "test_dataset2")
ds_merged <- SimBu::dataset_merge(list(ds1, ds2))

omnideconv/SimBu documentation built on March 29, 2025, 10:49 p.m.