registration_pseudobulk: Spatial registration: pseudobulk

View source: R/registration_pseudobulk.R

registration_pseudobulkR Documentation

Spatial registration: pseudobulk

Description

Pseudo-bulk the gene expression, filter lowly-expressed genes, and normalize. This is the first step for spatial registration and for statistical modeling.

Usage

registration_pseudobulk(
  sce,
  var_registration,
  var_sample_id,
  covars = NULL,
  min_ncells = 10,
  pseudobulk_rds_file = NULL
)

Arguments

sce

A SingleCellExperiment-class object or one that inherits its properties.

var_registration

A character(1) specifying the colData(sce) variable of interest against which will be used for computing the relevant statistics. This should be a categorical variable, with all categories syntaticly valid (could be used as an R variable, no special characters or leading numbers), ex. 'L1.2', 'celltype2' not 'L1/2' or '2'.

var_sample_id

A character(1) specifying the colData(sce) variable with the sample ID.

covars

A character() with names of sample-level covariates.

min_ncells

An integer(1) greater than 0 specifying the minimum number of cells (for scRNA-seq) or spots (for spatial) that are combined when pseudo-bulking. Pseudo-bulked samples with less than min_ncells on sce_pseudo$ncells will be dropped.

pseudobulk_rds_file

A character(1) specifying the path for saving an RDS file with the pseudo-bulked object. It's useful to specify this since pseudo-bulking can take hours to run on large datasets.

Value

A pseudo-bulked SingleCellExperiment-class object.

See Also

Other spatial registration and statistical modeling functions: registration_block_cor(), registration_model(), registration_stats_anova(), registration_stats_enrichment(), registration_stats_pairwise(), registration_wrapper()

Examples

## Ensure reproducibility of example data
set.seed(20220907)

## Generate example data
sce <- scuttle::mockSCE()

## Add some sample IDs
sce$sample_id <- sample(LETTERS[1:5], ncol(sce), replace = TRUE)

## Add a sample-level covariate: age
ages <- rnorm(5, mean = 20, sd = 4)
names(ages) <- LETTERS[1:5]
sce$age <- ages[sce$sample_id]

## Add gene-level information
rowData(sce)$ensembl <- paste0("ENSG", seq_len(nrow(sce)))
rowData(sce)$gene_name <- paste0("gene", seq_len(nrow(sce)))

## Pseudo-bulk
sce_pseudo <- registration_pseudobulk(sce, "Cell_Cycle", "sample_id", c("age"), min_ncells = NULL)
colData(sce_pseudo)

LieberInstitute/spatialLIBD documentation built on Dec. 19, 2024, 7:12 p.m.