registration_wrapper: Spatial registration: wrapper function

View source: R/registration_wrapper.R

registration_wrapper R Documentation

Spatial registration: wrapper function

Description

This function is provided for convenience. It runs all the functions required for computing the modeling_results. This is useful for finding marker genes in a new spatially-resolved transcriptomics dataset, whose results can then be explored with run_app(). The results from this function can also be used for spatial registration of sc/snRNA-seq datasets through layer_stat_cor() and related functions.

Usage

registration_wrapper(
  sce,
  var_registration,
  var_sample_id,
  covars = NULL,
  gene_ensembl = NULL,
  gene_name = NULL,
  suffix = "",
  min_ncells = 10,
  pseudobulk_rds_file = NULL
)

Arguments

sce

A SingleCellExperiment-class object or one that inherits its properties.

var_registration

A character(1) specifying the colData(sce) variable of interest against which the relevant statistics will be computed. This should be a categorical variable, with all categories syntactically valid (usable as an R variable name: no special characters or leading numbers), e.g. 'L1.2' or 'celltype2', not 'L1/2' or '2'.
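
One way to check category names before running the wrapper is sketched below using base R's make.names(). The helper name is_syntactic() is hypothetical and not part of spatialLIBD; the labels are the ones quoted above.

```r
## Hypothetical helper (not part of spatialLIBD): TRUE when a label is
## already a syntactically valid R name, that is, make.names() leaves it
## unchanged
is_syntactic <- function(x) {
    x == make.names(x)
}

## Labels taken from the argument description
labels <- c("L1.2", "celltype2", "L1/2", "2")
is_syntactic(labels)

## make.names() shows one possible repair for the invalid labels
make.names(labels[!is_syntactic(labels)])
```

If any label fails the check, recoding the colData(sce) variable before calling registration_wrapper() is likely simpler than tracing an error later.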

var_sample_id

A character(1) specifying the colData(sce) variable with the sample ID.

covars

A character() with names of sample-level covariates.

gene_ensembl

A character(1) specifying the rowData(sce_pseudo) column with the ENSEMBL gene IDs. This will be used by layer_stat_cor().

gene_name

A character(1) specifying the rowData(sce_pseudo) column with the gene names (symbols).

suffix

A character(1) specifying the suffix to use for the F-statistics column. This is particularly useful if you will run this function more than once and want to be able to merge the results.

min_ncells

An integer(1) greater than 0 specifying the minimum number of cells (for scRNA-seq) or spots (for spatial) that are combined when pseudo-bulking. Pseudo-bulked samples with fewer than min_ncells cells or spots, as recorded in sce_pseudo$ncells, will be dropped.

pseudobulk_rds_file

A character(1) specifying the path for saving an RDS file with the pseudo-bulked object. It's useful to specify this since pseudo-bulking can take hours to run on large datasets.

Details

We chose a default of min_ncells = 10 based on section 4.3 of the OSCA book at http://bioconductor.org/books/3.15/OSCA.multisample/multi-sample-comparisons.html. The OSCA authors cite https://doi.org/10.1038/s41467-020-19894-4 as the source for the definition of "very low" as 10. You might want to use registration_pseudobulk() and manually explore sce_pseudo$ncells to choose the best cutoff for your data.
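
As a rough pre-check before pseudo-bulking, the number of cells per group can be tabulated in base R. This is only a sketch under the assumption that each sample by category combination becomes one pseudo-bulked sample; the simulated labels, object names, and probabilities are illustrative and stand in for the real colData(sce) columns. The authoritative counts remain those in sce_pseudo$ncells.

```r
## Simulated cell-level labels standing in for colData(sce) columns;
## all names and values here are illustrative
set.seed(20220907)
sample_id <- sample(LETTERS[1:5], 200, replace = TRUE)
category <- sample(
    c("G0", "G1", "G2M", "S"),
    200,
    replace = TRUE,
    prob = c(0.05, 0.45, 0.25, 0.25)
)

## Number of cells in each sample by category combination
ncells <- table(sample_id, category)
ncells

## Combinations that a given cutoff would drop; combinations with zero
## cells are excluded since they never form a pseudo-bulked sample
min_ncells <- 10
counts_df <- as.data.frame(ncells, responseName = "ncells")
counts_df[counts_df$ncells > 0 & counts_df$ncells < min_ncells, ]
```

Trying a few values of min_ncells this way shows how many groups each cutoff would remove before committing to a long pseudo-bulking run.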

Value

A list() of data.frame() with the statistical results. This is similar to fetch_data("modeling_results").

See Also

Other spatial registration and statistical modeling functions: registration_block_cor(), registration_model(), registration_pseudobulk(), registration_stats_anova(), registration_stats_enrichment(), registration_stats_pairwise()

Examples

## Load the package (already attached when run as a package example)
library("spatialLIBD")

## Ensure reproducibility of example data
set.seed(20220907)

## Generate example data
sce <- scuttle::mockSCE()

## Add some sample IDs
sce$sample_id <- sample(LETTERS[1:5], ncol(sce), replace = TRUE)

## Add a sample-level covariate: age
ages <- rnorm(5, mean = 20, sd = 4)
names(ages) <- LETTERS[1:5]
sce$age <- ages[sce$sample_id]

## Add gene-level information
rowData(sce)$ensembl <- paste0("ENSG", seq_len(nrow(sce)))
rowData(sce)$gene_name <- paste0("gene", seq_len(nrow(sce)))

## Compute all modeling results
example_modeling_results <- registration_wrapper(
    sce,
    var_registration = "Cell_Cycle",
    var_sample_id = "sample_id",
    covars = c("age"),
    gene_ensembl = "ensembl",
    gene_name = "gene_name",
    suffix = "wrapper"
)

## Explore the results from registration_wrapper()
class(example_modeling_results)
length(example_modeling_results)
names(example_modeling_results)
lapply(example_modeling_results, head)

LieberInstitute/spatialLIBD documentation built on Dec. 19, 2024, 7:12 p.m.