Bioconductor package to visualize spatially-resolved transcriptomics data

gene_set_enrichment

R Documentation

Evaluate the enrichment for a list of gene sets

Description

Using the layer-level (group-level) data, this function evaluates whether each gene set in a list (Ensembl gene IDs) is enriched among the significant genes (FDR < 0.1 by default) for a given model type result. It tests the alternative hypothesis that OR > 1, i.e. that the gene set is over-represented in the set of enriched genes. If you want to check depleted genes, change reverse to TRUE.
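The one-sided test described above can be sketched with base R on toy data. The gene IDs, t-statistics, and FDR values below are made up for illustration, and the layout of the 2 x 2 table is an assumption about how such a test is commonly set up, not a copy of the package internals.

```r
## Toy modeling results for a single layer (all values are hypothetical)
toy <- data.frame(
    ensembl = paste0("ENSG", 1:10),
    t_stat = c(5, 4, 3, 2.5, -3, 0.5, -0.2, 0.1, -1, 0.3),
    fdr = c(0.01, 0.02, 0.03, 0.05, 0.04, 0.6, 0.9, 0.8, 0.5, 0.7)
)

## Hypothetical gene set of Ensembl-like IDs
gene_set <- c("ENSG1", "ENSG2", "ENSG3", "ENSG6")

fdr_cut <- 0.1
in_set <- toy$ensembl %in% gene_set
## Enriched genes: significant at the FDR cutoff and with a positive statistic
enriched <- toy$fdr < fdr_cut & toy$t_stat > 0

## 2 x 2 contingency table: gene set membership vs enriched status
tab <- table(
    in_set = factor(in_set, levels = c(FALSE, TRUE)),
    enriched = factor(enriched, levels = c(FALSE, TRUE))
)

## One-sided Fisher's exact test of the alternative hypothesis OR > 1
res <- stats::fisher.test(tab, alternative = "greater")
res$estimate
res$p.value
```

In this toy example, `sum(in_set & enriched)` plays the role of NumSig and `sum(in_set)` the role of SetSize.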

Usage

gene_set_enrichment(
  gene_list,
  fdr_cut = 0.1,
  modeling_results = fetch_data(type = "modeling_results"),
  model_type = names(modeling_results)[1],
  reverse = FALSE
)

Arguments

`gene_list`	A named `list` object (could be a `data.frame`) where each element of the list is a character vector of Ensembl gene IDs.
`fdr_cut`	A `numeric(1)` specifying the FDR cutoff to use for determining significance among the modeling results genes.
`modeling_results`	Defaults to the output of `fetch_data(type = 'modeling_results')`. This is a list of tables with the columns `f_stat_` or `t_stat_` as well as `p_value_` and `fdr_` plus `ensembl`. The column name is used to extract the statistic results, the p-values, and the FDR adjusted p-values. Then the `ensembl` column is used for matching in some cases. See `fetch_data()` for more details. Typically this is the set of reference statistics used in `layer_stat_cor()`.
`model_type`	A named element of the `modeling_results` list. By default the options are `enrichment`, for the model that tests one human brain layer against the rest (one group vs the rest); `pairwise`, which compares two layers (groups) denoted by `layerA-layerB` such that `layerA` is greater than `layerB`; and `anova`, which determines if any layer (group) is different from the rest, adjusting for the mean expression level. The statistics for `enrichment` and `pairwise` are t-statistics, while the `anova` model ones are F-statistics.
`reverse`	A `logical(1)` indicating whether to multiply the input statistics by `-1` and reverse the `layerA-layerB` column names (using the `-`) into `layerB-layerA`.
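The effect of `reverse = TRUE` can be illustrated with a minimal base R sketch. The column name `t_stat_Layer1-Layer2`, the values, and the helper `swap_contrast()` are hypothetical and only mimic the documented behavior; they are not the package's own code.

```r
## Toy t-statistics for one pairwise contrast (hypothetical values)
toy_stats <- data.frame(
    "t_stat_Layer1-Layer2" = c(2.1, -0.4, 3.3),
    check.names = FALSE
)

## Flip the sign of the statistics
rev_stats <- toy_stats * -1

## Hypothetical helper: swap the two sides of each "layerA-layerB" name
swap_contrast <- function(x) {
    parts <- strsplit(sub("^t_stat_", "", x), "-", fixed = TRUE)
    swapped <- vapply(
        parts,
        function(p) paste(rev(p), collapse = "-"),
        character(1)
    )
    paste0("t_stat_", swapped)
}
colnames(rev_stats) <- swap_contrast(colnames(toy_stats))
rev_stats
```

After the swap, a positive statistic means higher expression in what is now the first layer of the renamed contrast.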

Details

Check https://github.com/LieberInstitute/HumanPilot/blob/master/Analysis/Layer_Guesses/check_clinical_gene_sets.R to see the full script from which this family of functions is derived.

Value

A table in long format with the enrichment results using stats::fisher.test().

OR odds ratio.
Pval p-value for fisher.test().
test group or layer in the modeling_results.
NumSig Number of genes from the gene set present in modeling_results & with fdr < fdr_cut and t_stat > 0 (unless reverse = TRUE) for test in modeling results.
SetSize Number of genes from modeling_results present in gene_set.
ID name of gene set.
model_type record of input model type from modeling_results.
fdr_cut record of input fdr_cut.
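As a rough sketch of how a long-format table with these columns might be explored, the base R code below filters and reshapes a toy table. All values, gene set names, and the 0.05 threshold are made up for illustration.

```r
## Toy long-format table mimicking the documented columns (values are made up)
toy_res <- data.frame(
    OR = c(2.5, 0.8, 1.9, 1.1),
    Pval = c(0.001, 0.7, 0.03, 0.4),
    test = c("Layer1", "Layer2", "Layer1", "Layer2"),
    NumSig = c(20, 5, 12, 7),
    SetSize = c(100, 100, 60, 60),
    ID = c("set_A", "set_A", "set_B", "set_B")
)

## Keep rows below an illustrative p-value threshold, smallest p-value first
sig <- toy_res[toy_res$Pval < 0.05, ]
sig <- sig[order(sig$Pval), ]
sig

## Reshape the p-values into a wide gene set x test matrix
pval_wide <- tapply(toy_res$Pval, list(toy_res$ID, toy_res$test), min)
pval_wide
```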

Author(s)

Andrew E Jaffe, Leonardo Collado-Torres

Examples


## Read in the SFARI gene sets included in the package
asd_sfari <- utils::read.csv(
    system.file(
        "extdata",
        "SFARI-Gene_genes_01-03-2020release_02-04-2020export.csv",
        package = "spatialLIBD"
    ),
    as.is = TRUE
)

## Format them appropriately
asd_sfari_geneList <- list(
    Gene_SFARI_all = asd_sfari$ensembl.id,
    Gene_SFARI_high = asd_sfari$ensembl.id[asd_sfari$gene.score < 3],
    Gene_SFARI_syndromic = asd_sfari$ensembl.id[asd_sfari$syndromic == 1]
)

## Obtain the necessary data
if (!exists("modeling_results")) {
    modeling_results <- fetch_data(type = "modeling_results")
}

## Compute the gene set enrichment results
asd_sfari_enrichment <- gene_set_enrichment(
    gene_list = asd_sfari_geneList,
    modeling_results = modeling_results,
    model_type = "enrichment"
)

## Explore the results
asd_sfari_enrichment

LieberInstitute/spatialLIBD documentation built on April 14, 2025, 5:19 a.m.