controlled_geneset_enrichment: Celltype controlled geneset enrichment
In NathanSkene/EWCE: Expression Weighted Celltype Enrichment



Description

controlled_geneset_enrichment tests whether a functional gene set is still enriched in a disease gene set after controlling for the disease gene set's enrichment in a particular cell type (the 'controlledCT').

Usage

controlled_geneset_enrichment(
  disease_genes,
  functional_genes,
  bg = NULL,
  sct_data,
  sctSpecies = NULL,
  output_species = "human",
  disease_genes_species = NULL,
  functional_genes_species = NULL,
  method = "homologene",
  annotLevel,
  reps = 100,
  controlledCT,
  use_intersect = FALSE,
  verbose = TRUE
)

Arguments

`disease_genes`	Array of gene symbols containing the disease gene list. Does not have to be disease genes. Must be from the same species as the single cell transcriptome dataset.
`functional_genes`	Array of gene symbols containing the functional gene list. The enrichment of this gene set within the disease_genes is tested. Must be from the same species as the single cell transcriptome dataset.
`bg`	List of gene symbols containing the background gene list (including hit genes). If `bg=NULL`, an appropriate gene background will be created automatically.
`sct_data`	List generated using generate_celltype_data.
`sctSpecies`	Species that `sct_data` is currently formatted as (no longer limited to just "mouse" and "human"). See list_species for all available species.
`output_species`	Species to convert `sct_data` and `hits` to (Default: "human"). See list_species for all available species.
`disease_genes_species`	Species of the `disease_genes` gene set.
`functional_genes_species`	Species of the `functional_genes` gene set.
`method`	R package to use for gene mapping: `"gprofiler"` : Slower but more species and genes. `"homologene"` : Faster but fewer species and genes. `"babelgene"` : Faster but fewer species and genes. Also gives consensus scores for each gene mapping based on several different data sources.
`annotLevel`	An integer indicating which level of `sct_data` to analyse (Default: 1).
`reps`	Number of random gene lists to generate (Default: 100, but should be >=10,000 for publication-quality results).
`controlledCT`	[Optional] If not NULL, and instead is the name of a cell type, then the bootstrapping controls for expression within that cell type.
`use_intersect`	When `species1` and `species2` are both different from `output_species`, this argument will determine whether to use the intersect (`TRUE`) or union (`FALSE`) of all genes from `species1` and `species2`.
`verbose`	Print messages.
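
To illustrate what "controls for expression within that cell type" can mean for `controlledCT`, here is a minimal base-R sketch of specificity-matched sampling: each random gene list is drawn so that it has the same distribution of specificity bins (in the controlled cell type) as the hit genes. The toy data, the choice of 10 bins, and the helper `sample_matched` are assumptions made for illustration, not EWCE's internal code.

```r
# Illustrative only: toy data and a hypothetical helper, not EWCE internals.
set.seed(1)

# Toy background of 200 genes with a made-up specificity score (0 to 1)
# in the cell type being controlled for.
bg_genes <- paste0("gene", 1:200)
specificity <- setNames(runif(200), bg_genes)

# Assign each gene to one of 10 equal-sized specificity bins (assumed bin count).
n_bins <- 10
gene_rank <- rank(specificity, ties.method = "first")
bin <- setNames(ceiling(gene_rank / length(specificity) * n_bins), bg_genes)

# Hypothetical helper: for every hit gene, draw one random background gene
# from the same specificity bin (without replacement within each bin).
sample_matched <- function(hits, bin) {
    hits_by_bin <- split(hits, bin[hits])
    sampled <- lapply(names(hits_by_bin), function(b) {
        pool <- names(bin)[bin == as.integer(b)]
        pool[sample.int(length(pool), length(hits_by_bin[[b]]))]
    })
    unlist(sampled, use.names = FALSE)
}

hits <- sample(bg_genes, 20)
random_list <- sample_matched(hits, bin)

# The random list has the same number of genes per bin as the hit list.
table(bin[hits])
table(bin[random_list])
```

Because the random lists share the hit list's specificity profile in the controlled cell type, any remaining enrichment is less likely to be explained by that cell type alone.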

Value

A list containing the following elements:

p_controlled The probability that functional_genes are enriched in disease_genes while controlling for the level of specificity in controlledCT
z_controlled The z-score that functional_genes are enriched in disease_genes while controlling for the level of specificity in controlledCT
p_uncontrolled The probability that functional_genes are enriched in disease_genes WITHOUT controlling for the level of specificity in controlledCT
z_uncontrolled The z-score that functional_genes are enriched in disease_genes WITHOUT controlling for the level of specificity in controlledCT
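reps The number of random gene lists used (the value passed as reps)
controlledCT The cell type that was controlled for (the value passed as controlledCT)
actualOverlap The number of genes that overlap between functional and disease gene sets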
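
As a rough guide to how values like these can arise, the base-R sketch below computes an overlap count, an empirical p-value, and a z-score from bootstrapped random gene lists. It covers only the uncontrolled case with uniform sampling from a toy background; the gene names, the `>=` comparison, and the z-score formula are assumptions for illustration and may differ from what EWCE computes internally.

```r
# Illustrative only: toy gene sets and assumed formulas, not EWCE internals.
set.seed(1)

bg_genes <- paste0("gene", 1:500)          # toy background
disease_genes <- sample(bg_genes, 50)      # toy "disease" list
functional_genes <- sample(bg_genes, 40)   # toy "functional" list
reps <- 1000

# Observed overlap between the two gene sets.
actual_overlap <- length(intersect(disease_genes, functional_genes))

# Overlap of the functional set with random lists the size of the disease list.
boot_overlap <- replicate(reps, {
    random_genes <- sample(bg_genes, length(disease_genes))
    length(intersect(random_genes, functional_genes))
})

# Empirical p-value: share of random lists overlapping at least as much.
p_value <- sum(boot_overlap >= actual_overlap) / reps

# z-score: distance of the observed overlap from the bootstrap mean, in SDs.
z_score <- (actual_overlap - mean(boot_overlap)) / sd(boot_overlap)

list(actualOverlap = actual_overlap, p = p_value, z = z_score)
```

With only `reps` random lists, the smallest non-zero p-value that can be resolved is `1 / reps`, which is one reason a large `reps` is recommended for publication-quality results.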

Examples

# See the vignette for more detailed explanations
# Gene set enrichment analysis controlling for cell type expression
# set seed for bootstrap reproducibility
set.seed(12345678)
## load merged dataset from vignette
ctd <- ewceData::ctd()
schiz_genes <- ewceData::schiz_genes()
hpsd_genes <- ewceData::hpsd_genes()
# Use 3 bootstrap lists for speed; for publishable analysis use >=10,000
reps <- 3

res_hpsd_schiz <- EWCE::controlled_geneset_enrichment(
    disease_genes = schiz_genes,
    functional_genes = hpsd_genes,
    sct_data = ctd,
    annotLevel = 1,
    reps = reps,
    controlledCT = "pyramidal CA1"
)

NathanSkene/EWCE documentation built on Feb. 17, 2025, 7:52 a.m.