test_gene_overrepresentation-methods: analyse gene over-representation with GSEA
In tidybulk: Brings transcriptomics to the tidyverse

test_gene_overrepresentation(
  .data,
  .sample = NULL,
  .entrez,
  .do_test,
  species,
  gene_set = NULL
)

## S4 method for signature 'spec_tbl_df'
test_gene_overrepresentation(
  .data,
  .sample = NULL,
  .entrez,
  .do_test,
  species,
  gene_set = NULL
)

## S4 method for signature 'tbl_df'
test_gene_overrepresentation(
  .data,
  .sample = NULL,
  .entrez,
  .do_test,
  species,
  gene_set = NULL
)

## S4 method for signature 'tidybulk'
test_gene_overrepresentation(
  .data,
  .sample = NULL,
  .entrez,
  .do_test,
  species,
  gene_set = NULL
)

`.data`	A 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> |
`.sample`	The name of the sample column
`.entrez`	The name of the column holding the ENTREZ ID of the transcripts/genes
`.do_test`	The name (as a symbol) of a logical column. It flags the transcripts to test
`species`	A character. For example, human or mouse. MSigDB uses the Latin species names (e.g., "Mus musculus", "Homo sapiens")
`gene_set`	A character vector. The subset of MSigDB datasets you want to test against (e.g. "C2"). If NULL, all gene sets are used (suggested). This argument was added to keep the examples from exceeding their time limit.

Lifecycle: maturing

This wrapper executes gene enrichment analyses of the dataset using a list of transcripts and GSEA. It uses clusterProfiler (DOI: doi.org/10.1089/omi.2011.0118) on the back end.

Underlying method:

	msigdbr::msigdbr(species = species) %>%
		nest(data = -gs_cat) %>%
		mutate(test = map(
			data,
			~ clusterProfiler::enricher(
				my_entrez_rank,
				TERM2GENE = .x,
				pvalueCutoff = 1
			)
		))
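
To give an intuition for what the `clusterProfiler::enricher` step computes, here is a minimal base-R sketch of an over-representation test built on the hypergeometric distribution. It is an illustration only, not the code tidybulk or clusterProfiler runs: the gene IDs, the set names (`set_a`, `set_b`), the explicit `universe`, and the helper `ora_p_value` are all hypothetical, and the multiple-testing adjustment that a real analysis would apply is left out.

```r
# Toy over-representation test using base R only.
# All IDs, set names and the helper function are made up for illustration.
universe <- as.character(1:20)            # hypothetical background genes
selected <- c("1", "2", "3", "4", "15")   # genes a .do_test column would flag

gene_sets <- list(
  set_a = c("1", "2", "3", "4", "5"),
  set_b = c("10", "11", "12", "13", "14", "15")
)

# Probability of seeing at least the observed overlap between `set` and
# `selected` when length(selected) genes are drawn at random from `universe`.
ora_p_value <- function(set, selected, universe) {
  set     <- intersect(set, universe)
  overlap <- length(intersect(set, selected))
  phyper(
    overlap - 1,                      # P(X > overlap - 1) equals P(X >= overlap)
    length(set),                      # genes in the set
    length(universe) - length(set),   # genes outside the set
    length(selected),                 # number of genes drawn
    lower.tail = FALSE
  )
}

p_values <- vapply(
  gene_sets, ora_p_value, numeric(1),
  selected = selected, universe = universe
)
print(p_values)
```

In this toy input four of the five selected genes fall in `set_a` and only one in `set_b`, so `set_a` gets the smaller p-value. The `pvalueCutoff = 1` in the underlying method suggests that the wrapper keeps every tested gene set rather than filtering by significance.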

A 'tbl' object

df_entrez = symbol_to_entrez(tidybulk::counts_mini, .transcript = transcript, .sample = sample)
df_entrez = aggregate_duplicates(df_entrez, aggregation_function = sum, .sample = sample, .transcript = entrez, .abundance = count)
df_entrez = mutate(df_entrez, do_test = transcript %in% c("TNFRSF4", "PLCH2", "PADI4", "PAX7"))

test_gene_overrepresentation(
  df_entrez,
  .sample = sample,
  .entrez = entrez,
  .do_test = do_test,
  species = "Homo sapiens",
  gene_set = c("C2")
)