top_genes: Derivation of Top Genes

View source: R/top_genes.R

top_genes    R Documentation

Derivation of Top Genes

Description

[Experimental]

top_genes() creates a HermesDataTopGenes object, which extends data.frame. It contains two columns:

  • expression: the summary statistic computed by summary_fun for each gene across the samples (columns) of the assay.

  • name: the gene names.

The corresponding autoplot() method then visualizes the result as a bar plot.

Usage

top_genes(
  object,
  assay_name = "counts",
  summary_fun = rowMeans,
  n_top = if (is.null(min_threshold)) 10L else NULL,
  min_threshold = NULL
)

## S4 method for signature 'HermesDataTopGenes'
autoplot(
  object,
  x_lab = "HGNC gene names",
  y_lab = paste0(object@summary_fun_name, "(", object@assay_name, ")"),
  title = "Top most expressed genes"
)

Arguments

object

(AnyHermesData)
input.

assay_name

(string)
name of the assay to use for the sorting of genes.

summary_fun

(function)
summary statistics function to apply across the samples in the assay, resulting in a numeric vector with one value per gene.

n_top

(count or NULL)
selection criterion based on the number of entries to keep.

min_threshold

(number or NULL)
selection criterion based on a minimum threshold for the summary statistic.

x_lab

(string)
x-axis label.

y_lab

(string)
y-axis label.

title

(string)
plot title.
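
To illustrate what a valid summary_fun looks like, here is a minimal base R sketch: a function that takes a genes-by-samples matrix and returns a numeric vector with one value per gene. The toy matrix assay and the helper row_medians() are hypothetical and not part of hermes. That such a function could be passed as summary_fun = row_medians is an assumption based on the argument description above.

```r
# Toy assay matrix: genes in rows, samples in columns (hypothetical data).
assay <- matrix(
  c(1, 2, 9,
    4, 4, 4,
    0, 10, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2", "s3"))
)

# Hypothetical custom summary function: per-gene median across samples.
row_medians <- function(x) apply(x, 1, median)

# The result is a named numeric vector with one entry per gene (row).
stat <- row_medians(assay)
print(stat)
```

Any function with the same shape of input and output (for example rowMeans, as in the default) fits the description given for summary_fun.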

Details

  • The data frame is sorted in descending order of expression and only the top entries according to the selection criteria are included.

  • Note that exactly one of the arguments n_top and min_threshold must be provided.
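
The sort-and-select behaviour described above can be sketched in base R on a toy count matrix. This is only an illustration under stated assumptions, not the hermes implementation: the helper select_top() and the matrix counts are hypothetical, a plain data.frame stands in for the HermesDataTopGenes object, and treating the threshold as inclusive (>=) is an assumption.

```r
# Toy count matrix: 5 genes (rows) x 3 samples (columns). Hypothetical data.
counts <- matrix(
  c(10, 20, 30,
    500, 400, 600,
    0, 1, 2,
    100, 100, 100,
    50, 70, 60),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), paste0("sample", 1:3))
)

# Hypothetical helper mimicking the documented behaviour; not part of hermes.
select_top <- function(x,
                       summary_fun = rowMeans,
                       n_top = NULL,
                       min_threshold = NULL) {
  # Exactly one of the two selection criteria must be provided.
  stopifnot(xor(is.null(n_top), is.null(min_threshold)))

  # One summary value per gene, stored next to the gene names.
  stat <- summary_fun(x)
  df <- data.frame(
    expression = stat,
    name = names(stat),
    stringsAsFactors = FALSE
  )

  # Sort in descending order of expression.
  df <- df[order(df$expression, decreasing = TRUE), ]

  if (!is.null(n_top)) {
    # Keep the first n_top rows.
    head(df, n_top)
  } else {
    # Keep rows at or above the threshold (inclusive comparison is assumed).
    df[df$expression >= min_threshold, ]
  }
}

top2 <- select_top(counts, n_top = 2)
above50 <- select_top(counts, min_threshold = 50)
```

Calling select_top(counts) with neither criterion, or with both, stops with an error, which mirrors the requirement that exactly one of n_top and min_threshold is set.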

Value

A HermesDataTopGenes object.

Functions

  • autoplot(HermesDataTopGenes): Creates a bar plot from a HermesDataTopGenes object, where the y-axis shows the expression statistic for each of the top genes on the x-axis.
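
As the Usage section shows, the default y_lab is pasted together from the summary function name and the assay name stored on the object. A minimal sketch of that string construction follows, using plain character values in place of the object@summary_fun_name and object@assay_name slots. The two values below are assumptions chosen to match the defaults of top_genes().

```r
# Stand-ins for object@summary_fun_name and object@assay_name (assumed values).
summary_fun_name <- "rowMeans"
assay_name <- "counts"

# Same expression as the default for y_lab in the autoplot() usage.
y_lab <- paste0(summary_fun_name, "(", assay_name, ")")
print(y_lab)
```

Passing y_lab explicitly, as in the Examples section, overrides this default.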

Examples

object <- hermes_data

# Default uses average of raw counts across samples to rank genes.
top_genes(object)

# Instead of showing the top 10 genes, we can also set a minimum threshold on average counts.
top_genes(object, n_top = NULL, min_threshold = 50000)

# We can also use the maximum of raw counts across samples, by specifying a different
# summary statistics function.
result <- top_genes(object, summary_fun = rowMax)

# Finally we can produce barplots based on the results.
autoplot(result, title = "My top genes")
autoplot(result, y_lab = "Counts", title = "My top genes")

insightsengineering/hermes documentation built on Dec. 15, 2024, 8:07 a.m.