plot_topn: Plot the representation of the top-n genes in the total...

View source: R/plot_distribution.R

plot_topn    R Documentation

Plot the representation of the top-n genes in the total counts / sample.

Description

One question we might ask is: how much of a sample's total counts do its most abundant genes account for? This plot attempts to provide a visual hint toward answering that question. It does so by rank-ordering the genes in every sample and dividing their counts by the total number of reads in that sample. It then smooths the points to show the resulting trend. The steeper the resulting line, the more over-represented the top-n genes are. I suspect, but haven't tested yet, that the inflection point of the resulting curve is also a useful diagnostic for this question.
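To make the rank-and-divide step concrete, here is a minimal base R sketch of the calculation as described above. It is not the hpgltools implementation: the toy count matrix, the helper name topn_fractions(), and the cumulative sum at the end are assumptions made for illustration, and the smoothing step is left out.

```r
# Hypothetical toy count matrix: 6 genes (rows) by 2 samples (columns).
counts <- matrix(
  c(500, 300, 100, 50, 30, 20,
    200, 200, 200, 200, 100, 100),
  nrow = 6,
  dimnames = list(paste0("gene", 1:6), c("sampleA", "sampleB"))
)

# topn_fractions() is an illustrative helper, not a function from hpgltools.
# For each sample it sorts the counts from largest to smallest, keeps the
# top 'num' values, and divides them by that sample's total count.
topn_fractions <- function(mtrx, num = NULL) {
  if (is.null(num)) {
    num <- nrow(mtrx)  # NULL means "use every gene"
  }
  num <- min(num, nrow(mtrx))
  result <- vapply(seq_len(ncol(mtrx)), function(i) {
    column <- mtrx[, i]
    ranked <- sort(column, decreasing = TRUE)
    unname(ranked[seq_len(num)] / sum(column))
  }, numeric(num))
  # Force a rank-by-sample matrix even when num is 1.
  matrix(result, nrow = num, dimnames = list(NULL, colnames(mtrx)))
}

fractions <- topn_fractions(counts, num = 3)

# Assumption: a running total is one way to read "how much of the sample
# do the top-n genes make up"; the real plot may present this differently.
cumulative <- apply(fractions, 2, cumsum)
print(fractions)
print(cumulative)
```

In this toy data the three most abundant genes account for 90% of sampleA but only 60% of sampleB, which is the kind of contrast the plot is meant to surface.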

Usage

plot_topn(
  data,
  plot_title = NULL,
  num = 100,
  expt_names = NULL,
  plot_labels = NULL,
  label_chars = 10,
  plot_legend = FALSE,
  ...
)

Arguments

data

Data frame, matrix, or other count-containing object on which to perform the top-n plot.

plot_title

A title for the plot.

num

The N in top-n genes; if NULL, use all genes.

expt_names

Column or character vector of sample names.

plot_labels

Method for labelling the lines.

label_chars

Maximum number of characters before abbreviating samples.

plot_legend

Add a legend to the plot?

...

Extra arguments, currently unused.

Value

List containing the ggplot2


elsayed-lab/hpgltools documentation built on May 9, 2024, 5:02 a.m.