plot_topn: Plot the representation of the top-n genes in the total...
In elsayed-lab/hpgltools: A pile of (hopefully) useful R functions

plot_topn

R Documentation

Plot the representation of the top-n genes in the total counts / sample.

Description

One question we might ask is: how much of a sample do its most abundant genes account for? This plot attempts to provide a visual hint toward answering this question. It does so by rank-ordering all the genes in every sample and dividing their counts by the total number of reads in that sample. It then smooths the points to show the resulting trend. The steeper the resulting line, the more over-represented these top-n genes are. I suspect, but haven't tried yet, that the inflection point of the resulting curve is also a useful diagnostic for this question.
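
The rank-and-normalize step described above can be sketched in base R. This is a minimal illustration, not the code hpgltools uses: `topn_fractions()` and the toy `counts` matrix are hypothetical, the input is assumed to be a numeric genes-by-samples matrix with no missing values, and the smoothing step is left out.

```r
# Hypothetical helper (not part of hpgltools): for every sample (column),
# sort the gene counts from largest to smallest, keep the top `num`,
# and divide each by that sample's total count.
topn_fractions <- function(counts, num = 100) {
  counts <- as.matrix(counts)
  # Mirror the documented behaviour of `num`: NULL means use every gene.
  n <- if (is.null(num)) nrow(counts) else min(num, nrow(counts))
  fractions <- vapply(
    seq_len(ncol(counts)),
    function(j) {
      ranked <- sort(counts[, j], decreasing = TRUE)
      unname(ranked[seq_len(n)]) / sum(counts[, j])
    },
    numeric(n)
  )
  # Row i holds the fraction contributed by the i-th most abundant gene.
  matrix(fractions, nrow = n, dimnames = list(NULL, colnames(counts)))
}

# Toy data: one sample dominated by a single gene, one perfectly even.
counts <- matrix(
  c(90, 5, 3, 2,
    25, 25, 25, 25),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), c("skewed", "even"))
)

fractions <- topn_fractions(counts, num = 3)

# Share of each sample taken up by its top 3 genes.
top_share <- colSums(fractions)
```

In the skewed sample the fractions fall sharply from the first rank to the second, while the even sample gives a flat line; that difference in steepness is what the plot is meant to expose.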

Usage

plot_topn(
  data,
  plot_title = NULL,
  num = 100,
  expt_names = NULL,
  plot_labels = NULL,
  label_chars = 10,
  plot_legend = FALSE,
  ...
)

Arguments

`data`	Data frame, matrix, or similar object on which to perform the top-n plot.
`plot_title`	A title for the plot.
`num`	The N in top-n genes; if NULL, use all genes.
`expt_names`	Column or character list of sample names.
`plot_labels`	Method for labelling the lines.
`label_chars`	Maximum number of characters before abbreviating sample names.
`plot_legend`	Add a legend to the plot?
`...`	Extra arguments, currently unused.