plot_categorical: Plot a categorical data variable over another

plot_categorical    R Documentation

Plot a categorical data variable over another

Description

Given a GRanges of annotated regions from annotate_regions(), visualize the distribution of the categorical variable fill within each class of the categorical variable x. An "All" bar is also added, showing the distribution of fill over all values of x. Additionally, when annotated_random is not missing, a "Random Regions" bar shows the distribution of random regions over fill.

Usage

plot_categorical(
  annotated_regions,
  annotated_random,
  x,
  fill = NULL,
  x_order = NULL,
  fill_order = NULL,
  position = "stack",
  plot_title,
  legend_title,
  x_label,
  y_label,
  quiet = FALSE
)

Arguments

annotated_regions

The GRanges result of annotate_regions().

annotated_random

The GRanges result of annotate_regions() on the randomized regions created from randomize_regions(). Random regions can only be used with fill == 'annot.type'.

x

One of 'annot.type' or a categorical data column, indicating whether annotation classes or data classes will appear on the x-axis.

fill

One of 'annot.type', a categorical data column, or NULL, indicating whether annotation classes or data classes will fill the bars. If NULL then the bars will be the total counts of the x classes.

x_order

A character vector that subsets and orders the x classes. Default NULL, uses existing values.

fill_order

A character vector that subsets and orders the fill classes. Default NULL, uses existing values.

position

A string which has the same possible values as in ggplot2::geom_bar(..., position), i.e., 'stack', 'fill', 'dodge', etc.

plot_title

A string used for the title of the plot. If missing, no title is displayed.

legend_title

A string used for the legend title to describe fills (if fill is not NULL). Default displays corresponding variable name.

x_label

A string used for the x-axis label. If missing, the corresponding variable name is used.

y_label

A string used for the y-axis label. If missing, the corresponding variable name is used.

quiet

Print progress messages (FALSE) or not (TRUE).
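
As a rough illustration of what x_order, fill_order, and position mean, the following base R sketch tabulates a toy data frame. The data values are made up, and this is not how plot_categorical() is implemented internally; it only mirrors the documented behavior that the order vectors subset and order the classes, and it assumes that position = 'fill' rescales each bar to proportions, as it does in ggplot2::geom_bar().

```r
# Toy stand-in for annotated regions in data frame form (hypothetical values).
toy <- data.frame(
  DM_status  = c("hyper", "hyper", "hypo", "hypo", "hypo", "none"),
  annot.type = c("hg19_cpg_islands", "hg19_cpg_shores", "hg19_cpg_islands",
                 "hg19_cpg_inter", "hg19_cpg_shelves", "hg19_cpg_inter"),
  stringsAsFactors = FALSE
)

x_order    <- c("hyper", "hypo")
fill_order <- c("hg19_cpg_islands", "hg19_cpg_shores",
                "hg19_cpg_shelves", "hg19_cpg_inter")

# Subset: keep only rows whose classes are named in the order vectors.
kept <- toy[toy$DM_status %in% x_order & toy$annot.type %in% fill_order, ]

# Order: factor levels fix the order of the x classes and the fill classes.
kept$DM_status  <- factor(kept$DM_status,  levels = x_order)
kept$annot.type <- factor(kept$annot.type, levels = fill_order)

# position = 'stack' corresponds to raw counts per x class ...
counts <- table(kept$DM_status, kept$annot.type)

# ... while position = 'fill' corresponds to proportions within each x class.
props <- prop.table(counts, margin = 1)

print(counts)
print(round(props, 2))
```

In this sketch the 'none' class is dropped because it is absent from x_order, and every row of props sums to 1, which is why 'Proportion' is a natural y-axis label when position = 'fill'.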

Details

For example, suppose a differentially methylated region has the categorical label hyper and is annotated to a promoter, a 5'UTR, two exons, and an intron. Each annotation will then appear in the All bar once. The same holds for the hyper bar if the differential methylation status is chosen as x and annot.type is chosen as fill.

Value

A ggplot object which can be viewed by calling it, or saved with ggplot2::ggsave.
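
A minimal sketch of saving such an object, assuming ggplot2 is installed. The plot below is a stand-in built from made-up data rather than real plot_categorical() output, and the file name and dimensions are arbitrary choices for illustration.

```r
library(ggplot2)

# Stand-in for the object returned by plot_categorical() (toy, made-up data).
toy <- data.frame(
  DM_status  = c("hyper", "hyper", "hypo", "hypo"),
  annot.type = c("hg19_cpg_islands", "hg19_cpg_shores",
                 "hg19_cpg_islands", "hg19_cpg_inter")
)
p <- ggplot(toy, aes(x = DM_status, fill = annot.type)) +
  geom_bar(position = "fill") +
  labs(x = "DM status", y = "Proportion", fill = "CpG Annotations")

# At the console, typing `p` (or calling print(p)) draws the plot.

# Save to a temporary PDF; width and height are in inches by default.
out_file <- tempfile(fileext = ".pdf")
ggsave(filename = out_file, plot = p, width = 6, height = 4)
```

Because the return value is an ordinary ggplot object, further layers or theme changes can be added with + before saving.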

Examples

   # Get premade CpG annotations
   data('annotations', package = 'annotatr')

   # Read the differential methylation regions and keep the first 1000
   dm_file = system.file('extdata', 'IDH2mut_v_NBM_multi_data_chr9.txt.gz', package = 'annotatr')
   extraCols = c(diff_meth = 'numeric', mu1 = 'numeric', mu0 = 'numeric')
   dm_regions = read_regions(con = dm_file, extraCols = extraCols, genome = 'hg19',
       rename_score = 'pval', rename_name = 'DM_status', format = 'bed')
   dm_regions = dm_regions[1:1000]

   dm_annots = annotate_regions(
       regions = dm_regions,
       annotations = annotations,
       ignore.strand = TRUE)

   # Classes to keep, and their order, for x (DM status) and fill (CpG annotations)
   dm_order = c(
       'hyper',
       'hypo')
   cpg_order = c(
       'hg19_cpg_islands',
       'hg19_cpg_shores',
       'hg19_cpg_shelves',
       'hg19_cpg_inter')

   dm_vn = plot_categorical(
       annotated_regions = dm_annots,
       x = 'DM_status',
       fill = 'annot.type',
       x_order = dm_order,
       fill_order = cpg_order,
       position = 'fill',
       legend_title = 'CpG Annotations',
       x_label = 'DM status',
       y_label = 'Proportion')

   # Create randomized regions
   dm_rnd_regions = randomize_regions(regions = dm_regions)
   dm_rnd_annots = annotate_regions(
       regions = dm_rnd_regions,
       annotations = annotations,
       ignore.strand = TRUE)

   dm_vn_rnd = plot_categorical(
       annotated_regions = dm_annots,
       annotated_random = dm_rnd_annots,
       x = 'DM_status',
       fill = 'annot.type',
       x_order = dm_order,
       fill_order = cpg_order,
       position = 'fill',
       legend_title = 'CpG Annotations',
       x_label = 'DM status',
       y_label = 'Proportion')


rcavalcante/annotatr documentation built on March 25, 2023, 9:51 a.m.