plot_numerical: Plot numerical data over regions or regions summarized over...


View source: R/visualize.R

Description

This function produces either histograms or x-y scatterplots, faceted by one or two categorical variables. For faceted histograms, the All distribution (hollow histogram with red outline) is the distribution of x over all regions in the data, while the facet-specific distributions (solid gray) are the distributions of x over the regions in each facet. For example, a CpG with an associated percent methylation that is annotated to both a CpG island and a promoter counts once in the All distribution, but counts once each in the CpG island and promoter facet distributions.
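
The counting rule described above can be illustrated with a toy table in base R. This is a minimal sketch, not annotatr's implementation: the data frame `toy` and its columns `region_id` and `pct_meth` are hypothetical stand-ins for an annotated GRanges, with one row per region-annotation pair.

```r
# Hypothetical toy data: one row per (region, annotation) pair.
# Region "r1" is annotated to both a CpG island and a promoter.
toy <- data.frame(
  region_id  = c("r1", "r1", "r2", "r3"),
  annot.type = c("hg19_cpg_islands", "hg19_genes_promoters",
                 "hg19_cpg_islands", "hg19_genes_promoters"),
  pct_meth   = c(80, 80, 35, 10),
  stringsAsFactors = FALSE
)

# "All" distribution: each region contributes once, however many
# annotations it overlaps.
all_values <- toy$pct_meth[!duplicated(toy$region_id)]

# Facet-specific distributions: a region contributes once to every
# facet it is annotated to.
facet_values <- split(toy$pct_meth, toy$annot.type)

length(all_values)      # number of distinct regions
lengths(facet_values)   # number of regions contributing to each facet
```

In this toy case, region "r1" appears in both facet vectors but only once in `all_values`.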

Usage

plot_numerical(
  annotated_regions,
  x,
  y,
  facet,
  facet_order,
  bin_width = 10,
  plot_title,
  x_label,
  y_label,
  legend_facet_label,
  legend_cum_label,
  quiet = FALSE
)

Arguments

annotated_regions

A GRanges returned from annotate_regions(). If the data is not summarized, the data is at the region level. If it is summarized, it represents the average or standard deviation of the regions by the character vector used for by in summarize_numerical().

x

A string indicating the column of the GRanges to use for the x-axis.

y

A string indicating the column of the GRanges to use for the y-axis. If missing, a histogram over x is plotted. If not missing, a scatterplot of y against x is plotted.

facet

A string, or character vector of two strings, indicating which categorical variable(s) in the GRanges to make ggplot2 facets over. When two facets are given, the first entry is the vertical facet and the second entry is the horizontal facet. Default is annot.type.

facet_order

A character vector, or a list of character vectors if facet has length 2, which gives the order of the facets and can also be used to subset the column in the GRanges used for the facet. For example, if facet = 'annot.type', then the annotations may be subset to just the CpG annotations. Default is NULL, meaning all annotations are used in their default order.

bin_width

An integer indicating the bin width of the histogram of x. Default is 10. Choose a value appropriate for the range of the data. NOTE: This is used only when y is not supplied, that is, when a histogram is plotted.

plot_title

A string used for the title of the plot. If missing, no title is displayed.

x_label

A string used for the x-axis label. If missing, no x-axis label is displayed.

y_label

A string used for the y-axis label. If missing, no y-axis label is displayed.

legend_facet_label

A string used to label the gray bar portion of the legend. Defaults to "x in facet".

legend_cum_label

A string used to label the red outline portion of the legend. Defaults to "All in x".

quiet

A logical indicating whether to print progress messages (FALSE, the default) or suppress them (TRUE).
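
The way facet_order both subsets and orders the facet column can be sketched in base R. This is a hedged illustration of the idea only, not the package's internal code: the vector `annot_type` is a hypothetical stand-in for the annot.type column of an annotated GRanges.

```r
# Hypothetical annotation labels, one per annotated region.
annot_type <- c("hg19_cpg_shores", "hg19_genes_promoters",
                "hg19_cpg_islands", "hg19_cpg_inter", "hg19_cpg_islands")

# Desired facets, in display order (CpG annotations only).
facet_order <- c("hg19_cpg_islands", "hg19_cpg_shores",
                 "hg19_cpg_shelves", "hg19_cpg_inter")

# Subset: drop entries whose annotation is not listed in facet_order.
kept <- annot_type[annot_type %in% facet_order]

# Order: factor levels follow facet_order, so facets drawn from this
# factor would appear in that order.
kept <- factor(kept, levels = facet_order)

levels(kept)
table(kept)
```

Here the promoter entry is dropped because it is not listed in `facet_order`, and the remaining levels follow the requested order rather than alphabetical order.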

Value

A ggplot object which can be viewed by calling it, or saved with ggplot2::ggsave.
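
As a minimal sketch of saving the returned object, the snippet below builds a small stand-in plot and writes it to a temporary file with ggplot2::ggsave. The object `toy_plot` and its data are hypothetical. In practice the object returned by plot_numerical() would take its place, and the output path, width, and height are arbitrary choices.

```r
library(ggplot2)

# Hypothetical stand-in for the ggplot object returned by plot_numerical().
toy_plot <- ggplot(data.frame(x = c(5, 15, 15, 25, 35)), aes(x = x)) +
  geom_histogram(binwidth = 10)

# Save to a temporary PDF; width and height are in inches by default.
out_file <- tempfile(fileext = ".pdf")
ggsave(filename = out_file, plot = toy_plot, width = 6, height = 4)
```

At an interactive console, typing the object's name (here `toy_plot`) prints it to the active graphics device.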

Examples

   # An example with multi-columned data

   # Get premade CpG annotations
   data('annotations', package = 'annotatr')

   dm_file = system.file('extdata', 'IDH2mut_v_NBM_multi_data_chr9.txt.gz', package = 'annotatr')
   extraCols = c(diff_meth = 'numeric', mu1 = 'numeric', mu0 = 'numeric')
   dm_regions = read_regions(con = dm_file, extraCols = extraCols,
       rename_score = 'pval', rename_name = 'DM_status', format = 'bed')
   dm_regions = dm_regions[1:1000]

   # Annotate the regions
   dm_annots = annotate_regions(
       regions = dm_regions,
       annotations = annotations,
       ignore.strand = TRUE)

   # Plot histogram of group 1 methylation rates across the CpG annotations.
   # NOTE: Overall distribution (everything in facet_order)
   # is plotted in each facet for comparison.
   dm_vs_regions_mu1 = plot_numerical(
       annotated_regions = dm_annots,
       x = 'mu1',
       facet = 'annot.type',
       facet_order = c('hg19_cpg_islands','hg19_cpg_shores',
           'hg19_cpg_shelves','hg19_cpg_inter'),
       bin_width = 5,
       plot_title = 'Group 1 Methylation over CpG Annotations',
       x_label = 'Group 1 Methylation')

   # Plot histogram of methylation differences (diff_meth) across selected
   # gene and CpG annotations, crossed with DM_status
   dm_vs_regions_diffmeth = plot_numerical(
       annotated_regions = dm_annots,
       x = 'diff_meth',
       facet = c('annot.type','DM_status'),
       facet_order = list(
           c('hg19_genes_promoters','hg19_genes_5UTRs','hg19_cpg_islands'),
           c('hyper','hypo','none')),
       bin_width = 5,
       plot_title = 'Group 0 Region Methylation In Genes',
       x_label = 'Methylation Difference')

   # Can also use the result of annotate_regions() to plot two numerical
   # data columns against each other for each region, and facet by annotations.
   dm_vs_regions_annot = plot_numerical(
       annotated_regions = dm_annots,
       x = 'mu0',
       y = 'mu1',
       facet = 'annot.type',
       facet_order = c('hg19_cpg_islands','hg19_cpg_shores',
           'hg19_cpg_shelves','hg19_cpg_inter'),
       plot_title = 'Region Methylation: Group 0 vs Group 1',
       x_label = 'Group 0',
       y_label = 'Group 1')

   # Another example, but using differential methylation status as the facets.
   dm_vs_regions_name = plot_numerical(
       annotated_regions = dm_annots,
       x = 'mu0',
       y = 'mu1',
       facet = 'DM_status',
       facet_order = c('hyper','hypo','none'),
       plot_title = 'Region Methylation: Group 0 vs Group 1',
       x_label = 'Group 0',
       y_label = 'Group 1')

annotatr documentation built on Nov. 8, 2020, 8:16 p.m.