enrichment_analysis: Enrichment analysis

Description Usage Arguments Value Functions Author(s) See Also Examples

Description

Plotting functions for enrichment analysis of multiHMM or combinedMultiHMM objects with any annotation of interest, specified as a GRanges-class object.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
plotFoldEnrichHeatmap(
  hmm,
  annotations,
  what = "combinations",
  combinations = NULL,
  marks = NULL,
  plot = TRUE,
  logscale = TRUE
)

plotEnrichCountHeatmap(
  hmm,
  annotation,
  bp.around.annotation = 10000,
  max.rows = 1000,
  combinations = NULL,
  colorByCombinations = sortByCombinations,
  sortByCombinations = is.null(sortByColumns),
  sortByColumns = NULL
)

plotEnrichment(
  hmm,
  annotation,
  bp.around.annotation = 10000,
  region = c("start", "inside", "end"),
  num.intervals = 20,
  what = "combinations",
  combinations = NULL,
  marks = NULL,
  statistic = "fold",
  logscale = TRUE
)

Arguments

hmm

A combinedMultiHMM or multiHMM object or a file that contains such an object.

annotations

A list() with GRanges-class objects containing coordinates of multiple annotations The names of the list entries will be used to name the return values.

what

One of c('combinations','peaks','counts','transitions') specifying on which feature the statistic is calculated.

combinations

A vector with combinations for which the enrichment will be calculated, e.g. combinations = c('[H3K4me3]','[H3K4me3+H3K27me3]'). If NULL all combinations will be considered.

marks

A vector with marks for which the enrichment is plotted. If NULL all marks will be considered.

plot

A logical indicating whether the plot or an array with the fold enrichment values is returned.

logscale

Set to TRUE to plot fold enrichment on log-scale. Ignored if statistic = 'fraction'.

annotation

A GRanges-class object with the annotation of interest.

bp.around.annotation

An integer specifying the number of basepairs up- and downstream of the annotation for which the enrichment will be calculated.

max.rows

An integer specifying the number of randomly subsampled rows that are plotted from the annotation object. This is necessary to avoid crashing for heatmaps with too many rows.

colorByCombinations

A logical indicating whether or not to color the heatmap by combinations.

sortByCombinations

A logical indicating whether or not to sort the heatmap by combinations.

sortByColumns

An integer vector specifying the column numbers by which to sort the rows. If sortByColumns is specified, will force sortByCombinations=FALSE.

region

A combination of c('start','inside','end') specifying the region of the annotation for which the enrichment will be calculated. Select 'start' if you have a point-sized annotation like transcription start sites. Select c('start','inside','end') if you have long annotations like genes.

num.intervals

Number of intervals for enrichment 'inside' of annotation.

statistic

The statistic to calculate. Either 'fold' for fold enrichments or 'fraction' for fraction of bins falling into the annotation.

Value

A ggplot object containing the plot or a list() with ggplot objects if several plots are returned. For plotFoldEnrichHeatmap a named array with fold enrichments if plot=FALSE.

Functions

Author(s)

Aaron Taudt

See Also

plotting

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
### Get an example multiHMM ###
file <- system.file("data","multivariate_mode-combinatorial_condition-SHR.RData",
                    package="chromstaR")
model <- get(load(file))

### Obtain gene coordinates for rat from biomaRt ###
library(biomaRt)
ensembl <- useEnsembl(biomart='ENSEMBL_MART_ENSEMBL', dataset='rnorvegicus_gene_ensembl')
genes <- getBM(attributes=c('ensembl_gene_id', 'chromosome_name', 'start_position',
                           'end_position', 'strand', 'external_gene_name',
                           'gene_biotype'),
              mart=ensembl)
# Transform to GRanges for easier handling
genes <- GRanges(seqnames=paste0('chr',genes$chromosome_name),
                ranges=IRanges(start=genes$start, end=genes$end),
                strand=genes$strand,
                name=genes$external_gene_name, biotype=genes$gene_biotype)
# Rename chrMT to chrM
seqlevels(genes)[seqlevels(genes)=='chrMT'] <- 'chrM'
print(genes)

### Make the enrichment plots ###
# We expect promoter [H3K4me3] and bivalent-promoter signatures [H3K4me3+H3K27me3]
# to be enriched at transcription start sites.
   plotEnrichment(hmm = model, annotation = genes, bp.around.annotation = 15000) +
   ggtitle('Fold enrichment around genes') +
   xlab('distance from gene body')
 
# Plot enrichment only at TSS. We make use of the fact that TSS is the start of a gene.
   plotEnrichment(model, genes, region = 'start') +
   ggtitle('Fold enrichment around TSS') +
   xlab('distance from TSS in [bp]')
# Note: If you want to facet the plot because you have many combinatorial states you
# can do that with
   plotEnrichment(model, genes, region = 'start') +
   facet_wrap(~ combination)
 
# Another form of visualization that shows every TSS in a heatmap
# If transparency is not supported try to plot to pdf() instead.
   tss <- resize(genes, width = 3, fix = 'start')
   plotEnrichCountHeatmap(model, tss) +
   theme(strip.text.x = element_text(size=6))
 
# Fold enrichment with different biotypes, showing that protein coding genes are
# enriched with (bivalent) promoter combinations [H3K4me3] and [H3K4me3+H3K27me3],
# while rRNA is enriched with the empty [] and repressive combinations [H3K27me3].
   tss <- resize(genes, width = 3, fix = 'start')
   biotypes <- split(tss, tss$biotype)
   plotFoldEnrichHeatmap(model, annotations=biotypes) + coord_flip()

ataudt/chromstaR documentation built on Dec. 26, 2021, 12:07 a.m.