plotDistanceDistrib: plot the distribution of intergenic distances


View source: R/plotDistanceDistrib.R

Description

Plots the distribution of intergenic distances by side (upstream/downstream) and orientation (same or opposite strand) for different sets of genes, using ridge (density) plots, violin plots, or density plots with jittered points and a boxplot underneath.

Usage

plotDistanceDistrib(
  distdf,
  groupcolumn = "GeneSet",
  type = c("ridge", "violin", "jitterbox"),
  genesetcols = NULL,
  newlabs = NULL
)

Arguments

distdf

A data frame or tibble containing distances for different sets of genes (ideally defined in a "GeneSet" column)

groupcolumn

A character string with the name of the column containing the description of the gene sets

type

A character string in c("ridge", "violin", "jitterbox") indicating the type of plot/geom to use

genesetcols

An optional named character vector with colors for the different gene sets (names should correspond to levels of the groupcolumn)

newlabs

An optional named character vector with new labels for the different gene sets (names should correspond to levels of the groupcolumn)
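The argument descriptions above imply two checks that are easy to run before plotting: that `type` resolves to one of the three allowed values, and that the names of `genesetcols` and `newlabs` cover every gene set found in `groupcolumn`. The sketch below is a hypothetical helper (`check_plot_args` is not part of GeneNeighborhood), written in base R. It assumes `type` is resolved with `match.arg()`, which this page does not confirm, and the toy data frame with its `Distance` column is made up for illustration.

```r
# Hypothetical pre-flight check; NOT a GeneNeighborhood function.
check_plot_args <- function(distdf,
                            groupcolumn = "GeneSet",
                            type = c("ridge", "violin", "jitterbox"),
                            genesetcols = NULL,
                            newlabs = NULL) {
  # Assumption: the first choice is used when type is not supplied
  type <- match.arg(type)
  stopifnot(groupcolumn %in% colnames(distdf))

  # Gene sets actually present in the grouping column
  groups <- unique(as.character(distdf[[groupcolumn]]))

  # Gene sets that have no entry in an optional named vector
  missing_names <- function(x) {
    if (is.null(x)) character(0) else setdiff(groups, names(x))
  }

  list(type = type,
       groups = groups,
       missing_cols = missing_names(genesetcols),
       missing_labs = missing_names(newlabs))
}

# Toy input (made-up values, not package output)
toy <- data.frame(GeneSet = c("All Genes", "All Genes", "Random Genes"),
                  Distance = c(120, 950, 430))

res <- check_plot_args(toy, genesetcols = c("All Genes" = "grey"))
res$type          # "ridge"
res$missing_cols  # "Random Genes"
```

A non-empty `missing_cols` or `missing_labs` points to a gene set whose name does not match any name in the corresponding vector, typically because of a spelling or capitalization difference.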

Value

A ggplot object

Author(s)

Pascal GP Martin

See Also

ggplot2, ggplot, geom_violinh, geom_boxploth, geom_density_ridges

Examples

## Obtain gene neighborhood information:
  GeneNeighbors <- getGeneNeighborhood(Genegr)

## Get a (random) set of (100) genes:
  set.seed(123)
  randGenes <- sample(names(Genegr), 100)

## Select a set of genes with short upstream distances:
  ### Extract upstream distances for all genes:
    updist <- getDistSide(GeneNeighbors,
                          names(Genegr),
                          Side = "Upstream")
  ### Define sampling probabilities that decrease linearly with the distance:
    probs <- (max(updist$Distance) - updist$Distance) /
                 sum(max(updist$Distance) - updist$Distance)
  ### Sample 100 genes using these probabilities:
    set.seed(1234)
    closeUpstream <- sample(updist$GeneName, 100, prob=probs)

## Extract all upstream/downstream distances for different gene sets:
  mydist <- dist2Neighbors(GeneNeighbors,
                           list("All Genes" = names(Genegr),
                                "Random Genes" = randGenes,
                                "Close Upstream" = closeUpstream))

## Finally, plot the distribution of distances (density plot):
  plotDistanceDistrib(mydist)
## Using violin plots and specific colors and labels
  plotDistanceDistrib(mydist,
                      type = "violin",
                      genesetcols = c("All Genes" = "grey",
                                      "Random Genes" = "lightblue",
                                      "Close Upstream" = "pink"),
                      newlabs = c("All Genes" = "ALL",
                                  "Random Genes" = "Random",
                                  "Close Upstream" = "Close"))
## Adding jitter points and a boxplot under the density plot
  plotDistanceDistrib(mydist, type = "jitterbox")
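The `probs` weights in the example above are computed as `(max(d) - d) / sum(max(d) - d)`. They therefore fall off linearly with distance, and the gene with the largest upstream distance gets a weight of exactly zero. The base R sketch below illustrates this on three made-up distances (the gene names and values are hypothetical) and compares the result with weights proportional to `1 / d`.

```r
# Toy upstream distances (made-up values, not package output)
d <- c(geneA = 100, geneB = 500, geneC = 1000)

# Weighting used in the example: linear in (max - distance)
w_linear <- max(d) - d
probs_linear <- w_linear / sum(w_linear)

# Alternative for comparison: weights proportional to 1 / distance
w_inverse <- 1 / d
probs_inverse <- w_inverse / sum(w_inverse)

round(rbind(linear = probs_linear, inverse = probs_inverse), 3)
```

With the linear weighting, `geneC` can never be drawn, whereas the `1 / d` weighting keeps a small non-zero probability for it. Either choice biases the sample towards short upstream distances.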

pgpmartin/GeneNeighborhood documentation built on Sept. 2, 2021, 6:37 a.m.