genomeRearrPlot: Genome Rearrangements Plot

View source: R/genomeRearrPlot.R

Description

Generate a plot that shows synteny blocks between a focal and a compared genome in columns, and information on their alignment and rearrangements in rows, for a given set of focal genome segments.

Usage

genomeRearrPlot(BLOCKS, compgenome, ordfocal, remstr = "", main = "",
  remThld = 0.05, mar = NULL, pad = 0, y0pad = 5,
  uniqueCarColor = TRUE, sortColsBySize = TRUE, plotelem = c(1, 1, 1,
  1, 1), simplifyTags = TRUE, blockwidth = 1, yaxlab = NULL,
  carColors = NULL, carTextColors = NULL, cex.main = 2,
  cex.axis = 1.2, cex.text = 1, font.main = 3, makepdf = FALSE,
  newdev = TRUE, filename = "rearr.pdf", colormodel = "srgb")

Arguments

BLOCKS

A list of lists generated with the summarizeBlocks function. The top-level list elements of BLOCKS are focal genome segments, and the lower-level list elements contain information on synteny blocks and rearrangements for each focal genome segment.

compgenome

Data frame representing the compared genome (e.g., an ancestral genome reconstruction, or an extant genome), with the first three columns $marker, $orientation, and $car, followed by columns alternating node type and node element. Markers need to be ordered by their node elements. It must be the same data frame that was used to generate the list BLOCKS with the summarizeBlocks function.

ordfocal

Character vector with the IDs of the focal genome segments that will be plotted. The IDs must match (a subset of) the names of the top-level list elements of BLOCKS.

remstr

String that should be removed from the IDs in ordfocal to simplify the y-axis labels. Only relevant when yaxlab = NULL.

main

Title of the plot.

remThld

A numeric value between 0 (inclusive) and 0.5 (exclusive). Controls whether components of rearrangements that are less parsimonious to have changed position relative to the alternative components will be plotted. To plot all components, remThld needs to be smaller than remWgt used in the computeRearrs function.

mar

A numerical vector of the form c(bottom, left, top, right) that specifies the margins on the four sides of the plot. The default mar = NULL sets the margins automatically.

pad

A numeric value of 0 or greater that sets the amount of space between all plot margins and the actual plot area.

y0pad

A numeric value of 0 or greater that sets the amount of additional space between the bottom plot margin and the bottom plot area. Setting this value too small may cause some rearrangements for the bottom-most focal genome segment to fall outside the bottom plot area (and thus be invisible in the plot).

uniqueCarColor

Logical. If TRUE, CARs are uniquely colored across all focal genome segments. If FALSE, CARs are colored separately for each focal genome segment based on the number of markers per CAR (forces sortColsBySize = TRUE).

sortColsBySize

Logical. If TRUE, carColors and carTextColors are assigned based on the number of markers per CAR, so that the first color is allocated to the largest CAR.

plotelem

A numerical vector of the form c(nor, bid, bor, eid, rea) that determines which synteny block information is visualized. nor is the alignment orientation of the node, bid is the block ID, bor is the block orientation within its node, eid is the element ID within its node, and rea are rearrangements. The information is plotted when the value is 1, and omitted when 0.

simplifyTags

Logical. If TRUE, duplicated rearrangement tags, if any, are excluded from the plot. Note that this will work properly only when remWgt used in the computeRearrs function was set to a value >0.

blockwidth

A numeric value that specifies the relative width of synteny blocks (i.e., columns) in the plot.

yaxlab

Annotations of the y-axis. Must be the same length as ordfocal. The default yaxlab = NULL uses as annotations the names in ordfocal, simplified through removal of the string in remstr.

carColors

Character vector with the color names used for coloring CARs. If the number of CARs is greater than length(carColors), remaining CARs are colored in grayscale. carColors must have the same length as carTextColors.

carTextColors

Character vector with the color names used for coloring CAR IDs. carTextColors must have the same length as carColors.

cex.main

Numerical value that specifies the magnification of the main title.

cex.axis

Numerical value that specifies the magnification of the axis annotation.

cex.text

Numerical value that specifies the magnification of text within the plot area.

font.main

Font for the title of the plot. 1 corresponds to plain text, 2 to bold face, 3 to italic, and 4 to bold italic.

makepdf

Logical. If TRUE, the plot is saved as a PDF file. See filename and colormodel.

newdev

Logical. If TRUE, a new default graphics device (but not "RStudioGD") is opened via dev.new. Only relevant when makepdf = FALSE.

filename

Character string that gives the name of the PDF file when makepdf = TRUE.

colormodel

Character string that gives the color model for the PDF when makepdf = TRUE. Allowed values are "srgb", "cmyk", "gray", or "grey".

Details

Parameters BLOCKS, compgenome, and ordfocal need to be specified; all other parameters have default or automatic settings.
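The constraints stated in the Arguments section can be checked before plotting. The following is a minimal sketch in base R; check_plot_args is a hypothetical helper that is not part of the package, and the BLOCKS list is an empty stand-in for real summarizeBlocks output.

```r
# Hypothetical helper (not part of rearrvisr): returns a character vector of
# problems found, or character(0) if the arguments look consistent.
check_plot_args <- function(BLOCKS, ordfocal, remThld = 0.05,
                            plotelem = c(1, 1, 1, 1, 1),
                            carColors = NULL, carTextColors = NULL) {
  problems <- character(0)
  # ordfocal must match (a subset of) the top-level names of BLOCKS
  unknown <- setdiff(ordfocal, names(BLOCKS))
  if (length(unknown) > 0) {
    problems <- c(problems, paste("ordfocal IDs not in BLOCKS:",
                                  paste(unknown, collapse = ", ")))
  }
  # remThld must lie between 0 (inclusive) and 0.5 (exclusive)
  if (!is.numeric(remThld) || length(remThld) != 1 ||
      remThld < 0 || remThld >= 0.5) {
    problems <- c(problems, "remThld must be in [0, 0.5)")
  }
  # plotelem has the form c(nor, bid, bor, eid, rea), each 0 or 1
  if (length(plotelem) != 5 || !all(plotelem %in% c(0, 1))) {
    problems <- c(problems, "plotelem must be five values of 0 or 1")
  }
  # carColors and carTextColors must have the same length
  if (length(carColors) != length(carTextColors)) {
    problems <- c(problems, "carColors and carTextColors differ in length")
  }
  problems
}

# Stand-in for summarizeBlocks() output: only the top-level names matter here.
BLOCKS <- list("1" = list(), "2" = list(), "3" = list())
check_plot_args(BLOCKS, c("1", "4"), remThld = 0.5)
```

The last call reports two problems: the ID "4" is not a name in BLOCKS, and remThld = 0.5 is outside the allowed range.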

When makepdf = TRUE or newdev = TRUE, the width and height of the graphic are set automatically. The dimensions are determined in inches, so makepdf = FALSE with newdev = TRUE will produce an error or not work correctly when the default units of the default graphics device are not inches (as for bmp, jpeg, png, or tiff). This can be avoided by setting the default graphics device to one that uses inches as default units. Setting both makepdf = FALSE and newdev = FALSE makes it possible to specify alternative, user-defined dimensions of the graphic. See examples below.
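For a raster device, one way to use user-defined dimensions is to open the device explicitly with units = "in" and then plot with makepdf = FALSE and newdev = FALSE. The sketch below assumes the rearrvisr package and its TOY24 data are installed; the chosen width, height, resolution, and file name are arbitrary and may need adjusting if margins become too large.

```r
# Assumed output size in inches and resolution in dpi (illustrative values).
width_in  <- 8
height_in <- 10
res_dpi   <- 300

if (requireNamespace("rearrvisr", quietly = TRUE)) {
  library(rearrvisr)
  SYNT <- computeRearrs(TOY24_focalgenome, TOY24_compgenome, doubled = TRUE)
  BLOCKS <- summarizeBlocks(SYNT, TOY24_focalgenome, TOY24_compgenome,
                            c("1", "2", "3"))

  # Open the PNG device with dimensions given in inches; res is required
  # by png() whenever units other than pixels are used.
  out_file <- file.path(tempdir(), "rearr.png")
  png(out_file, width = width_in, height = height_in,
      units = "in", res = res_dpi)
  # Plot into the device opened above instead of letting the function
  # open a device or write a PDF itself.
  genomeRearrPlot(BLOCKS, TOY24_compgenome, c("1", "2", "3"), main = "TOY24",
                  makepdf = FALSE, newdev = FALSE)
  dev.off()
}
```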

Colors are assigned to CARs by size, unless sortColsBySize = FALSE. When carColors = NULL, 47 easily distinguishable default colors are used for coloring CARs. The first 14 colors are color blindness friendly and were obtained from mkweb.bcgsc.ca/biovis2012. The default carTextColors are either black or white, depending on the hue of the default carColors.
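When supplying custom carColors, a matching carTextColors vector of the same length is needed. The sketch below picks black or white text from the perceived brightness of each fill color. The luminance weights and the 0.5 cutoff are an assumed heuristic for illustration, not the rule the package uses for its defaults, and the four colors are arbitrary.

```r
# Illustrative fill colors for four CARs (arbitrary choice).
car_colors <- c("#E69F00", "#56B4E9", "#000000", "#F0E442")

# col2rgb() returns a 3 x n matrix with rows "red", "green", "blue" (0-255).
rgb_vals <- grDevices::col2rgb(car_colors)

# Approximate perceived brightness on a 0-1 scale (assumed heuristic).
luminance <- (0.299 * rgb_vals["red", ] +
              0.587 * rgb_vals["green", ] +
              0.114 * rgb_vals["blue", ]) / 255

# Dark fills get white labels, light fills get black labels.
car_text_colors <- ifelse(luminance > 0.5, "black", "white")
```

Both vectors can then be passed as carColors = car_colors and carTextColors = car_text_colors. If there are more CARs than colors, the remaining CARs are colored in grayscale, as described for the carColors argument.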

Value

A plot to the default graphics device (but not "RStudioGD") or a PDF file.

The plot visualizes the data contained in BLOCKS for each focal genome segment in ordfocal, arranged along the y-axis. Each synteny block is represented by a column (corresponding to rows in BLOCKS), and information on each block is visualized in rows (corresponding to columns in BLOCKS). Note that separate blocks are also generated when the hierarchical structure of the underlying PQ-tree changes, therefore not all column boundaries are caused by a rearrangement. Rows provide information on the structure of each PQ-tree and its alignment to the focal genome, and whether blocks are part of different classes of rearrangements. For details on PQ-trees see the description of the "compgenome" class in the Details section of the checkInfile function, Booth & Lueker 1976, Chauve & Tannier 2008, or the package vignette.

For each focal genome segment, the top row gives the number of markers within each block, followed by a row that gives the IDs of the CARs. The remaining rows are optional and are controlled by the values in plotelem.
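Because plotelem is positional, it can help to build it from a named vector and strip the names before passing it on. This is only a readability sketch; the names follow the documented order c(nor, bid, bor, eid, rea), and whether the function tolerates a named vector is not assumed here.

```r
# Names are for readability only; the documented form is positional:
# c(nor, bid, bor, eid, rea), with 1 = plot and 0 = omit.
plotelem_named <- c(nor = 1, bid = 1, bor = 0, eid = 0, rea = 1)

# Unnamed vector in the documented form, suitable for the plotelem argument.
plotelem <- unname(plotelem_named)

# Rows that will be drawn in addition to the two mandatory rows.
shown <- names(plotelem_named)[plotelem_named == 1]
```

With these values, the node orientation, block ID, and rearrangement rows are drawn, while block orientation and element ID rows are omitted.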

Up to four rows (depending on the values in the argument plotelem) describe the structure of each PQ-tree and its alignment to the focal genome, and are stacked for each level of the PQ-tree hierarchy.

The first row (nor) gives the alignment orientation of the PQ-tree node to the focal genome, with white rectangles and "+" indicating ascending (i.e., standard), and black rectangles and "-" indicating descending (i.e., inverted) alignment. Nodes that have no alignment direction (e.g., single-marker nodes) are in gray, and P-nodes (which have no fixed ordering and thus no direction) have gray shaded rectangles.

The second row (bid) gives the block ID. For Q-nodes, IDs are consecutive and start at 1, separately for each node and each hierarchy level, and reflect the order of synteny blocks. Identical IDs mean that blocks might be joined, but are split either by an insertion of another CAR or because of a change in the hierarchical structure of the underlying PQ-tree. Block IDs with ".1" or ".2" suffixes indicate rearrangements that may arise by either an inversion or a syntenic move between adjacent blocks, but that were classified as syntenic move for the sake of parsimony. For P-nodes, the bid row is empty unless the node is part of a rearrangement, in which case IDs indicate different rearrangements, but not block order.

The third row (bor) gives the block orientation within its node. It has the same color and symbol coding as nor above. For example, a "-" block within a "+" node indicates either an inversion or a syntenic move between adjacent blocks.

The fourth row (eid) gives the range of element IDs for each block within its node and for its level of hierarchy. These IDs correspond to the node elements in the odd columns of compgenome (note that some IDs within blocks or in-between might be missing when markers in the compared genome are absent from the focal genome).
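The column layout of compgenome that the eid row refers to can be sketched with a toy data frame. The first three column names follow the argument description; the names type1, elem1, type2, elem2 and all values are invented for illustration and are not taken from the package data. See checkInfile for the actual format.

```r
# Toy stand-in for a compgenome data frame (hypothetical values).
toy_compgenome <- data.frame(
  marker      = c("m1", "m2", "m3"),
  orientation = c("+", "+", "-"),
  car         = c(1, 1, 1),
  type1       = c("Q", "Q", "Q"),  # node type, hierarchy level 1
  elem1       = c(1, 1, 2),        # node element, hierarchy level 1
  type2       = c("Q", "Q", NA),   # node type, hierarchy level 2
  elem2       = c(1, 2, NA),       # node element, hierarchy level 2
  stringsAsFactors = FALSE
)

# After the first three columns, node type and node element alternate, so
# node elements sit in the odd columns 5, 7, ... and node types in 4, 6, ...
elem_cols <- seq(5, ncol(toy_compgenome), by = 2)
type_cols <- seq(4, ncol(toy_compgenome), by = 2)

names(toy_compgenome)[elem_cols]
```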

The final set of rows (rea) indicates whether blocks are part of different classes of rearrangements. Horizontal lines that are at identical height denote the same rearrangement (potentially disrupted by inserted CARs). Green are class I nonsyntenic moves (NM1); blue are class II nonsyntenic moves (NM2); purple are syntenic moves (SM); maroon are inversions (IV). Inversions that involve only a single marker (i.e., markers with switched orientation) are indicated by a short vertical rather than a horizontal line. Lighter coloration denotes smaller weights for rearrangement tags in the respective matrices in BLOCKS. Unless the argument remThld is set to a value smaller than that of remWgt used in the computeRearrs function, only lines for blocks that are more parsimonious to have changed position relative to alternative blocks are plotted. If simplifyTags = FALSE, all tags for NM2 and SM will be plotted for completeness, i.e., including those that are duplicated due to the functioning of the underlying algorithm in computeRearrs.
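To draw every rearrangement component and tag, remThld has to be smaller than the remWgt value used in computeRearrs, and simplifyTags has to be FALSE. The sketch below assumes the rearrvisr package and its TOY24 data are installed; the remWgt value of 0.05 is an assumption chosen for illustration.

```r
rem_wgt  <- 0.05  # assumed value passed to computeRearrs as remWgt
rem_thld <- 0     # smaller than rem_wgt, so no components are filtered out

if (requireNamespace("rearrvisr", quietly = TRUE)) {
  library(rearrvisr)
  SYNT <- computeRearrs(TOY24_focalgenome, TOY24_compgenome, doubled = TRUE,
                        remWgt = rem_wgt)
  BLOCKS <- summarizeBlocks(SYNT, TOY24_focalgenome, TOY24_compgenome,
                            c("1", "2", "3"))

  # Plot all components (remThld < remWgt) and keep duplicated NM2 and SM
  # tags (simplifyTags = FALSE); draw on the current graphics device.
  genomeRearrPlot(BLOCKS, TOY24_compgenome, c("1", "2", "3"), main = "TOY24",
                  remThld = rem_thld, simplifyTags = FALSE, newdev = FALSE)
}
```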

References

Booth, K.S. & Lueker, G.S. (1976). Testing for the consecutive ones property, interval graphs, and graph planarity using PQ-Tree algorithms. Journal of Computer and System Sciences, 13, 335–379. doi: 10.1016/S0022-0000(76)80045-1.

Chauve, C. & Tannier, E. (2008). A methodological framework for the reconstruction of contiguous regions of ancestral genomes and its application to mammalian genomes. PLOS Computational Biology, 4, e1000234. doi: 10.1371/journal.pcbi.1000234.

See Also

checkInfile, computeRearrs, summarizeBlocks, genomeImagePlot. For more information about arguments that are passed to other functions, see dev.new, pdf, plot, par.

Examples

SYNT <- computeRearrs(TOY24_focalgenome, TOY24_compgenome, doubled = TRUE)
BLOCKS <- summarizeBlocks(SYNT, TOY24_focalgenome, TOY24_compgenome,
                          c("1", "2", "3"))

genomeRearrPlot(BLOCKS, TOY24_compgenome, c("1", "2", "3"), main = "TOY24")

## Not run: 

## make PDF (automatically determine the width and height of the graphic)
genomeRearrPlot(BLOCKS, TOY24_compgenome, c("1", "2", "3"), main = "TOY24",
                makepdf = TRUE, newdev = FALSE, filename = "rearr.pdf")

## make PDF (default dimensions, i.e., square format)
pdf(file = "rearr.pdf")
genomeRearrPlot(BLOCKS, TOY24_compgenome, c("1", "2", "3"), main = "TOY24",
                makepdf = FALSE, newdev = FALSE)
dev.off()

## plot in RStudio window
op <- options(device = "RStudioGD")
genomeRearrPlot(BLOCKS, TOY24_compgenome, c("1", "2", "3"), main = "TOY24",
                newdev = FALSE)
options(op)

## make EPS, and set user-specified dimensions
setEPS()
postscript("rearr.eps", width = 4.5, height = 6.0, pointsize = 9)
genomeRearrPlot(BLOCKS, TOY24_compgenome, c("1", "2"), main = "TOY24",
                pad = 1, y0pad = 1, makepdf = FALSE, newdev = FALSE)
dev.off()

## End(Not run)

dorolin/rearrvisr documentation built on Aug. 6, 2020, 1:32 a.m.