summarizeRearrs: Summarize Rearrangements

View source: R/summarizeRearrs.R

Description

For each focal genome segment, summarize the number and type of rearrangement events and the number of breakpoints.

Usage

summarizeRearrs(SYNT, focalgenome, compgenome, ordfocal, remWgt = 0.05,
  remThld = 0)

Arguments

SYNT

A list of matrices that store data on different classes of rearrangements and additional information. SYNT must have been generated with the computeRearrs function (optionally filtered with the filterRearrs function).

focalgenome

Data frame representing the focal genome, containing the mandatory columns $marker, $scaff, $start, $end, and $strand, and optional further columns. Markers need to be ordered by their map position.
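As a rough illustration of the expected layout, the sketch below builds a small focal genome data frame and checks the mandatory columns. All values are invented for illustration, and the ordering check assumes that "map position" means ascending $start within each scaffold, which is an interpretation rather than something stated above.

```r
# Hypothetical toy focal genome: marker IDs, scaffold names, and
# coordinates are invented for illustration only.
focalgenome <- data.frame(
  marker = 1:6,
  scaff  = c("1", "1", "1", "2", "2", "3"),
  start  = c(100, 500, 900, 50, 400, 10),
  end    = c(200, 600, 1000, 150, 500, 110),
  strand = c("+", "-", "+", "+", "+", "-"),
  stringsAsFactors = FALSE
)

# Check that all mandatory columns are present.
required <- c("marker", "scaff", "start", "end", "strand")
has_required <- all(required %in% colnames(focalgenome))

# Assumption: "ordered by map position" is taken to mean that start
# positions do not decrease within a scaffold.
is_ordered <- all(tapply(focalgenome$start, focalgenome$scaff,
                         function(x) !is.unsorted(x)))
```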

compgenome

Data frame representing the compared genome (e.g., an ancestral genome reconstruction, or an extant genome), with the first three columns $marker, $orientation, and $car, followed by columns alternating node type and node element. Markers need to be ordered by their node elements. compgenome must be the same data frame that was used to generate the list SYNT with the computeRearrs function.

ordfocal

Character vector with the IDs of the focal genome segments that will be summarized. The IDs must match (a subset of) the IDs in focalgenome$scaff.
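One way to confirm this before calling the function is to compare the requested IDs against the scaffold column. This is a minimal sketch in base R; scaff_ids stands in for focalgenome$scaff and its values are invented.

```r
# Hypothetical scaffold column; in practice this would be focalgenome$scaff.
scaff_ids <- c("1", "1", "2", "2", "3")

# Segments to summarize, as a character vector.
ordfocal <- c("1", "3")

# Any ID listed here is requested but absent from the focal genome.
unmatched <- setdiff(ordfocal, unique(scaff_ids))
ids_ok <- length(unmatched) == 0
```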

remWgt

A numeric value between 0 (inclusive) and 0.5 (exclusive). Needs to match the value for remWgt used in the computeRearrs function.

remThld

A numeric value between 0 (inclusive) and 0.5 (exclusive). Controls whether breakpoints are output for rearrangement components that are less parsimonious to have changed position than the alternative components. To output all breakpoints, remThld must be smaller than the value of remWgt used in the computeRearrs function.
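The range constraints on remWgt and remThld can be checked up front. The sketch below uses a hypothetical helper, check_rem, which is not part of the package; it only encodes the interval [0, 0.5) described above and the condition for outputting all breakpoints.

```r
# Hypothetical helper (not part of rearrvisr): validate that a value
# lies in the half-open interval [0, 0.5).
check_rem <- function(x, name) {
  if (!is.numeric(x) || length(x) != 1 || is.na(x) || x < 0 || x >= 0.5) {
    stop(name, " must be a single numeric value in [0, 0.5)")
  }
  invisible(x)
}

remWgt  <- 0.05  # default shown in Usage
remThld <- 0     # default shown in Usage

check_rem(remWgt, "remWgt")
check_rem(remThld, "remThld")

# All breakpoints are output only when remThld is smaller than remWgt.
all_breakpoints <- remThld < remWgt
```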

Details

Only rearrangements that have components tagged with values larger than remWgt will be counted. For proper functioning, remWgt should correspond to the value that has been used to generate SYNT. The number of nonsyntenic moves is computed as the maximum of class I and class II nonsyntenic moves per focal genome segment.
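The per-segment maximum described above can be pictured with two count vectors. The counts below are invented, and the vector names are hypothetical; the sketch only shows the element-wise maximum, not how the package derives the class I and class II counts.

```r
# Hypothetical per-segment counts of class I and class II nonsyntenic
# moves; names are focal genome segment IDs.
nm_class1 <- c("1" = 2, "2" = 0, "3" = 1)
nm_class2 <- c("1" = 1, "2" = 3, "3" = 1)

# Element-wise maximum per focal genome segment.
nonsyntenic <- pmax(nm_class1, nm_class2)
```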

The number of breakpoints is computed based on the getBreakpoints function. To include the breakpoint of origin for nonsyntenic and syntenic moves in the estimate, remThld needs to be set to zero (which is the default). Note that this may nevertheless underestimate the number of breakpoints as the location of origin is not determined for all rearrangements. When the input is a filtered version of SYNT (i.e., filtered with the filterRearrs function), the number of breakpoints may be overestimated. This can be prevented by increasing the value of remThld to match the value of remWgt. However, this may then underestimate the number of breakpoints as some breakpoints of origin will not be counted. Breakpoints that fall on identical positions are only counted once.
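The rule that breakpoints at identical positions are counted once amounts to counting distinct positions per segment. The following sketch assumes a simplified, hypothetical representation of breakpoints as (segment, position) pairs; the actual output format of getBreakpoints is not described here.

```r
# Hypothetical breakpoint table: one row per breakpoint reported by
# any rearrangement class, with invented positions.
bp <- data.frame(
  scaff = c("1", "1", "1", "2", "2"),
  pos   = c(2.5, 2.5, 4.5, 1.5, 1.5),
  stringsAsFactors = FALSE
)

# Count distinct positions per segment, so that coinciding breakpoints
# contribute only once.
n_breakpoints <- tapply(bp$pos, bp$scaff, function(x) length(unique(x)))
```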

Value

A matrix with the number of identified nonsyntenic moves, syntenic moves, inversions, and breakpoints in columns, for the set of focal genome segments in ordfocal in rows.

See Also

computeRearrs, filterRearrs, getBreakpoints.

Examples

dorolin/rearrvisr documentation built on Aug. 6, 2020, 1:32 a.m.