View source: R/genome2PQtree.R
Convert a one-dimensional genome map into a two-dimensional PQ-structure that can be used as compgenome input for the functions computeRearrs, summarizeBlocks, and genomeRearrPlot.
genome2PQtree(genomemap)
genomemap: Data frame representing the genome map to be converted, containing the mandatory columns $marker, $scaff, $start, $end, and $strand.
genomemap must contain the mandatory columns $marker (a character or integer vector that gives the IDs of markers), $scaff (a character vector that gives the ID of the genome segment of origin of each marker), $start and $end (numeric vectors that specify the location of each marker on its genome segment), and $strand (a vector of "+" and "-" characters that indicate the reading direction of each marker). Additional columns are ignored and may store custom information. Markers need to be ordered by their map position within each genome segment, for example by running the orderGenomeMap function.
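To illustrate the input format described above, here is a minimal base-R sketch that builds a toy genome map and sorts it by position within each genome segment. All marker names, segment names, and coordinates are invented, and the manual sorting only stands in for what orderGenomeMap is described as doing; it is not that function's implementation.

```r
# Toy genome map with the five mandatory columns (all values are made up).
genomemap <- data.frame(
  marker = c("m3", "m1", "m2", "m5", "m4"),
  scaff  = c("chrA", "chrA", "chrA", "chrB", "chrB"),
  start  = c(900, 100, 400, 700, 50),
  end    = c(1200, 300, 800, 950, 600),
  strand = c("+", "+", "-", "-", "+"),
  note   = c("a", "b", "c", "d", "e"),  # extra column; described as ignored
  stringsAsFactors = FALSE
)

# Check that the mandatory columns are present and strands are valid.
required <- c("marker", "scaff", "start", "end", "strand")
missing_cols <- setdiff(required, names(genomemap))
stopifnot(length(missing_cols) == 0,
          all(genomemap$strand %in% c("+", "-")))

# Sort markers by start position within each genome segment, keeping
# segments in their order of first appearance (an assumption made for
# this sketch, not necessarily what orderGenomeMap does).
seg_rank <- match(genomemap$scaff, unique(genomemap$scaff))
genomemap_sorted <- genomemap[order(seg_rank, genomemap$start), ]
rownames(genomemap_sorted) <- NULL
```

A data frame prepared this way has the shape that genome2PQtree expects as its genomemap argument.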
Important: If the converted genome map is used as compgenome input for the function computeRearrs, it is crucial that all genome segments in the $scaff column of genomemap represent contiguous sets of genetic markers. Genome segments that are (potentially) overlapping, such as minor scaffolds or contigs that were not assembled into chromosomes and might in fact be part of assembled chromosomes or enclosed in other scaffolds, need to be excluded from genomemap prior to its conversion.
A data frame encoding the marker order in genomemap as a two-dimensional PQ-structure (i.e., in PQ-tree format). IDs in the $car column of the output are assigned according to the order of genome segments as they appear in the $scaff column of genomemap. Markers that are NA in the genome map are excluded from the output. For additional details on the output format, see the description of the "compgenome" class in the Details section of the checkInfile function, or the package vignette.
The unambiguously ordered genome segments in the one-dimensional genome map genomemap can be seen as a subclass of PQ-trees, where each genome segment is encoded by a single Q-node that only contains leaves as children. Accordingly, the returned PQ-structure has exactly five columns: $marker, $orientation, $car, one column for node type (always "Q"), and one for node element (ranging from 1 to the number of non-NA markers within a genome segment).
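The five-column layout described above can be mimicked in base R to make the structure concrete. This is only an illustrative sketch, not the package's implementation: the column names nodetype and nodeelem are hypothetical (the source does not name these two columns), and copying $strand into $orientation and numbering elements consecutively within each segment are assumptions.

```r
# Toy ordered genome map; one marker ID is NA and should be dropped.
toy_map <- data.frame(
  marker = c("m1", "m2", NA, "m4", "m5"),
  scaff  = c("chrA", "chrA", "chrA", "chrB", "chrB"),
  strand = c("+", "-", "+", "-", "+"),
  stringsAsFactors = FALSE
)

# Exclude markers that are NA.
keep <- toy_map[!is.na(toy_map$marker), ]

# CAR IDs follow the order in which genome segments appear in $scaff.
car <- match(keep$scaff, unique(keep$scaff))

# Node element: running index of each marker within its segment
# (assumed here to be consecutive, 1..n per segment).
element <- ave(seq_along(car), car, FUN = seq_along)

pq_sketch <- data.frame(
  marker      = keep$marker,
  orientation = keep$strand,  # assumption: taken directly from $strand
  car         = car,
  nodetype    = "Q",          # hypothetical column name
  nodeelem    = element,      # hypothetical column name
  stringsAsFactors = FALSE
)
print(pq_sketch)
```

Each genome segment becomes one Q-node whose children are all leaves, which is why a single node-type column and a single node-element column suffice.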
orderGenomeMap, checkInfile, computeRearrs, summarizeBlocks, genomeRearrPlot.
## Not run:
## Exclude potentially overlapping minor scaffolds from genome map:
SIM_markers_chr <- SIM_markers[is.element(SIM_markers$scaff,
                               c("2L", "2R", "3L", "3R", "4", "X")), ]

## Convert genome map into PQ-structure:
SIM_compgenome <- genome2PQtree(SIM_markers_chr)

## Print a translation between names of genome segments and CAR IDs:
head(data.frame(chr = unique(SIM_markers_chr$scaff),
                car = seq_along(unique(SIM_markers_chr$scaff)),
                stringsAsFactors = FALSE))
## End(Not run)