Normalise chromosome labels.
Usage:
normChr(x)

Arguments:
x    Vector of chromosome labels.

Value:
Vector of normalised chromosome labels.
Although for some yeast genome assemblies they are equivalent, chromosomes
(cell structures containing genetic material) are treated by shmootl
as being distinct from sequences (linkage units that correspond to all
or part of a chromosome). This distinction is necessary to allow for use of
reference genomes in which multiple sequences map to a single chromosome.
(See genomeOpt for more on setting a reference genome.) While
every sequence must be mapped to a specific chromosome, it is sequences,
and not chromosomes, that are used as the primary linkage unit throughout
this package.
A yeast nuclear chromosome can be represented by an Arabic number in the
range 1 to 16, inclusive; or by the Roman numeral corresponding
to the chromosome number. The mitochondrial chromosome can be represented by
the number 17 or a capital 'M'. A chromosome label can include
one of the optional prefixes 'c' or 'chr'. So for example, any
of the following can represent chromosome 4:
4: an Arabic number
IV: a Roman numeral
c04: a zero-padded Arabic number with prefix 'c'
chrIV: a Roman numeral with prefix 'chr'
Using the function normChr, all of these representations can
be normalised to one consistent form: a zero-padded Arabic number
(i.e. '04'). This is used internally by shmootl as a
normalised representation, and is recommended.
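The labelling rules above can be illustrated with a short base-R sketch. This is not shmootl's implementation of normChr: the function name norm_chr_sketch is hypothetical, and the choice to map 'M' to '17' and to return NA for unrecognised labels is an assumption made for the example.

```r
# Hypothetical base-R sketch of the labelling rules described above.
# This is NOT shmootl's implementation of normChr.
norm_chr_sketch <- function(x) {
  x <- as.character(x)
  # Strip an optional 'chr' or 'c' prefix.
  core <- sub("^(chr|c)", "", x)
  # Roman numerals for the 16 nuclear chromosomes, in order.
  romans <- as.character(utils::as.roman(1:16))
  num <- rep(NA_integer_, length(core))
  # Arabic numbers, with or without zero-padding.
  is_arabic <- grepl("^[0-9]+$", core)
  num[is_arabic] <- as.integer(core[is_arabic])
  # Roman numerals I to XVI.
  is_roman <- core %in% romans
  num[is_roman] <- match(core[is_roman], romans)
  # Assumption: the mitochondrial label 'M' is treated as chromosome 17.
  num[!is.na(core) & core == "M"] <- 17L
  # Assumption: anything outside 1..17 is reported as NA.
  num[!is.na(num) & (num < 1L | num > 17L)] <- NA_integer_
  out <- sprintf("%02d", num)
  out[is.na(num)] <- NA_character_
  out
}

# All four representations of chromosome 4 collapse to "04".
norm_chr_sketch(c("4", "IV", "c04", "chrIV"))
```

With shmootl installed, the equivalent call would be normChr(c("4", "IV", "c04", "chrIV")); the sketch only mirrors the rules stated in this section.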
For genomes in which every sequence represents a specific chromosome, the
sequence label is identical to the chromosome label. In other cases, the
sequence label should be a chromosome label followed by a sequence-specific
label (e.g. contig ID), separated by an underscore. For example, a contig
'1D22' that maps to chromosome 4 can be represented as follows:
4_1D22
IV_1D22
c04_1D22
chrIV_1D22
Variations in chromosome representation are possible as before, but the
sequence-specific label must be consistent. As with chromosomes, the function
normSeq can be used to normalise all of these forms to
one consistent representation: a zero-padded Arabic number followed by the
sequence-specific label (i.e. '04_1D22'). This representation is
recommended, as it is used internally by shmootl as a standard way to
label sequences in a genome lacking a one-to-one correspondence between
sequences and chromosomes.
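The sequence-label convention can be sketched in the same way. Again, this is an illustration in base R rather than shmootl's normSeq: the names chr_number and norm_seq_sketch are hypothetical, and splitting at the first underscore is an assumption about how the chromosome part is separated from the sequence-specific label.

```r
# Hypothetical base-R sketch; NOT shmootl's implementation of normSeq.

# Convert one chromosome label to its number (1..17), or NA if unrecognised.
chr_number <- function(label) {
  if (is.na(label)) return(NA_integer_)
  core <- sub("^(chr|c)", "", label)
  romans <- as.character(utils::as.roman(1:16))
  if (grepl("^[0-9]+$", core)) {
    n <- as.integer(core)
  } else if (core %in% romans) {
    n <- match(core, romans)
  } else if (core == "M") {
    n <- 17L  # assumption: 'M' is treated as chromosome 17
  } else {
    n <- NA_integer_
  }
  if (!is.na(n) && (n < 1L || n > 17L)) n <- NA_integer_
  n
}

norm_seq_sketch <- function(x) {
  vapply(as.character(x), function(label) {
    # Assumption: the chromosome part ends at the first underscore.
    chr_part <- sub("_.*$", "", label)
    n <- chr_number(chr_part)
    if (is.na(n)) return(NA_character_)
    chr <- sprintf("%02d", n)
    if (grepl("_", label, fixed = TRUE)) {
      # Keep the sequence-specific label exactly as given.
      suffix <- sub("^[^_]*_", "", label)
      paste(chr, suffix, sep = "_")
    } else {
      chr
    }
  }, character(1), USE.NAMES = FALSE)
}

# All four forms of contig '1D22' on chromosome 4 collapse to "04_1D22".
norm_seq_sketch(c("4_1D22", "IV_1D22", "c04_1D22", "chrIV_1D22"))
```

Only the chromosome part is rewritten; the sequence-specific label passes through unchanged, which is why it must be consistent across inputs.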
Other chromosome/sequence functions: formatChr, formatSeq, isNormChr,
isNormSeq, normSeq, orderChr, orderSeq, rankChr, rankSeq, sortChr, sortSeq