Description Slots Chromosomes and Sequences Phenotype IDs Time-Series Phenotypes Locus IDs Sample IDs Tetradic Samples
A CrossInfo object holds yeast cross information for a specific cross
object. The contents of its slots should match the corresponding cross object.
To view documentation for any method of this class, enter the name of
the method preceded by a question mark (e.g. ?getPhenotypes).
seq
A non-redundant character vector of sequence identifiers, with
the name of each element being the name of the given sequence. See also
setSequences and getSequences.
pheno
A non-redundant vector of cross phenotypes, with the name of each
element being the syntactically valid name of the phenotype ID (as output by
the function make.names). See also setPhenotypes and getPhenotypes.
markers
A data.frame with information about the non-redundant set of markers in a
cross (see setMarkers and getMarkers). This can optionally contain
information about the sequences corresponding to each marker (see
setMarkerSeqs and getMarkerSeqs).
samples
A data.frame with information about the samples in a cross. At minimum,
this must contain indices of the samples in the given cross dataset. If
relevant, it can contain information about sample IDs (see setSamples and
getSamples), strain indices (see setStrainIndices and getStrainIndices),
and tetrad indices (see setTetradIndices and getTetradIndices).
alleles
A vector of cross allele symbols. See setAlleles and getAlleles.
genotypes
A vector of cross genotype symbols. See setGenotypes and getGenotypes.
crosstype
Cross type. See setCrosstype and getCrosstype.
Although for some yeast genome assemblies they are equivalent, chromosomes
(cell structures containing genetic material) are treated by shmootl
as being distinct from sequences (linkage units that correspond to all
or part of a chromosome). This distinction is necessary to allow for use of
reference genomes in which multiple sequences map to a single chromosome.
(See genomeOpt for more on setting a reference genome.) While
every sequence must be mapped to a specific chromosome, it is sequences,
and not chromosomes, that are used as the primary linkage unit throughout
this package.
A yeast nuclear chromosome can be represented by an Arabic number in the
range 1 to 16, inclusive, or by the Roman numeral corresponding
to the chromosome number. The mitochondrial chromosome can be represented by
the number 17 or a capital 'M'. A chromosome label can include
one of the optional prefixes 'c' or 'chr'. So for example, any
of the following can represent chromosome 4:
4: an Arabic number
IV: a Roman numeral
c04: a zero-padded Arabic number with prefix 'c'
chrIV: a Roman numeral with prefix 'chr'
Using the function normChr, all of these representations can
be normalised to one consistent form: a zero-padded Arabic number
(i.e. '04'). This is used internally by shmootl as a
normalised representation, and is recommended.
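The normalisation rules above can be illustrated with a short base R sketch.
This is not the implementation of normChr: the helper name norm_chr_sketch is
hypothetical, and the handling of invalid labels (stopping with an error) is
an assumption made for illustration only.

```r
# Hypothetical helper: a base R sketch of chromosome-label normalisation.
# It is NOT shmootl's normChr, only an illustration of the documented rules.
norm_chr_sketch <- function(x) {
  # Strip an optional 'chr' or 'c' prefix.
  label <- sub("^(chr|c)", "", x)
  num <- rep(NA_integer_, length(label))

  # Arabic numbers, possibly zero-padded.
  is_arabic <- grepl("^[0-9]+$", label)
  num[is_arabic] <- as.integer(label[is_arabic])

  # A capital 'M' denotes the mitochondrial chromosome (number 17).
  is_mito <- label == "M"
  num[is_mito] <- 17L

  # Anything else is tried as a Roman numeral; invalid numerals become NA.
  is_roman <- !is_arabic & !is_mito
  num[is_roman] <- suppressWarnings(
    as.integer(utils::as.roman(label[is_roman]))
  )

  # Assumption: labels outside the documented range 1 to 17 are an error.
  if (anyNA(num) || any(num < 1L | num > 17L)) {
    stop("invalid chromosome label")
  }

  # Zero-pad to two digits.
  sprintf("%02d", num)
}

# All four documented representations of chromosome 4 give '04'.
norm_chr_sketch(c("4", "IV", "c04", "chrIV"))
```

In practice normChr should be preferred, since it reflects the conventions
actually used inside the package.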
For genomes in which every sequence represents a specific chromosome, the
sequence label is identical to the chromosome label. In other cases, the
sequence label should be a chromosome label followed by a sequence-specific
label (e.g. contig ID), separated by an underscore. For example, a contig
'1D22' that maps to chromosome 4 can be represented as follows:
4_1D22
IV_1D22
c04_1D22
chrIV_1D22
Variations in chromosome representation are possible as before, but the
sequence-specific label must be consistent. As with chromosomes, the function
normSeq can be used to normalise all of these forms to
one consistent representation: a zero-padded Arabic number followed by the
sequence-specific label (i.e. '04_1D22'). This representation is
recommended, as it is used internally by shmootl as a standard way to
label sequences in a genome lacking a one-to-one correspondence between
sequences and chromosomes.
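As a rough illustration of the sequence-label rules above, the following base
R sketch splits each label at the first underscore, normalises the chromosome
part, and reattaches the sequence-specific part. It is not the implementation
of normSeq: the helper name norm_seq_sketch is hypothetical, and splitting at
the first underscore is an assumption based on the examples given.

```r
# Hypothetical helper: a base R sketch of sequence-label normalisation.
# It is NOT shmootl's normSeq, only an illustration of the documented rules.
norm_seq_sketch <- function(x) {
  # Assumption: the chromosome part ends at the first underscore, and
  # everything after it is the sequence-specific label (e.g. a contig ID).
  has_suffix <- grepl("_", x, fixed = TRUE)
  chr_part <- sub("_.*$", "", x)
  suffix <- ifelse(has_suffix, sub("^[^_]*_", "", x), "")

  # Normalise the chromosome part: strip any prefix, then read it as an
  # Arabic number, 'M' (mitochondrial, 17), or a Roman numeral.
  label <- sub("^(chr|c)", "", chr_part)
  num <- vapply(label, function(s) {
    if (grepl("^[0-9]+$", s)) return(as.integer(s))
    if (s == "M") return(17L)
    suppressWarnings(as.integer(utils::as.roman(s)))
  }, integer(1), USE.NAMES = FALSE)

  if (anyNA(num) || any(num < 1L | num > 17L)) {
    stop("invalid chromosome label in sequence label")
  }

  # Reattach the sequence-specific label, if there is one.
  chr <- sprintf("%02d", num)
  ifelse(has_suffix, paste(chr, suffix, sep = "_"), chr)
}

# All four documented forms of contig '1D22' on chromosome 4 give '04_1D22'.
norm_seq_sketch(c("4_1D22", "IV_1D22", "c04_1D22", "chrIV_1D22"))
```

A label without an underscore is treated here as a whole-chromosome sequence,
matching the case in which the sequence label is identical to the chromosome
label.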
A phenotype ID can be any valid item ID (see package overview), although it
may be changed by R/qtl to ensure that it is syntactically valid. In such
cases, the original phenotype ID can be obtained from the 'info' attribute of
a cross that has been loaded with readCrossCSV (see CrossInfo).
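To show the kind of change that can occur, the sketch below applies the base R
function make.names to a few made-up phenotype IDs and keeps the originals
alongside the syntactically valid names. The phenotype IDs are invented for
illustration, and the named vector is only an analogy for how original and
valid names can be paired; it does not use any shmootl function.

```r
# Made-up phenotype IDs, as they might appear in an input file.
pheno_ids <- c("growth rate", "37C", "pH_5.5", "colony size (mm)")

# make.names() is the base R function that produces syntactically valid names:
# invalid characters become '.', and 'X' is prepended where needed.
valid_names <- make.names(pheno_ids)

# Pair each original ID with its syntactically valid name, so that the
# original can still be looked up after renaming.
pheno <- stats::setNames(pheno_ids, valid_names)
pheno

# Recover the original ID from a syntactically valid name.
pheno[["X37C"]]
```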
A set of phenotypes can be designated as a time-series by naming each
phenotype with the time point at which phenotype observations were made
(e.g. '0.0', '1.0', '2.0'). Time points can be in
any unit, but must be non-negative, monotonically increasing, and have
a consistent time step. If some time points are missing, the resulting
gap in time must be a multiple of the time step.
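The time-series conditions above can be expressed as a small base R check.
This is a sketch only: the helper name is_time_series_sketch and the tolerance
argument are hypothetical, "monotonically increasing" is read as strictly
increasing, and the smallest observed gap is assumed to be the time step.

```r
# Hypothetical helper, not part of shmootl: tests whether phenotype names
# satisfy the documented time-series conditions.
is_time_series_sketch <- function(pheno_names, tol = 1e-8) {
  times <- suppressWarnings(as.numeric(pheno_names))

  # Every name must parse as a number; at least two are needed for a step.
  if (length(times) < 2L || anyNA(times)) return(FALSE)

  # Time points must be non-negative and (assumed strictly) increasing.
  steps <- diff(times)
  if (any(times < 0) || any(steps <= 0)) return(FALSE)

  # Assumption: the smallest gap is the time step. Every gap must then be a
  # whole multiple of it, which allows for missing time points.
  ratio <- steps / min(steps)
  all(abs(ratio - round(ratio)) < tol)
}

is_time_series_sketch(c("0.0", "1.0", "2.0"))  # regular series
is_time_series_sketch(c("0.0", "1.0", "3.0"))  # one time point missing
is_time_series_sketch(c("0.0", "1.0", "2.5"))  # inconsistent time step
```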
Map locus IDs can be any valid item ID (see package overview), and are of
two main types: markers and pseudomarkers. A marker ID is any valid locus ID
that is not a pseudomarker ID. Pseudomarker IDs are used by R/qtl for
inter-marker loci. They indicate the reference sequence and genetic map
position of the locus (e.g. 'c04.loc33' for a locus at position 33 cM on
chromosome IV).
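The distinction between marker and pseudomarker IDs can be sketched with a
regular expression in base R. The pattern below is an assumption inferred from
the single example 'c04.loc33' and may not cover every form that R/qtl or
shmootl accepts; the helper name parse_pseudomarker_sketch and the marker ID
'YDR001C' are hypothetical.

```r
# Hypothetical helper, not part of shmootl: splits pseudomarker IDs of the
# form 'c04.loc33' into a sequence label and a map position in cM.
parse_pseudomarker_sketch <- function(ids) {
  # Assumed pattern: 'c', a sequence label, '.loc', then a position.
  pattern <- "^c(.+)\\.loc([0-9]+(\\.[0-9]+)?)$"
  is_pseudo <- grepl(pattern, ids)

  # Extract the parts only for IDs that match; others are treated as markers.
  seq_label <- rep(NA_character_, length(ids))
  pos_cM <- rep(NA_real_, length(ids))
  seq_label[is_pseudo] <- sub(pattern, "\\1", ids[is_pseudo])
  pos_cM[is_pseudo] <- as.numeric(sub(pattern, "\\2", ids[is_pseudo]))

  data.frame(id = ids, is_pseudomarker = is_pseudo, seq = seq_label,
             pos_cM = pos_cM, stringsAsFactors = FALSE)
}

parse_pseudomarker_sketch(c("c04.loc33", "c04.loc33.5", "YDR001C"))
```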
A sample ID can be any valid item ID (see package overview). Duplicate sample IDs are permissible, but only if referring to replicate samples of the same strain. Different strains can have different numbers of replicates, but samples from a given strain must be in consecutive rows.
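The requirement that replicate samples occupy consecutive rows can be checked
with run-length encoding in base R. This is a sketch under stated assumptions:
the helper name check_replicates_sketch and the sample IDs are hypothetical,
and the check is not taken from shmootl itself.

```r
# Hypothetical helper, not part of shmootl: checks that rows sharing a
# sample ID (replicates of one strain) are in consecutive rows.
check_replicates_sketch <- function(sample_ids) {
  # Collapse the IDs into runs of identical consecutive values.
  runs <- rle(as.character(sample_ids))
  list(
    # Valid only if no ID starts more than one run.
    consecutive = anyDuplicated(runs$values) == 0L,
    # Number of rows in each run, named by sample ID.
    replicates = stats::setNames(runs$lengths, runs$values)
  )
}

# Made-up sample IDs with differing numbers of replicates per strain.
check_replicates_sketch(c("S1", "S1", "S2", "S2", "S2", "S3"))

# Replicates of 'S1' are split by another strain, so the check fails.
check_replicates_sketch(c("S1", "S2", "S1"))$consecutive
```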
Sample IDs can be used to indicate tetrad membership, even in a cross
object with some missing samples. In a tetradic dataset, sample IDs with a
numeric suffix (e.g. 'FS101') are taken as segregant numbers and used
to infer the tetrad to which each sample ID belongs, assuming that tetrads
are labelled sequentially, with four samples per tetrad. Sample IDs can also
have an alphanumeric suffix (e.g. 'FS01A'), where the numeric part is
a tetrad number and the final letter (i.e. 'A', 'B', 'C', or 'D')
identifies the individual tetrad member.
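The two naming schemes above can be sketched in base R as follows. This is an
illustration rather than shmootl's own inference: the helper name
infer_tetrads_sketch is hypothetical, segregant numbering is assumed to start
at 1 (so that segregants 1 to 4 form tetrad 1), and all IDs in a dataset are
assumed to follow the same scheme.

```r
# Hypothetical helper, not part of shmootl: infers tetrad numbers from
# sample IDs under the two documented naming schemes.
infer_tetrads_sketch <- function(sample_ids) {
  if (all(grepl("[0-9]+[A-D]$", sample_ids))) {
    # Alphanumeric suffix (e.g. 'FS01A'): the digits give the tetrad number
    # and the final letter gives the tetrad member.
    suffix <- regmatches(sample_ids, regexpr("[0-9]+[A-D]$", sample_ids))
    tetrad <- as.integer(sub("[A-D]$", "", suffix))
    member <- sub("^[0-9]+", "", suffix)
  } else if (all(grepl("[0-9]+$", sample_ids))) {
    # Numeric suffix (e.g. 'FS101'): a segregant number. Assumption:
    # numbering starts at 1, with four consecutive segregants per tetrad.
    segregant <- as.integer(
      regmatches(sample_ids, regexpr("[0-9]+$", sample_ids))
    )
    tetrad <- (segregant - 1L) %/% 4L + 1L
    # No member letter is defined for this scheme.
    member <- rep(NA_character_, length(sample_ids))
  } else {
    stop("sample IDs do not follow a recognised tetradic pattern")
  }
  data.frame(sample = sample_ids, tetrad = tetrad, member = member,
             stringsAsFactors = FALSE)
}

# Segregant 'FS103' is missing, but tetrad membership can still be inferred.
infer_tetrads_sketch(c("FS101", "FS102", "FS104", "FS105"))

# Alphanumeric scheme: tetrad number plus member letter.
infer_tetrads_sketch(c("FS01A", "FS01B", "FS02A"))
```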