argmax.geno | R Documentation |
Uses the Viterbi algorithm to identify the most likely sequence of underlying genotypes, given the observed multipoint marker data, with possible allowance for genotyping errors.
argmax.geno(cross, step=0, off.end=0, error.prob=0.0001,
map.function=c("haldane","kosambi","c-f","morgan"),
stepwidth=c("fixed", "variable", "max"))
cross |
An object of class |
step |
Maximum distance (in cM) between positions at which the
genotypes are reconstructed, though for |
off.end |
Distance (in cM) past the terminal markers on each chromosome to which the genotype reconstructions will be carried. |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer or Morgan map function when converting genetic distances into recombination fractions. |
stepwidth |
Indicates whether the intermediate points should be placed with
fixed or variable step sizes. We recommend using
|
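As a rough illustration of what the map.function argument controls, the following sketch converts a map distance in cM into a recombination fraction under each of the four map functions. The helper name cm_to_rf is hypothetical and is not part of R/qtl; the formulas are the standard textbook forms, and the Carter-Falconer case is inverted numerically because it is usually written as distance in terms of recombination fraction.

```r
# Hypothetical helper (not an R/qtl function): map distance (cM) -> recombination fraction.
# Assumes distances well below roughly 350 cM for the "c-f" case.
cm_to_rf <- function(d_cM, map.function = c("haldane", "kosambi", "c-f", "morgan")) {
  map.function <- match.arg(map.function)
  d <- d_cM / 100  # distance in Morgans
  switch(map.function,
    haldane = 0.5 * (1 - exp(-2 * d)),
    kosambi = 0.5 * tanh(2 * d),
    morgan  = pmin(d, 0.5),
    "c-f"   = vapply(d, function(x) {
      if (x == 0) return(0)
      # Carter-Falconer: d = (atanh(2r) + atan(2r)) / 4; solve for r numerically.
      f <- function(r) 0.25 * (atanh(2 * r) + atan(2 * r)) - x
      uniroot(f, c(0, 0.5 - 1e-12), tol = 1e-12)$root
    }, numeric(1))
  )
}

# Recombination fractions at 10 cM under each map function
rf_10 <- sapply(c("haldane", "kosambi", "c-f", "morgan"),
                function(mf) cm_to_rf(10, mf))
```

At short distances the four functions give nearly the same recombination fraction; they differ mainly in how much crossover interference they assume.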
We use the Viterbi algorithm to calculate
\arg \max_v \Pr(g = v | O)
where g is the underlying sequence of genotypes and O is the observed marker genotypes.
This is done by calculating
\gamma_k(v_k) = \max_{v_1, \ldots, v_{k-1}} \Pr(g_1 = v_1, \ldots, g_k = v_k, O_1, \ldots, O_k)
for k = 1, \ldots, n and then tracing back through the sequence.
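The recursion for \gamma_k and the traceback can be sketched in base R as follows. This is a minimal illustration for a generic discrete hidden Markov model, not the internal implementation used by argmax.geno; the function viterbi_sketch, the helper emit_for, and the two-genotype toy model (recombination fraction 0.1, symmetric error model) are all assumptions made for the example.

```r
# Hypothetical sketch, not R/qtl's internal code.
# init:  initial state probabilities (length S)
# trans: list of n-1 transition matrices; trans[[k]][i, j] = Pr(g_{k+1} = j | g_k = i)
# emit:  n x S matrix; emit[k, v] = Pr(O_k | g_k = v), set to 1 where O_k is missing
viterbi_sketch <- function(init, trans, emit) {
  n <- nrow(emit)
  S <- length(init)
  logg <- matrix(-Inf, n, S)         # log gamma_k(v)
  back <- matrix(NA_integer_, n, S)  # best predecessor of state v at position k
  logg[1, ] <- log(init) + log(emit[1, ])
  for (k in seq_len(n)[-1]) {
    for (v in seq_len(S)) {
      cand <- logg[k - 1, ] + log(trans[[k - 1]][, v])
      back[k, v] <- which.max(cand)
      logg[k, v] <- max(cand) + log(emit[k, v])
    }
  }
  # Trace back from the best final state
  path <- integer(n)
  path[n] <- which.max(logg[n, ])
  for (k in rev(seq_len(n - 1))) path[k] <- back[k + 1, path[k + 1]]
  path
}

# Toy model: genotypes 1 = A, 2 = H; NA marks a missing observation.
r  <- 0.1
tr <- matrix(c(1 - r, r, r, 1 - r), 2, 2)
emit_for <- function(obs, e) {
  t(sapply(obs, function(o) if (is.na(o)) c(1, 1) else ifelse(1:2 == o, 1 - e, e)))
}

# Observed A, H, A under two assumed error rates
path_low_err  <- viterbi_sketch(c(0.5, 0.5), list(tr, tr), emit_for(c(1, 2, 1), 0.01))
path_high_err <- viterbi_sketch(c(0.5, 0.5), list(tr, tr), emit_for(c(1, 2, 1), 0.05))
```

In this toy setting, the lower error rate keeps the observed H (a double recombinant), while the higher error rate makes it more likely that the H is a genotyping error, so the reconstructed sequence is all A.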
The input cross object is returned with a component, argmax, added to each component of cross$geno.
The argmax component is a matrix of size [n.ind x n.pos], where n.pos is the number of positions at which the reconstructed genotypes were obtained, containing the most likely sequences of underlying genotypes.
Attributes "error.prob", "step", and "off.end" are set to the values of the corresponding arguments, for later reference.
The Viterbi algorithm can behave badly when step is small but positive. One may observe quite different results for different values of step.
The problem is that, in the presence of data like A----H, the sequences AAAAAA and HHHHHH may be more likely than any one of the sequences AAAAAH, AAAAHH, AAAHHH, AAHHHH, AHHHHH. The Viterbi algorithm produces a single "most likely" sequence of underlying genotypes.
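The A----H situation can be checked with a little arithmetic. The sketch below assumes a simplified two-genotype model (A and H equally likely at the first position, Haldane map function, symmetric genotyping error of 0.01) and two markers 2.5 cM apart; these numbers and the helper names haldane and seq_prob are illustrative assumptions, not values taken from R/qtl.

```r
# Toy two-genotype model (an assumption for illustration only).
e <- 0.01                                              # assumed genotyping error rate
haldane <- function(d_cM) 0.5 * (1 - exp(-2 * d_cM / 100))

# Joint probability of genotype sequence g (1 = A, 2 = H) and the data
# "A at the first position, H at the last, missing in between",
# on equally spaced positions spanning total_cM.
seq_prob <- function(g, total_cM) {
  n_pos <- length(g)
  r     <- haldane(total_cM / (n_pos - 1))             # per-interval recombination fraction
  n_rec <- sum(diff(g) != 0)                           # number of recombination events
  trans <- r^n_rec * (1 - r)^(n_pos - 1 - n_rec)
  emit  <- ifelse(g[1] == 1, 1 - e, e) * ifelse(g[n_pos] == 2, 1 - e, e)
  0.5 * trans * emit
}

# Markers only (two positions): the recombinant sequence AH wins.
p_AH <- seq_prob(c(1, 2), 2.5)
p_AA <- seq_prob(c(1, 1), 2.5)

# Six positions, 0.5 cM apart: any single recombinant sequence now loses to AAAAAA.
p_AAAAAA <- seq_prob(rep(1, 6), 2.5)
p_AAAHHH <- seq_prob(c(1, 1, 1, 2, 2, 2), 2.5)
```

With the finer grid, the probability of a recombination is split across five equally likely recombinant sequences, so each one individually falls below the no-recombination sequence with one genotyping error, even though the five together remain more likely.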
Karl W Broman, broman@wisc.edu
Lange, K. (1999) Numerical analysis for statisticians. Springer-Verlag. Sec 23.3.
Rabiner, L. R. (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77, 257–286.
sim.geno, calc.genoprob, fill.geno
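library(qtl)
data(fake.f2)
# Reconstruct genotypes on a 2 cM grid, extending 5 cM past the terminal markers
fake.f2 <- argmax.geno(fake.f2, step=2, off.end=5, error.prob=0.01)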