mjn | R Documentation |
This function computes the median-joining network (MJN) as described by Bandelt et al. (1999).
mjn(x, epsilon = 0, max.n.cost = 10000, prefix = "median.vector_",
quiet = FALSE)
## S3 method for class 'mjn'
plot(x, shape = c("circles", "diamonds"),
bg = c("green", "slategrey"), labels = FALSE, ...)
x |
a matrix (or data frame) of DNA sequences or binary 0/1
data; an object of class |
epsilon |
tolerance parameter. |
max.n.cost |
the maximum number of costs to be computed. |
prefix |
the prefix used to label the median vectors. |
quiet |
a logical value; by default, the progress of the calculations is printed. |
shape , bg |
the default shapes and colours for observed haplotypes and median vectors. |
labels |
by default, the labels of the haplotypes are printed. |
... |
other arguments passed to |
MJN is a network method where unobserved sequences (the median
vectors) are reconstructed and included in the final network. Unlike
mst, rmst, and msn, mjn works with the original sequences, the
distances being calculated internally using a Hamming distance method
(with dist(x, "manhattan") for binary data or dist.dna(x, "N") for
DNA sequences).
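On 0/1 data, the Manhattan distance between two rows equals the number of positions at which they differ, which is why it can serve as a Hamming distance. The following sketch checks this with base R only, reusing the binary matrix from the examples. The helper hamming() is hypothetical, written for illustration, and is not part of pegas.

```r
## binary data from Table 1 of Bandelt et al. (1999)
x <- matrix(c(0, 0, 0, 0, 0, 0, 0, 0, 0,
              1, 1, 1, 1, 0, 0, 0, 0, 0,
              1, 0, 0, 0, 1, 1, 1, 0, 0,
              0, 1, 1, 1, 1, 1, 0, 1, 1),
            4, 9, byrow = TRUE)
rownames(x) <- LETTERS[1:4]

## hypothetical helper (not in pegas): count the positions
## where each pair of rows differs
hamming <- function(m) {
    n <- nrow(m)
    res <- matrix(0, n, n, dimnames = list(rownames(m), rownames(m)))
    for (i in seq_len(n))
        for (j in seq_len(n))
            res[i, j] <- sum(m[i, ] != m[j, ])
    res
}

d.man <- as.matrix(dist(x, "manhattan"))
d.ham <- hamming(x)

## both matrices should hold the same pairwise distances
all(d.man == d.ham)
```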
The parameter epsilon controls how the search for new median vectors
is performed: the larger this parameter, the wider the search (see the
example with binary data).
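One way to see the effect of epsilon is to count the nodes of the network for several values. The sketch below assumes that the number of rows of the data attribute of the returned object equals the number of nodes (observed haplotypes plus median vectors), as suggested by the description of the returned value. The call is guarded so that the code is skipped if pegas is not installed.

```r
## binary data from Table 1 of Bandelt et al. (1999)
x <- matrix(c(0, 0, 0, 0, 0, 0, 0, 0, 0,
              1, 1, 1, 1, 0, 0, 0, 0, 0,
              1, 0, 0, 0, 1, 1, 1, 0, 0,
              0, 1, 1, 1, 1, 1, 0, 1, 1),
            4, 9, byrow = TRUE)
rownames(x) <- LETTERS[1:4]

eps <- 0:2  # values of epsilon to compare

if (requireNamespace("pegas", quietly = TRUE)) {
    ## assumption: one row of the 'data' attribute per node
    n.nodes <- sapply(eps, function(e) {
        nt <- pegas::mjn(x, epsilon = e, quiet = TRUE)
        nrow(attr(nt, "data"))
    })
    names(n.nodes) <- paste0("epsilon=", eps)
    print(n.nodes)
}
```

A wider search is not expected to return fewer nodes, but the exact counts depend on the data, so none are quoted here.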
If the sequences are very divergent, the search for new median vectors
can take a very long time. The argument max.n.cost controls how many
such vectors are added to the network (the default value should
prevent the function from running endlessly).
The arguments shape and bg must be of length two (unlike in
plot.haploNet). It is possible to have more flexibility when plotting
the MJN by changing its class, for instance with the output in the
examples below: class(nt0) <- "haploNet".
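As a sketch of the class change described above, the code below drops the "mjn" class so that plot() dispatches to plot.haploNet, and then passes one background colour per node. It assumes that plot.haploNet accepts a bg vector with one colour per node, and that the number of nodes equals the number of rows of the data attribute; both should be checked against the installed version of pegas. The code is skipped if pegas is not installed.

```r
## binary data from Table 1 of Bandelt et al. (1999)
x <- matrix(c(0, 0, 0, 0, 0, 0, 0, 0, 0,
              1, 1, 1, 1, 0, 0, 0, 0, 0,
              1, 0, 0, 0, 1, 1, 1, 0, 0,
              0, 1, 1, 1, 1, 1, 0, 1, 1),
            4, 9, byrow = TRUE)
rownames(x) <- LETTERS[1:4]

if (requireNamespace("pegas", quietly = TRUE)) {
    nt0 <- pegas::mjn(x, quiet = TRUE)

    ## assumption: one row of the 'data' attribute per node
    n <- nrow(attr(nt0, "data"))

    ## keep only the "haploNet" class so that plot.haploNet is used
    class(nt0) <- "haploNet"

    ## assumption: 'bg' may hold one colour per node
    plot(nt0, bg = rainbow(n))
}
```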
an object of class c("mjn", "haploNet") with an extra attribute
(data) containing the original data together with the median vectors.
Since pegas 1.0, mjn is expected to run in a reasonable time (less
than 15 sec with 100 sequences). Bandelt et al. (1999) reported long
computing times because of the need to compute a lot of median
vectors. Running times also depend on the level of polymorphism in
the data (see above).
Emmanuel Paradis
Bandelt, H. J., Forster, P. and Röhl, A. (1999) Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution, 16, 37–48.
haploNet, mst
## data in Table 1 of Bandelt et al. (1999):
x <- c(0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 1, 0, 0, 0, 0, 0,
1, 0, 0, 0, 1, 1, 1, 0, 0,
0, 1, 1, 1, 1, 1, 0, 1, 1)
x <- matrix(x, 4, 9, byrow = TRUE)
rownames(x) <- LETTERS[1:4]
(nt0 <- mjn(x))
(nt1 <- mjn(x, 1))
(nt2 <- mjn(x, 2))
plot(nt0)
## Not run:
## same as in Fig. 4 of Bandelt et al. (1999):
plotNetMDS(nt2, dist(attr(nt2, "data"), "manhattan"), 3)
## End(Not run)
## data in Table 2 of Bandelt et al. (1999):
z <- list(c("g", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a"),
c("a", "g", "g", "a", "a", "a", "a", "a", "a", "a", "a", "a"),
c("a", "a", "a", "g", "a", "a", "a", "a", "a", "a", "g", "g"),
c("a", "a", "a", "a", "g", "g", "a", "a", "a", "a", "g", "g"),
c("a", "a", "a", "a", "a", "a", "a", "a", "g", "g", "c", "c"),
c("a", "a", "a", "a", "a", "a", "g", "g", "g", "g", "a", "a"))
names(z) <- c("A1", "A2", "B1", "B2", "C", "D")
z <- as.matrix(as.DNAbin(z))
(ntz <- mjn(z, 2))
## Not run:
## same as in Fig. 5 of Bandelt et al. (1999):
plotNetMDS(ntz, dist.dna(attr(ntz, "data"), "N"), 3)
## End(Not run)