This function computes the median-joining network (MJN) as described by Bandelt et al. (1999).

mjn(x, epsilon = 0, max.n.cost = 10000, prefix = "median.vector_", quiet = FALSE)

## S3 method for class 'mjn'
plot(x, shape = c("circles", "diamonds"), bg = c("green", "slategrey"), labels = FALSE, ...)

| Argument | Description |
|---|---|
| `x` | a matrix (or data frame) of DNA sequences or binary 0/1 data; an object of class |
| `epsilon` | tolerance parameter. |
| `max.n.cost` | the maximum number of costs to be computed. |
| `prefix` | the prefix used to label the median vectors. |
| `quiet` | a logical value; by default, the progress of the calculations is printed. |
| `shape, bg` | the default shapes and colours for observed haplotypes and median vectors. |
| `labels` | by default, the labels of the haplotypes are printed. |
| `...` | other arguments passed to |

MJN is a network method where unobserved sequences (the median
vectors) are reconstructed and included in the final network. Unlike
`mst`, `rmst`, and `msn`, `mjn` works with
the original sequences, the distances being calculated internally
using a Hamming distance method (with `dist(x, "manhattan")` for
binary data or `dist.dna(x, "N")` for DNA sequences).
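As a small illustration of the distance described above, the sketch below checks that `dist(x, "manhattan")` on 0/1 data equals the Hamming distance (the number of differing sites). It uses only base R and the binary matrix from Table 1 of Bandelt et al. (1999); the object names `d` and `ham` are chosen here for illustration and are not part of pegas.

```r
# Binary data from Table 1 of Bandelt et al. (1999): 4 haplotypes, 9 sites
x <- matrix(c(0, 0, 0, 0, 0, 0, 0, 0, 0,
              1, 1, 1, 1, 0, 0, 0, 0, 0,
              1, 0, 0, 0, 1, 1, 1, 0, 0,
              0, 1, 1, 1, 1, 1, 0, 1, 1),
            4, 9, byrow = TRUE)
rownames(x) <- LETTERS[1:4]

# Manhattan distance on 0/1 data, as a full square matrix
d <- as.matrix(dist(x, "manhattan"))

# Hamming distance computed by hand: count the sites that differ
n <- nrow(x)
ham <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
for (i in seq_len(n))
    for (j in seq_len(n))
        ham[i, j] <- sum(x[i, ] != x[j, ])

# The two matrices agree for binary data
stopifnot(all(d == ham))
```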

The parameter `epsilon` controls how the search for new median
vectors is performed: the larger this parameter, the wider the search
(see the example with binary data).

If the sequences are very divergent, the search for new median vectors
can take a very long time. The argument `max.n.cost` controls how
many such vectors are added to the network (the default value should
prevent the function from running endlessly).

The arguments `shape` and `bg` must be of length two (unlike
in `plot.haploNet`). It is possible to have more
flexibility when plotting the MJN by changing its class, for instance
with the output in the examples below: `class(nt0) <- "haploNet"`.
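The class change mentioned above might be used as in the following sketch. It assumes pegas is installed and that `plot.haploNet` accepts a single `shape` and `bg` value that is recycled over all nodes; the copy `nt0h` is a name introduced here for illustration only.

```r
library(pegas)

# Binary data from Table 1 of Bandelt et al. (1999)
x <- matrix(c(0, 0, 0, 0, 0, 0, 0, 0, 0,
              1, 1, 1, 1, 0, 0, 0, 0, 0,
              1, 0, 0, 0, 1, 1, 1, 0, 0,
              0, 1, 1, 1, 1, 1, 0, 1, 1),
            4, 9, byrow = TRUE)
rownames(x) <- LETTERS[1:4]

nt0 <- mjn(x, quiet = TRUE)

# Work on a copy (hypothetical name) so the original "mjn" object is kept
nt0h <- nt0
class(nt0h) <- "haploNet"

# Now dispatched to plot.haploNet, so the length-two restriction
# on 'shape' and 'bg' no longer applies
plot(nt0h, shape = "squares", bg = "lightblue")
```

Keeping `nt0` untouched means the `mjn` plot method remains available for comparison.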

An object of class `c("mjn", "haploNet")` with an extra attribute
(data) containing the original data together with the median vectors.
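The returned attribute can be inspected as sketched below. This assumes pegas is installed and that the `"data"` attribute is a matrix whose row names carry the haplotype labels, with the reconstructed median vectors labelled using `prefix`; `dat` and `is_median` are illustrative names, not part of the package.

```r
library(pegas)

# Binary data from Table 1 of Bandelt et al. (1999)
x <- matrix(c(0, 0, 0, 0, 0, 0, 0, 0, 0,
              1, 1, 1, 1, 0, 0, 0, 0, 0,
              1, 0, 0, 0, 1, 1, 1, 0, 0,
              0, 1, 1, 1, 1, 1, 0, 1, 1),
            4, 9, byrow = TRUE)
rownames(x) <- LETTERS[1:4]

nt2 <- mjn(x, 2, quiet = TRUE)

# Original sequences together with the median vectors
dat <- attr(nt2, "data")
dim(dat)

# Assumption: median vectors are the rows labelled with the default prefix
is_median <- grepl("^median\\.vector_", rownames(dat))
rownames(dat)[is_median]
```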

Since pegas 1.0, `mjn` is expected to run in reasonable
times (less than 15 sec with 100 sequences). Bandelt et al. (1999)
reported long computing times because of the need to compute a lot of
median vectors. Running times also depend on the level of polymorphism
in the data (see above).

Emmanuel Paradis

Bandelt, H. J., Forster, P. and Rohl, A. (1999) Median-joining networks
for inferring intraspecific phylogenies. *Molecular Biology and
Evolution*, **16**, 37–48.

`haploNet`, `mst`

library(pegas) # also attaches ape, which provides as.DNAbin and dist.dna

## data in Table 1 of Bandelt et al. (1999):
x <- c(0, 0, 0, 0, 0, 0, 0, 0, 0,
       1, 1, 1, 1, 0, 0, 0, 0, 0,
       1, 0, 0, 0, 1, 1, 1, 0, 0,
       0, 1, 1, 1, 1, 1, 0, 1, 1)
x <- matrix(x, 4, 9, byrow = TRUE)
rownames(x) <- LETTERS[1:4]
(nt0 <- mjn(x))
(nt1 <- mjn(x, 1))
(nt2 <- mjn(x, 2))
plot(nt0)

## Not run:
## same as in Fig. 4 of Bandelt et al. (1999):
plotNetMDS(nt2, dist(attr(nt2, "data"), "manhattan"), 3)
## End(Not run)

## data in Table 2 of Bandelt et al. (1999):
z <- list(c("g", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a"),
          c("a", "g", "g", "a", "a", "a", "a", "a", "a", "a", "a", "a"),
          c("a", "a", "a", "g", "a", "a", "a", "a", "a", "a", "g", "g"),
          c("a", "a", "a", "a", "g", "g", "a", "a", "a", "a", "g", "g"),
          c("a", "a", "a", "a", "a", "a", "a", "a", "g", "g", "c", "c"),
          c("a", "a", "a", "a", "a", "a", "g", "g", "g", "g", "a", "a"))
names(z) <- c("A1", "A2", "B1", "B2", "C", "D")
z <- as.matrix(as.DNAbin(z))
(ntz <- mjn(z, 2))

## Not run:
## same as in Fig. 5 of Bandelt et al. (1999):
plotNetMDS(ntz, dist.dna(attr(ntz, "data"), "N"), 3)
## End(Not run)

