haploNet: Haplotype Networks
In pegas: Population and Evolutionary Genetics Analysis System

haploNet

R Documentation

Description

haploNet computes a haplotype network. There are print and plot methods, as well as conversion methods to classes used by other packages (see Usage).

Usage

haploNet(h, d = NULL, getProb = TRUE)
## S3 method for class 'haploNet'
print(x, ...)
## S3 method for class 'haploNet'
plot(x, size = 1, col, bg, col.link, lwd, lty,
     shape = "circles", pie = NULL, labels, font, cex, col.lab, scale.ratio,
     asp = 1, legend = FALSE, fast = FALSE, show.mutation,
     threshold = c(1, 2), xy = NULL, ...)
## S3 method for class 'haploNet'
as.network(x, directed = FALSE, altlinks = TRUE, ...)
## S3 method for class 'haploNet'
as.igraph(x, directed = FALSE, use.labels = TRUE,
        altlinks = TRUE, ...)
## S3 method for class 'haploNet'
as.phylo(x, quiet, ...)
## S3 method for class 'haploNet'
as.evonet(x, ...)

Arguments

`h`	an object of class `"haplotype"`.
`d`	an object giving the distances among haplotypes (see details).
`getProb`	a logical specifying whether to calculate Templeton's probabilities (see details).
`x`	an object of class `"haploNet"`.
`size`	a numeric vector giving the diameter of the circles representing the haplotypes, in the same unit as the links; recycled if necessary.
`col`	a character vector specifying the colours of the circles; recycled if necessary.
`bg`	a character vector (or a function) specifying either the background colours of the symbols (if `pie = NULL`) or the colours of the pie slices; recycled if necessary.
`col.link`	a character vector specifying the colours of the links; recycled if necessary.
`lwd`	a numeric vector giving the width of the links; recycled if necessary.
`lty`	a vector giving the line types of the links; recycled if necessary.
`shape`	the symbol shape used for the haplotypes: `"circles"`, `"squares"`, or `"diamonds"` (can be abbreviated); recycled if necessary.
`pie`	a matrix used to draw pie charts for each haplotype; its number of rows must be equal to the number of haplotypes.
`labels`	a logical specifying whether to identify the haplotypes with their labels (the default).
`font`	the font used for these labels (bold by default); must be an integer between 1 and 4.
`cex`	a numeric value specifying the character expansion of the labels.
`col.lab`	the colour of the labels.
`scale.ratio`	the ratio of the scale of the links (representing the number of steps) to the scale of the circles (representing the haplotypes). A value greater than one may be needed to avoid overlapping circles.
`asp`	the aspect ratio of the plot. Do not change the default unless you want to distort your network.
`legend`	a logical specifying whether to draw the legend, or a vector of length two giving the coordinates where to draw the legend; `FALSE` by default. If `TRUE`, the user is asked to click where to draw the legend.
`fast`	a logical specifying whether to optimize the spacing of the circles; `FALSE` by default.
`show.mutation`	an integer value: if 0, nothing is drawn on the links; if 1, the mutations are shown with small segments on the links; if 2, they are shown with small dots; if 3, the number of mutations is printed on the links.
`threshold`	a numeric vector with two values (or 0) giving the lower and upper numbers of mutations for alternative links to be displayed. If `threshold = 0`, alternative links are not drawn at all.
`directed`	a logical specifying whether the network is directed (`FALSE` by default).
`use.labels`	a logical specifying whether to use the original labels in the returned network.
`altlinks`	whether to output the alternative links when converting to another class; `TRUE` by default.
`quiet`	whether to give a warning when reticulations are dropped when converting a network into a tree.
`xy`	the coordinates of the nodes (see `replot`).
`...`	further arguments passed to `plot`.
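
As an illustration of the `pie` and `threshold` arguments, here is a minimal sketch. The grouping vector `grp` is hypothetical (it is drawn at random, since woodmouse carries no population labels), and the counts are built from the `"index"` attribute of the haplotype object, assuming it lists the sequences belonging to each haplotype in the same order as the haplotypes.

```r
library(pegas)
data(woodmouse)
set.seed(1)
x <- woodmouse[sample(15, size = 60, replace = TRUE), ]
h <- haplotype(x)
net <- haploNet(h)

## hypothetical grouping of the sequences (for illustration only)
lev <- c("A", "B", "C")
grp <- sample(lev, nrow(x), replace = TRUE)

## one row per haplotype, one column per group: counts of sequences
pie <- t(vapply(attr(h, "index"),
                function(i) as.integer(table(factor(grp[i], levels = lev))),
                integer(3)))
colnames(pie) <- lev

## pies sized by haplotype frequency
plot(net, size = attr(net, "freq"), pie = pie, scale.ratio = 2, cex = 0.8)
## same plot without the alternative links
plot(net, size = attr(net, "freq"), pie = pie, scale.ratio = 2, cex = 0.8,
     threshold = 0)
```

The rows of `pie` must follow the order of the haplotypes in `h`, which is why the matrix is derived from `h` itself rather than from a separate table.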

Details

By default, the haplotype network is built using an infinite site model (i.e., uncorrected or Hamming distance) of DNA sequences and pairwise deletion of missing data (see dist.dna). Users may specify their own distances with the argument d. Labels are not checked, so the user must make sure that the distances are ordered in the same way as the haplotypes.
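
A minimal sketch of supplying the distances explicitly is given below. It assumes that dist.dna can be applied directly to the haplotype object, which keeps the distances in the same order as the haplotypes; the choice of model = "N" simply mirrors the default described above and is not required.

```r
library(pegas)
data(woodmouse)
h <- haplotype(woodmouse)

## distances computed on the haplotypes themselves, so their order matches 'h'
d <- dist.dna(h, model = "N", pairwise.deletion = TRUE)

net <- haploNet(h, d = d)
net
```

Computing d from h (rather than from the original alignment) is the simplest way to avoid a mismatch in ordering, since no label check is performed.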

The probabilities calculated with Templeton et al.'s (1992) method may be non-finite when sequences are very divergent, in which case haploNet is likely to fail with an error during integration. If this happens, it may be better to use getProb = FALSE.
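
One possible way to handle this, sketched here under the assumption that the failure surfaces as an ordinary R error, is to retry without the probabilities:

```r
library(pegas)
data(woodmouse)
h <- haplotype(woodmouse)

## try with Templeton's probabilities first; fall back if an error is raised
net <- tryCatch(haploNet(h),
                error = function(e) haploNet(h, getProb = FALSE))
net
```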

Value

haploNet returns an object of class "haploNet", which is a matrix where each row represents a link in the network. The first and second columns give the numbers of the linked haplotypes; the third column, named "step", gives the number of steps in this link; and the fourth column, named "Prob", gives the probability of a parsimonious link as given by Templeton et al. (1992). There are three additional attributes: "freq", the absolute frequencies of each haplotype; "labels", their labels; and "alter.links", the alternative links of the network.
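
The components described above can be inspected with standard matrix and attribute accessors, as in this short sketch:

```r
library(pegas)
data(woodmouse)
h <- haplotype(woodmouse)
net <- haploNet(h)

net[, "step"]              # number of steps of each link
attr(net, "freq")          # absolute frequency of each haplotype
attr(net, "labels")        # haplotype labels
attr(net, "alter.links")   # alternative links of the network
```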

as.network and as.igraph return objects of the appropriate class.
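
A hedged sketch of a conversion to igraph follows. It assumes that the igraph package is installed and that calling the igraph generic dispatches to the haploNet method once pegas is attached; the call is guarded so that nothing is run otherwise.

```r
library(pegas)
data(woodmouse)
net <- haploNet(haplotype(woodmouse))

if (requireNamespace("igraph", quietly = TRUE)) {
    ## undirected graph including the alternative links (the defaults)
    g <- igraph::as.igraph(net)
    print(g)
}
```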

Note

Plotting haplotype networks is a difficult task. There is a vignette in pegas (see vignette("PlotHaploNet")) giving some information on this issue. You may also see two posts on r-sig-genetics (July 2022) that give some tricks for the situation where one haplotype is abundant and the others are at low frequencies (the symbols are likely to overlap a lot by default):

https://stat.ethz.ch/pipermail/r-sig-genetics/2022-July/000237.html

https://stat.ethz.ch/pipermail/r-sig-genetics/2022-July/000238.html

The first post explains how to use the package network in combination with pegas, and the second one gives a trick that works with pegas only for a similar result.

Author(s)

Emmanuel Paradis, Klaus Schliep

References

Templeton, A. R., Crandall, K. A. and Sing, C. F. (1992) A cladistic analysis of phenotypic association with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics, 132, 619–635.

Examples

## load pegas (which also attaches ape, the source of 'woodmouse'):
library(pegas)
## generate some artificial data from 'woodmouse':
data(woodmouse)
x <- woodmouse[sample(15, size = 110, replace = TRUE), ]
h <- haplotype(x)
(net <- haploNet(h))
plot(net)
## symbol sizes equal to haplotype frequencies:
plot(net, size = attr(net, "freq"), fast = TRUE)
plot(net, size = attr(net, "freq"))
plot(net, size = attr(net, "freq"), scale.ratio = 2, cex = 0.8)

pegas documentation built on May 29, 2024, 2:27 a.m.