# Constructing pan-genome trees

### Description

Creates a pan-genome tree based on a pan-matrix and a distance function.

### Usage

```
panTree(pan.matrix, dist.FUN = distManhattan, nboot = 0,
        linkage = "average", ...)
```

### Arguments

| Argument | Description |
| --- | --- |
| `pan.matrix` | A `Panmat` object, see `panMatrix`. |
| `dist.FUN` | A valid distance function, see Details below. |
| `nboot` | Number of bootstrap samples. |
| `linkage` | The linkage function, see Details below. |
| `...` | Additional parameters passed on to the specified distance function, see Details below. |

### Details

A pan-genome tree is a graphical display of the genomes in a pan-genome study, based on some pan-matrix (Snipen & Ussery, 2010). `panTree` is a constructor that computes a `Pantree` object; use `plot.Pantree` to actually plot the tree.

The parameter `dist.FUN` must be a function that takes as input a numerical matrix (`Panmat` object) and returns a `dist` object. See `distManhattan` or `distJaccard` for examples of such functions. Any additional arguments (`...`) are passed on to this function.
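As an illustration of that contract, here is a minimal sketch of a user-written distance function in base R. The name `distPresenceAbsence` and the toy matrix are hypothetical and not part of the package; the only assumption taken from the text above is that the function receives a numeric matrix with genomes in rows and returns a `dist` object.

```r
# Hypothetical custom distance function (not part of micropan):
# takes a numeric matrix and returns a 'dist' object.
distPresenceAbsence <- function(pan.matrix, ...) {
  # Reduce copy numbers to presence (1) / absence (0)
  binary <- (pan.matrix > 0) * 1
  # On 0/1 data, Manhattan distance counts the gene clusters
  # in which two genomes differ
  dist(binary, method = "manhattan")
}

# Made-up toy matrix standing in for a Panmat object
toy <- matrix(c(1, 2, 0,
                1, 0, 0,
                0, 1, 3),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("G1", "G2", "G3"), c("C1", "C2", "C3")))
d <- distPresenceAbsence(toy)
```

A function of this shape could then be supplied as `dist.FUN = distPresenceAbsence` in the call to `panTree`.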

To get bootstrap values in the tree, set `nboot` to an appropriate number of samples (e.g. `nboot=100`). The default, `nboot = 0`, requests no bootstrap samples.
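How `panTree` performs its bootstrap internally is not spelled out here. The sketch below shows one common way to obtain such counts in base R, under the assumption that gene clusters (columns) are resampled with replacement and the tree is rebuilt each time. The helpers `clades` and `buildTree` and the simulated data are hypothetical.

```r
set.seed(1)
# Simulated toy matrix: 6 genomes (rows) by 40 gene clusters (columns)
toy <- matrix(rpois(6 * 40, lambda = 1), nrow = 6,
              dimnames = list(paste0("G", 1:6), NULL))

# List the clades of an hclust tree, each as a sorted,
# comma-joined string of leaf labels
clades <- function(tree) {
  n <- length(tree$labels)
  members <- vector("list", n - 1)
  for (i in seq_len(n - 1)) {
    # Negative entries in 'merge' are single leaves,
    # positive entries refer to earlier merge steps
    parts <- lapply(tree$merge[i, ], function(j) {
      if (j < 0) tree$labels[-j] else members[[j]]
    })
    members[[i]] <- sort(unlist(parts))
  }
  vapply(members, paste, character(1), collapse = ",")
}

buildTree <- function(m) {
  hclust(dist(m, method = "manhattan"), method = "average")
}

ref <- clades(buildTree(toy))          # clades of the reference tree
nboot <- 100
counts <- setNames(integer(length(ref)), ref)
for (b in seq_len(nboot)) {
  idx <- sample(ncol(toy), replace = TRUE)   # resample columns
  boot <- clades(buildTree(toy[, idx, drop = FALSE]))
  hit <- ref %in% boot                       # clades recovered this round
  counts[hit] <- counts[hit] + 1L
}
```

Here `counts` holds, for each clade of the reference tree, the number of bootstrap trees in which the same clade appears. The clade containing all genomes is always recovered, so its count equals `nboot`.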

The tree is created by `hclust` (hierarchical clustering) using the average linkage function, as described in Snipen & Ussery (2010). You may specify an alternative with the parameter `linkage`; see `hclust` for details.
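To show what the linkage choice changes, the following base R sketch clusters a small made-up matrix with Manhattan distances under two of the methods that `hclust` accepts. It calls `hclust` directly instead of `panTree`, so it illustrates the underlying clustering step only.

```r
# Made-up toy matrix: 4 genomes (rows) by 4 gene clusters (columns)
toy <- matrix(c(0, 0, 1, 1,
                0, 1, 1, 1,
                5, 4, 0, 0,
                5, 5, 0, 1),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("G", 1:4), NULL))

d <- dist(toy, method = "manhattan")

# Average linkage: distance between clusters is the mean pairwise distance
tree.avg <- hclust(d, method = "average")
# Complete linkage: distance between clusters is the largest pairwise distance
tree.cpl <- hclust(d, method = "complete")

tree.avg$height  # 1.0  2.0 10.5
tree.cpl$height  # 1    2   11
```

Both trees join G1 with G2 and G3 with G4 first; only the height of the final merge differs, because the four between-group distances (11, 11, 10, 10) are averaged in one case and maximised in the other.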

### Value

This function returns a `Pantree` object, which is a small (S3) extension to a `list` with 4 components. These components are named `Htree`, `Nboot`, `Nbranch` and `Dist.FUN`.

- `Htree` is an `hclust` object. This is the actual tree.
- `Nboot` is the number of bootstrap samples.
- `Nbranch` is a vector listing the number of times each split/clade in the tree was observed in the bootstrap procedure.
- `Dist.FUN` is the name of the distance function used to construct the tree.
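The layout described above can be pictured with a hand-built mock object. This is only a sketch: a real `Pantree` comes from `panTree`, the class vector `c("Pantree", "list")` is an assumption, and the `Nbranch` counts are made up.

```r
# A small hclust tree to play the role of Htree (made-up data)
htree <- hclust(dist(matrix(c(0, 1, 5, 6), ncol = 1,
                            dimnames = list(paste0("G", 1:4), NULL)),
                     method = "manhattan"),
                method = "average")

# Mock object mirroring the four documented components
mock <- list(Htree    = htree,
             Nboot    = 100,
             Nbranch  = c(97, 88, 100),   # made-up counts
             Dist.FUN = "distManhattan")
class(mock) <- c("Pantree", "list")       # assumed class vector

# Components are reached with `$`, as for any list;
# dividing the counts by Nboot gives the fraction of samples
support <- mock$Nbranch / mock$Nboot
```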

### Author(s)

Lars Snipen and Kristian Hovde Liland.

### References

Snipen, L., Ussery, D.W. (2010). Standard operating procedure for computing pangenome trees. Standards in Genomic Sciences, 2:135-141.

### See Also

`panMatrix`, `distManhattan`, `distJaccard`, `plot.Pantree`.

### Examples

```
library(micropan)

# Loading a Panmat object, constructing a tree and plotting it
data(list = "Mpneumoniae.blast.panmat", package = "micropan")
my.tree <- panTree(Mpneumoniae.blast.panmat)
plot(my.tree)

# Computing some weights to be used in the distManhattan
# function below...
w <- geneWeights(Mpneumoniae.blast.panmat, type = "shell")

# Creating another tree with scaled and weighted distances and bootstrap values
# (scale and weights are passed on to distManhattan through ...)
my.tree <- panTree(Mpneumoniae.blast.panmat, nboot = 100, scale = 0.1, weights = w)

# ...and plotting with alternative labels and colors from Mpneumoniae.table
data(list = "Mpneumoniae.table", package = "micropan")
labels <- Mpneumoniae.table$Strain
names(labels) <- Mpneumoniae.table$GID.tag
cols <- Mpneumoniae.table$Color
names(cols) <- Mpneumoniae.table$GID.tag
plot(my.tree, leaf.lab = labels, col = cols, cex = 0.8,
     xlab = "Shell-weighted Manhattan distances")
```