heaps: Heaps law estimate

View source: R/powerlaw.R

Description

Estimates whether a pan-genome is open or closed, based on a Heaps law model.

Usage

heaps(pan.matrix, n.perm = 100)

Arguments

pan.matrix

A Panmat object, see panMatrix for details.

n.perm

The number of random permutations of genome ordering.

Details

An open pan-genome means that new gene clusters will keep being observed as long as new genomes are sequenced. This may sound controversial, but in a pragmatic view, an open pan-genome indicates that the number of new gene clusters to be observed in future genomes is ‘large’ (but not literally infinite). Conversely, a closed pan-genome indicates that we are approaching the end of new gene clusters.

This function is based on a Heaps law approach suggested by Tettelin et al. (2008). The Heaps law model is fitted to the number of new gene clusters observed as genomes are added one at a time in random order. The model has two parameters, an intercept and a decay parameter called alpha. If alpha>1.0 the pan-genome is closed; if alpha<1.0 it is open.
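
The fitting idea can be illustrated with a minimal base R sketch. It is not the package's implementation: it assumes the power-law form n = Intercept * N^(-alpha) for the number of new clusters n seen at the N-th genome, uses a simulated presence/absence matrix, and fits by least squares with optim. The names pm, new_clusters and rmse are hypothetical and exist only for this illustration.

```r
# Simulated presence/absence matrix (an assumption, not real data):
# rows are genomes, columns are gene clusters.
set.seed(1)
n.genomes  <- 12
n.clusters <- 300
p  <- rbeta(n.clusters, 0.5, 0.5)   # per-cluster presence probability
pm <- matrix(rbinom(n.genomes * n.clusters, 1, rep(p, each = n.genomes)),
             nrow = n.genomes)

# For one random genome ordering, count the clusters first seen
# at genome 2, 3, ..., n.genomes.
new_clusters <- function(pm) {
  cm   <- apply(pm[sample(nrow(pm)), , drop = FALSE], 2, cumsum)
  seen <- cm > 0
  rowSums(seen[-1, , drop = FALSE] & !seen[-nrow(seen), , drop = FALSE])
}

# Repeat over several random orderings and stack the results.
n.perm <- 50
y <- as.numeric(replicate(n.perm, new_clusters(pm)))
x <- rep(2:n.genomes, times = n.perm)

# Least-squares fit of the assumed model y = Intercept * x^(-alpha).
rmse <- function(par, x, y) sqrt(mean((y - par[1] * x^(-par[2]))^2))
fit  <- optim(c(mean(y[x == 2]), 1), rmse, x = x, y = y,
              method = "L-BFGS-B", lower = c(0, 0), upper = c(10000, 2))
est  <- setNames(fit$par, c("Intercept", "alpha"))
print(est)
```

Under this sketch, an estimated alpha below 1.0 would be read as an open pan-genome and a value above 1.0 as a closed one, as described above. The bounds passed to optim are arbitrary choices made for the illustration.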

The number of permutations, n.perm, should be as large as computation time allows. The default value of 100 should be regarded as a minimum.

Word of caution: The Heaps law assumes independent sampling. If some of the genomes in the data set form distinct sub-groups in the population, this may affect the results of this analysis severely.

Value

A vector of two estimated parameters: the Intercept and the decay parameter alpha. If alpha<1.0 the pan-genome is open; if alpha>1.0 it is closed.

Author(s)

Lars Snipen and Kristian Hovde Liland.

References

Tettelin, H., Riley, D., Cattuto, C., Medini, D. (2008). Comparative genomics: the bacterial pan-genome. Current Opinion in Microbiology, 11:472-477.

See Also

binomixEstimate, chao, rarefaction.

Examples

# Loading a Panmat object in the micropan package
data(list = "Mpneumoniae.blast.panmat", package = "micropan")

# Estimating population openness
h.est <- heaps(Mpneumoniae.blast.panmat, n.perm = 500)
if (h.est[2] > 1) {
  cat("Population is closed with alpha =", h.est[2], "\n")
} else {
  cat("Population is open with alpha =", h.est[2], "\n")
}

micropan documentation built on May 29, 2017, 11:57 a.m.