fluidity: Computing genomic fluidity for a pan-genome


View source: R/genomedistances.R

Description

Computes the genomic fluidity, which is a measure of population diversity.

Usage

fluidity(pan.matrix, n.sim = 10)

Arguments

pan.matrix

A pan-matrix; see panMatrix for details.

n.sim

An integer specifying the number of random samples to use in the computations.

Details

The genomic fluidity between two genomes is defined as the number of unique gene families (those found in only one of the two genomes) divided by the total number of gene families in the two genomes (Kislyuk et al., 2011). This is averaged over n.sim random pairs of genomes to obtain a population estimate.

The genomic fluidity between two genomes describes their degree of overlap with respect to gene cluster content. If the fluidity is 0.0, the two genomes contain identical gene clusters. If it is 1.0, the two genomes are non-overlapping. The difference between a Jaccard distance (see distJaccard) and genomic fluidity is small: both measure overlap between genomes, but fluidity is computed for the population by averaging over many pairs, while Jaccard distances are computed for every pair. Note that only the presence or absence of gene clusters is considered, not multiple occurrences.

The input pan.matrix is typically constructed by panMatrix.
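
The definition above can be illustrated with a minimal sketch on a toy pan-matrix. This is not the package's own code: pair_fluidity and toy.panmat are hypothetical names introduced here, and the denominator is assumed to be the sum of the gene cluster counts of the two genomes, following Kislyuk et al. (2011).

```r
# Hypothetical helper, not part of micropan: fluidity between two genomes,
# each given as one row of a pan-matrix (gene cluster counts).
pair_fluidity <- function(g1, g2) {
  p1 <- g1 > 0                       # reduce counts to presence/absence
  p2 <- g2 > 0
  n.unique <- sum(p1 & !p2) + sum(!p1 & p2)
  n.total <- sum(p1) + sum(p2)       # assumed denominator, see lead-in
  n.unique / n.total
}

# Toy pan-matrix: 3 genomes (rows) x 5 gene clusters (columns)
toy.panmat <- matrix(c(1, 2, 0, 1, 0,
                       1, 1, 0, 1, 0,
                       0, 0, 3, 0, 1),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(paste0("G", 1:3), paste0("C", 1:5)))

# G1 and G2 contain the same clusters; G1 and G3 share none
f12 <- pair_fluidity(toy.panmat[1, ], toy.panmat[2, ])
f13 <- pair_fluidity(toy.panmat[1, ], toy.panmat[3, ])

# Population estimate: average over n.sim randomly drawn pairs of genomes
set.seed(1)
n.sim <- 10
flu <- replicate(n.sim, {
  idx <- sample(nrow(toy.panmat), 2)
  pair_fluidity(toy.panmat[idx[1], ], toy.panmat[idx[2], ])
})
est <- c(Mean = mean(flu), Std = sd(flu))
```

Here f12 is 0 because the multiple occurrences of cluster C2 in G1 are ignored, and f13 is 1 because the two genomes have no cluster in common. With few genomes, the sampled pairs repeat often, so a larger n.sim mainly helps for larger pan-matrices.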

Value

A vector with two elements, the mean fluidity and its sample standard deviation over the n.sim computed values.

Author(s)

Lars Snipen and Kristian Hovde Liland.

References

Kislyuk, A.O., Haegeman, B., Bergman, N.H., Weitz, J.S. (2011). Genomic fluidity: an integrative view of gene diversity within microbial populations. BMC Genomics, 12:32.

See Also

panMatrix, distJaccard.

Examples

# Loading a pan-matrix in this package
data(xmpl.panmat)

# Fluidity based on this pan-matrix
fluid <- fluidity(xmpl.panmat)

micropan documentation built on July 15, 2020, 5:08 p.m.