genotypeDiversity calculates diversity statistics based on genotype
frequencies, using a distance matrix to assign individuals to genotypes.
The Shannon and Simpson functions are also available to calculate these
statistics directly from a vector of genotype counts.
genotypeDiversity(genobject, samples = Samples(genobject),
                  loci = Loci(genobject),
                  d = meandistance.matrix(genobject, samples, loci,
                                          all.distances = TRUE,
                                          distmetric = Lynch.distance),
                  threshold = 0, index = Shannon, ...)
Shannon(p, base = exp(1))
Simpson(p)
Simpson.var(p)

genobject 
An object of the class 
samples 
An optional character vector indicating a subset of samples to analyze. 
loci 
An optional character vector indicating a subset of loci to analyze. 
d 
A list such as that produced by 
threshold 
The maximum genetic distance between two samples that can be considered to be the same genotype. 
index 
The diversity index to calculate. This should be 
... 
Additional arguments to pass to 
p 
A vector of counts. 
base 
The base of the logarithm for calculating the Shannon index. This is

genotypeDiversity runs assignClones on the distance matrices for
individual loci and then for all loci, separately for each population.
The results of assignClones are used to calculate a vector of genotype
frequencies, which is passed to index.
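The step from clone assignments to the vector passed to index can be sketched in base R. This is only an illustration under stated assumptions: the clones vector below is made-up data standing in for a clone assignment of five samples in one population, and richness_sketch is a hypothetical index function, not part of the package.

```r
# Made-up clone assignments for five samples in one population:
# an integer genotype label per sample, named by sample.
clones <- c(a = 1L, b = 1L, c = 2L, d = 3L, e = 3L)

# Tally how many samples carry each genotype.
genotype_counts <- as.vector(table(clones))

# Any function whose first argument is a vector of counts could act as
# the index. This hypothetical one simply counts distinct genotypes.
richness_sketch <- function(p) length(p)

richness_sketch(genotype_counts)
```

A user-supplied index would receive one such vector per population and per locus, plus one for all loci combined.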
Shannon calculates the Shannon index, which is:
-\sum \frac{p_i}{N} \ln\left(\frac{p_i}{N}\right)
(or log base 2 or any other base, using the base argument) given a
vector p of genotype counts, where N is the sum of those counts.
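As a minimal sketch of the Shannon calculation described above (the negative sum, over genotypes, of each genotype's proportion times the log of that proportion), the following base R function works from a vector of counts. The name shannon_sketch is hypothetical and this is not the package source. It assumes every count is positive, since a zero count would give NaN.

```r
# Hypothetical re-implementation for illustration only.
# p: vector of positive genotype counts; base: base of the logarithm.
shannon_sketch <- function(p, base = exp(1)) {
  N <- sum(p)          # total number of individuals
  freq <- p / N        # proportion of individuals with each genotype
  -sum(freq * log(freq, base = base))
}

counts <- c(2, 1, 1)
shannon_sketch(counts)            # natural logarithm
shannon_sketch(counts, base = 2)  # 1.5 (proportions 1/2, 1/4, 1/4)
```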
Simpson calculates the Simpson index, which is:
\sum \frac{p_i (p_i - 1)}{N (N - 1)}
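The Simpson calculation can be sketched the same way: for each genotype count, multiply the count by one less than itself, sum the products, and divide by N times N minus 1, where N is the total count. The name simpson_sketch is hypothetical and this is not the package source.

```r
# Hypothetical re-implementation for illustration only.
# p: vector of genotype counts (sum must be at least 2).
simpson_sketch <- function(p) {
  N <- sum(p)
  sum(p * (p - 1)) / (N * (N - 1))
}

simpson_sketch(c(2, 1, 1))  # (2 * 1) / (4 * 3) = 1/6
simpson_sketch(4)           # a single genotype gives 1
```

Read this way, the value is the probability that two individuals drawn without replacement share a genotype, so lower values indicate higher diversity.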
Simpson.var calculates the variance of the Simpson index:
\frac{4N(N-1)(N-2) \sum p_i^3 + 2N(N-1) \sum p_i^2 - 2N(N-1)(2N-3) \left(\sum p_i^2\right)^2}{[N(N-1)]^2}
The variance of the Simpson index can be used to calculate a confidence
interval. For example, the result of Simpson plus or minus twice the
square root of the result of Simpson.var gives an approximate 95%
confidence interval.
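The interval described above might be computed as in the sketch below. The function names are hypothetical and the code is not the package source. One assumption to flag loudly: in the variance expression, p_i is treated here as a proportion (count divided by N), which is how this form of the variance is usually stated. The counts are made-up data.

```r
# Hypothetical helpers for illustration only.
simpson_sketch <- function(p) {
  N <- sum(p)
  sum(p * (p - 1)) / (N * (N - 1))
}

simpson_var_sketch <- function(p) {
  N <- sum(p)
  f <- p / N  # ASSUMPTION: the variance formula uses proportions
  (4 * N * (N - 1) * (N - 2) * sum(f^3) +
     2 * N * (N - 1) * sum(f^2) -
     2 * N * (N - 1) * (2 * N - 3) * sum(f^2)^2) / (N * (N - 1))^2
}

counts <- c(10, 5, 3, 2)              # made-up genotype counts
est <- simpson_sketch(counts)
half_width <- 2 * sqrt(simpson_var_sketch(counts))
ci <- c(lower = est - half_width, upper = est + half_width)
ci
```

With the package itself, the same interval would be built from the outputs of Simpson and Simpson.var rather than from these stand-ins.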
A matrix of diversity index results, with populations in rows and
loci in columns. The final column is called "overall" and gives
the results when all loci are analyzed together.
Lindsay V. Clark
Shannon, C. E. (1948) A mathematical theory of communication. Bell System Technical Journal 27:379–423 and 623–656.
Simpson, E. H. (1949) Measurement of diversity. Nature 163:688.
Lowe, A., Harris, S. and Ashton, P. (2004) Ecological Genetics: Design, Analysis, and Application. Wiley-Blackwell.
Arnaud-Haond, S., Duarte, M., Alberto, F. and Serrao, E. A. (2007) Standardizing methods to address clonality in population studies. Molecular Ecology 16:5115–5139.
http://www.comparingpartitions.info/index.php?link=Tut4
# set up dataset
mydata <- new("genambig", samples=c("a","b","c"), loci=c("F","G"))
Genotypes(mydata, loci="F") <- list(c(115,118,124),c(115,118,124),
                                    c(121,124))
Genotypes(mydata, loci="G") <- list(c(162,170,174),c(170,172),
                                    c(166,180,182))
Usatnts(mydata) <- c(3,2)
# get genetic distances
mydist <- meandistance.matrix(mydata, all.distances=TRUE)
# calculate diversity under various conditions
genotypeDiversity(mydata, d=mydist)
genotypeDiversity(mydata, d=mydist, base=2)
genotypeDiversity(mydata, d=mydist, threshold=0.3)
genotypeDiversity(mydata, d=mydist, index=Simpson)
genotypeDiversity(mydata, d=mydist, index=Simpson.var)
