Description Usage Arguments Details Value References See Also Examples
This function simplifies the process necessary for performing AMOVA in R. It
gives user the choice of utilizing either the ade4 or the pegas
implementation of AMOVA. See ade4::amova()
(ade4) and pegas::amova()
(pegas) for details on the specific implementation.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | poppr.amova(
x,
hier = NULL,
clonecorrect = FALSE,
within = TRUE,
dist = NULL,
squared = TRUE,
freq = TRUE,
correction = "quasieuclid",
sep = "_",
filter = FALSE,
threshold = 0,
algorithm = "farthest_neighbor",
threads = 1L,
missing = "loci",
cutoff = 0.05,
quiet = FALSE,
method = c("ade4", "pegas"),
nperm = 0
)
|
x |
a genind, genclone, genlight, or snpclone object |
hier |
a hierarchical formula that defines your population
hierarchy. (e.g.: |
clonecorrect |
|
within |
|
dist |
an optional distance matrix calculated on your data. If this is
set to |
squared |
if a distance matrix is supplied, this indicates whether or not it represents squared distances. |
freq |
|
correction |
a |
sep |
Deprecated. As of poppr version 2, this argument serves no purpose. |
filter |
|
threshold |
a number indicating the minimum distance two MLGs must be separated by to be considered different. Defaults to 0, which will reflect the original (naive) MLG definition. |
algorithm |
determines the type of clustering to be done.
|
threads |
|
missing |
specify method of correcting for missing data utilizing
options given in the function |
cutoff |
specify the level at which missing data should be
removed/modified. See |
quiet |
|
method |
Which method for calculating AMOVA should be used? Choices refer to package implementations: "ade4" (default) or "pegas". See details for differences. |
nperm |
the number of permutations passed to the pegas implementation of amova. |
The poppr implementation of AMOVA is a very detailed wrapper for the
ade4 implementation. The output is an ade4::amova()
class list that
contains the results in the first four elements. The inputs are contained
in the last three elements. The inputs required for the ade4 implementation
are:
a distance matrix on all unique genotypes (haplotypes)
a data frame defining the hierarchy of the distance matrix
a genotype (haplotype) frequency table.
All of this data can be constructed from a genind or genlight object, but can be daunting for a novice R user. This function automates the entire process. Since there are many variables regarding genetic data, some points need to be highlighted:
The hierarchy is defined by different
population strata that separate your data hierarchically. These strata are
defined in the strata slot of genind,
genlight, genclone, and
snpclone objects. They are useful for defining the
population factor for your data. See the function strata()
for details on
how to properly define these strata.
Heterozygosities within
genotypes are sources of variation from within individuals and can be
quantified in AMOVA. When within = TRUE
, poppr will split genotypes into
haplotypes with the function make_haplotypes()
and use those to calculate
within-individual variance. No estimation of phase is made. This acts much
like the default settings for AMOVA in the Arlequin software package.
Within individual variance will not be calculated for haploid individuals
or dominant markers as the haplotypes cannot be split further. Setting
within = FALSE
uses the euclidean distance of the allele frequencies
within each individual. Note: within = TRUE
is incompatible with
filter = TRUE
. In this case, within
will be set to FALSE
With the ade4 implementation of
AMOVA (utilized by poppr), distances must be Euclidean (due to the
nature of the calculations). Unfortunately, many genetic distance measures
are not always euclidean and must be corrected for before being analyzed.
Poppr automates this with three methods implemented in ade4,
quasieuclid()
, lingoes()
, and cailliez()
. The correction of these
distances should not adversely affect the outcome of the analysis.
Filtering multilocus genotypes is performed by
mlg.filter()
. This can necessarily only be done AMOVA tests that do not
account for within-individual variance. The distance matrix used to
calculate the amova is derived from using mlg.filter()
with the option
stats = "distance"
, which reports the distance between multilocus
genotype clusters. One useful way to utilize this feature is to correct for
genotypes that have equivalent distance due to missing data. (See example
below.)
Both ade4 and pegas have
implementations of AMOVA, both of which are appropriately called "amova".
The ade4 version is faster, but there have been questions raised as to the
validity of the code utilized. The pegas version is slower, but careful
measures have been implemented as to the accuracy of the method. It must be
noted that there appears to be a bug regarding permuting analyses where
within individual variance is accounted for (within = TRUE
) in the pegas
implementation. If you want to perform permutation analyses on the pegas
implementation, you must set within = FALSE
. In addition, while clone
correction is implemented for both methods, filtering is only implemented
for the ade4 version.
As of poppr version 2.7.0, this function is able to calculate phi statistics for within-individual variance for polyploid data with full dosage information. When a data set does not contain full dosage information for all samples, then the resulting pseudo-haplotypes will contain missing data, which would result in an incorrect estimate of variance.
Instead, the AMOVA will be performed on the distance matrix derived from
allele counts or allele frequencies, depending on the freq
option. This
has been shown to be robust to estimates with mixed ploidy (Ronfort et al.
1998; Meirmans and Liu 2018). If you wish to brute-force your way to
estimating AMOVA using missing values, you can split your haplotypes with
the make_haplotypes()
function.
One strategy for addressing ambiguous dosage in your polyploid data set
would be to convert your data to polysat's genambig
class with the
as.genambig()
, estimate allele frequencies with polysat::deSilvaFreq()
,
and use these frequencies to randomly sample alleles to fill in the
ambiguous alleles.
a list of class amova
from the ade4 or pegas package. See
ade4::amova()
or pegas::amova()
for details.
Excoffier, L., Smouse, P.E. and Quattro, J.M. (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics, 131, 479-491.
Ronfort, J., Jenczewski, E., Bataillon, T., and Rousset, F. (1998). Analysis of population structure in autotetraploid species. Genetics, 150, 921–930.
Meirmans, P., Liu, S. (2018) Analysis of Molecular Variance (AMOVA) for Autopolyploids Submitted.
ade4::amova()
, pegas::amova()
, clonecorrect()
, diss.dist()
,
missingno()
, ade4::is.euclid()
, strata()
, make_haplotypes()
,
as.genambig()
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | data(Aeut)
strata(Aeut) <- other(Aeut)$population_hierarchy[-1]
agc <- as.genclone(Aeut)
agc
amova.result <- poppr.amova(agc, ~Pop/Subpop)
amova.result
amova.test <- randtest(amova.result) # Test for significance
plot(amova.test)
amova.test
## Not run:
# You can get the same results with the pegas implementation
amova.pegas <- poppr.amova(agc, ~Pop/Subpop, method = "pegas")
amova.pegas
amova.pegas$varcomp/sum(amova.pegas$varcomp)
# Clone correction is possible
amova.cc.result <- poppr.amova(agc, ~Pop/Subpop, clonecorrect = TRUE)
amova.cc.result
amova.cc.test <- randtest(amova.cc.result)
plot(amova.cc.test)
amova.cc.test
# Example with filtering
data(monpop)
splitStrata(monpop) <- ~Tree/Year/Symptom
poppr.amova(monpop, ~Symptom/Year) # gets a warning of zero distances
poppr.amova(monpop, ~Symptom/Year, filter = TRUE, threshold = 0.1) # no warning
## End(Not run)
|
Loading required package: adegenet
Loading required package: ade4
/// adegenet 2.0.1 is loaded ////////////
> overview: '?adegenet'
> tutorials/doc/questions: 'adegenetWeb()'
> bug reports/feature requests: adegenetIssues()
This is poppr version 2.5.0. To get started, type package?poppr
OMP parallel support: available
This is a genclone object
-------------------------
Genotype information:
119 original multilocus genotypes
187 diploid individuals
56 dominant loci
Population information:
2 strata - Pop, Subpop
2 populations defined - Athena, Mt. Vernon
No missing values detected.
$call
ade4::amova(samples = xtab, distances = xdist, structures = xstruct)
$results
Df Sum Sq Mean Sq
Between Pop 1 1051.2345 1051.234516
Between samples Within Pop 16 273.4575 17.091091
Within samples 169 576.5059 3.411277
Total 186 1901.1979 10.221494
$componentsofcovariance
Sigma %
Variations Between Pop 11.063446 70.006786
Variations Between samples Within Pop 1.328667 8.407483
Variations Within samples 3.411277 21.585732
Total variations 15.803391 100.000000
$statphi
Phi
Phi-samples-total 0.7841427
Phi-samples-Pop 0.2803128
Phi-Pop-total 0.7000679
class: krandtest lightkrandtest
Monte-Carlo tests
Call: randtest.amova(xtest = amova.result)
Number of tests: 3
Adjustment method for multiple comparisons: none
Permutation number: 99
Test Obs Std.Obs Alter Pvalue
1 Variations within samples 3.411277 -31.60269 less 0.01
2 Variations between samples 1.328667 18.43717 greater 0.01
3 Variations between Pop 11.063446 13.20472 greater 0.01
No missing values detected.
Warning message:
In is.euclid(xdist) : Zero distance(s)
Analysis of Molecular Variance
Call: pegas::amova(formula = hier, data = hierdf, nperm = nperm, is.squared = FALSE)
SSD MSD df
Pop 1051.2345 1051.234516 1
Subpop 273.4575 17.091091 16
Error 576.5059 3.411277 169
Total 1901.1979 10.221494 186
Variance components:
Pop Subpop Error
11.063446 1.328667 3.411277
Variance coefficients:
a b c
10.29589 11.16949 93.36898
Pop Subpop Error
0.70006786 0.08407483 0.21585732
No missing values detected.
$call
ade4::amova(samples = xtab, distances = xdist, structures = xstruct)
$results
Df Sum Sq Mean Sq
Between Pop 1 741.9872 741.987234
Between samples Within Pop 16 185.6877 11.605483
Within samples 123 520.1123 4.228555
Total 140 1447.7872 10.341337
$componentsofcovariance
Sigma %
Variations Between Pop 10.4131525 66.777680
Variations Between samples Within Pop 0.9520545 6.105355
Variations Within samples 4.2285550 27.116965
Total variations 15.5937620 100.000000
$statphi
Phi
Phi-samples-total 0.7288303
Phi-samples-Pop 0.1837727
Phi-Pop-total 0.6677768
class: krandtest lightkrandtest
Monte-Carlo tests
Call: randtest.amova(xtest = amova.cc.result)
Number of tests: 3
Adjustment method for multiple comparisons: none
Permutation number: 99
Test Obs Std.Obs Alter Pvalue
1 Variations within samples 4.2285550 -22.149571 less 0.01
2 Variations between samples 0.9520545 9.136886 greater 0.01
3 Variations between Pop 10.4131525 10.094187 greater 0.01
No loci with missing values above 5% found.
Distance matrix is non-euclidean.
Utilizing quasieuclid correction method. See ?quasieuclid for details.
$call
ade4::amova(samples = xtab, distances = xdist, structures = xstruct)
$results
Df Sum Sq Mean Sq
Between Symptom 1 8.751823 8.751823
Between samples Within Symptom 4 220.720591 55.180148
Within samples 688 2414.685005 3.509717
Total 693 2644.157419 3.815523
$componentsofcovariance
Sigma %
Variations Between Symptom -0.1678446 -4.383251
Variations Between samples Within Symptom 0.4873550 12.727243
Variations Within samples 3.5097166 91.656008
Total variations 3.8292270 100.000000
$statphi
Phi
Phi-samples-total 0.08343992
Phi-samples-Symptom 0.12192802
Phi-Symptom-total -0.04383251
Warning messages:
1: In is.euclid(xdist) : Zero distance(s)
2: In is.euclid(distmat) : Zero distance(s)
Filtering ...
Original multilocus genotypes ... 264
Contracted multilocus genotypes ... 236
No loci with missing values above 5% found.
Distance matrix is non-euclidean.
Utilizing quasieuclid correction method. See ?quasieuclid for details.
$call
ade4::amova(samples = xtab, distances = xdist, structures = xstruct)
$results
Df Sum Sq Mean Sq
Between Symptom 1 8.793784 8.793784
Between samples Within Symptom 4 221.212012 55.303003
Within samples 688 2422.331271 3.520830
Total 693 2652.337066 3.827326
$componentsofcovariance
Sigma %
Variations Between Symptom -0.1681045 -4.376428
Variations Between samples Within Symptom 0.4884090 12.715226
Variations Within samples 3.5208303 91.661202
Total variations 3.8411348 100.000000
$statphi
Phi
Phi-samples-total 0.08338798
Phi-samples-Symptom 0.12182086
Phi-Symptom-total -0.04376428
Warning message:
system call failed: Cannot allocate memory
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.