IterateHWE | R Documentation |
These are wrapper function that iteratively run other polyRAD functions until allele frequencies stabilize to within a user-defined threshold. Genotype posterior probabilities can then be exported for downstream analysis.
IterateHWE(object, selfing.rate = 0, tol = 1e-05, excludeTaxa = GetBlankTaxa(object), overdispersion = 9) IterateHWE_LD(object, selfing.rate = 0, tol = 1e-05, excludeTaxa = GetBlankTaxa(object), LDdist = 1e4, minLDcorr = 0.2, overdispersion = 9) IteratePopStruct(object, selfing.rate = 0, tol = 1e-03, excludeTaxa = GetBlankTaxa(object), nPcsInit = 10, minfreq = 0.0001, overdispersion = 9, maxR2changeratio = 0.05) IteratePopStructLD(object, selfing.rate = 0, tol = 1e-03, excludeTaxa = GetBlankTaxa(object), nPcsInit = 10, minfreq = 0.0001, LDdist = 1e4, minLDcorr = 0.2, overdispersion = 9, maxR2changeratio = 0.05)
object |
A |
selfing.rate |
A number ranging from zero to one indicating the frequency of self-fertilization in the species. For individuals with odd ploidy (e.g. triploids), the selfing rate is always treated as zero and a warning is printed if a value above zero is provided. |
tol |
A number indicating when the iteration should end. It indicates the maximum mean difference in allele frequencies between iterations that is tolerated. Larger numbers will lead to fewer iterations. |
excludeTaxa |
A character vector indicating names of taxa that should be excluded from allele frequency estimates and chi-squared estimates. |
nPcsInit |
An integer indicating the number of principal component axes to initially
estimate from |
minfreq |
A number indicating the minimum allele frequency allowed. Passed to
|
LDdist |
The distance, in basepairs, within which to search for alleles that may be in linkage disequilibrium with a given allele. |
minLDcorr |
The minimum correlation coefficient between two alleles
for linkage disequilibrium between those alleles to be used by the pipeline
for genotype estimation; see |
overdispersion |
Overdispersion parameter; see |
maxR2changeratio |
This number determines how many principal component axes are retained. The
difference in R-squared values between the first and second axes
is multiplied by |
For IterateHWE
, the following functions are run iteratively,
assuming no population structure:
AddAlleleFreqHWE
,
AddGenotypePriorProb_HWE
, AddGenotypeLikelihood
,
AddPloidyChiSq
, and AddGenotypePosteriorProb
.
IterateHWE_LD
runs each of the functions listed for IterateHWE
once, then runs AddAlleleLinkages
. It then runs
AddAlleleFreqHWE
, AddGenotypePriorProb_HWE
,
AddGenotypePriorProb_LD
, AddGenotypeLikelihood
,
AddPloidyChiSq
, and AddGenotypePosteriorProb
iteratively until allele frequencies converge.
For IteratePopStruct
, the following functions are run iteratively,
modeling population structure:
AddPCA
, AddAlleleFreqByTaxa
,
AddAlleleFreqHWE
, AddGenotypePriorProb_ByTaxa
,
AddGenotypeLikelihood
, AddPloidyChiSq
, and
AddGenotypePosteriorProb
.
After the first PCA analysis, the number of principal component axes is not
allowed to decrease, and can only increase by one from one round to the next,
in order to help the algorithm converge.
IteratePopStructLD
runs each of the functions listed for
IteratePopStruct
once, then runs AddAlleleLinkages
.
It then runs
AddAlleleFreqHWE
, AddGenotypePriorProb_ByTaxa
,
AddGenotypePriorProb_LD
,
AddGenotypeLikelihood
, AddPloidyChiSq
,
AddGenotypePosteriorProb
, AddPCA
,
and AddAlleleFreqByTaxa
iteratively until convergence of
allele frequencies.
A "RADdata"
object identical to that passed to the function, but with
$alleleFreq
, $priorProb
, $depthSamplingPermutations
,
$genotypeLikelihood
,
$ploidyChiSq
, and $posteriorProb
slots added.
For IteratePopStruct
and IteratePopStructLD
,
$alleleFreqByTaxa
and $PCA
are also added. For
IteratePopStructLD
and IterateHWE_LD
, $alleleLinkages
and $priorProbLD
are also added.
If you see the error
Error in if (rel_ch < threshold & count > 5) { :
missing value where TRUE/FALSE needed
try lowering nPcsInit
.
Lindsay V. Clark
GetWeightedMeanGenotypes
for outputting genotypes in a
useful format after iteration is completed.
StripDown
to remove memory-hogging slots that are no longer
needed after the pipeline has been run.
PipelineMapping2Parents
for mapping populations.
# load dataset data(exampleRAD) # iteratively estimate parameters exampleRAD <- IterateHWE(exampleRAD) # export results GetWeightedMeanGenotypes(exampleRAD) # re-load to run pipeline assuming population structure data(exampleRAD) # run pipeline exampleRAD <- IteratePopStruct(exampleRAD, nPcsInit = 3) # export results GetWeightedMeanGenotypes(exampleRAD) # dataset for LD pipeline data(Msi01genes) # run HWE + LD pipeline mydata1 <- IterateHWE_LD(Msi01genes) # run pop. struct + LD pipeline # (tolerance raised to make example run faster) mydata2 <- IteratePopStructLD(Msi01genes, tol = 0.01)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.