simupopD: Simulate multi-population allele frequencies for independent... In forensim: Statistical tools for the interpretation of forensic DNA mixtures

Description

Simulate multi-population allele frequencies for independent loci, from a given reference population, following a Dirichlet model. Allele frequencies in the populations are generated as random deviates from a Dirichlet distribution, whose parameters control the deviation of allele frequencies from the values in the reference population.

Usage

simupopD(npop = 1, nloc = 1, na = 2, globalfreq = NULL, which.loc = NULL, alpha1, alpha2 = 1)

Arguments

npop        the number of populations
nloc        the number of loci
na          an integer vector giving the number of alleles per locus
globalfreq  matrix of allele frequencies in the reference population. Data must be given in the format of the Journal of Forensic Sciences for genetic data. By default (NULL), allele frequencies are generated from a Dirichlet distribution with parameter alpha2 for all alleles.
which.loc   which loci to simulate from the globalfreq matrix; by default, all loci are considered
alpha1      a positive numeric vector of length npop giving the variance parameter of the Dirichlet distribution used to generate allele frequencies in each of the npop independent populations
alpha2      a positive number giving the parameter of the Dirichlet distribution used to generate allele frequencies for the reference population

Details

In the reference population, allele frequencies for independent loci are simulated using a Dirichlet distribution with parameter alpha2.
At a given locus L with n alleles, the allele frequencies are modeled as a vector of random variables p=(p1, ..., pn) following a Dirichlet distribution with a parameter vector of length n, where each component is equal to alpha2, p1+...+pn=1 and alpha2 > 0.
Note that a more sophisticated generation of global allele frequencies is possible using the simufreqD function.
Similarly, allele frequencies in the independent populations are simulated using a Dirichlet distribution. For example, for the first population to simulate, at a given locus L with n alleles, the allele frequencies are modeled as a vector of random variables following a Dirichlet distribution with a parameter vector of length n:
(p1(1-alpha1)/alpha1, ..., pn(1-alpha1)/alpha1), where p1, ..., pn are the allele frequencies in the reference population, p1+...+pn=1 and alpha1 > 0.
alpha1 is the variance parameter for population 1 and is equivalent to Wright's Fst. The closer this parameter is to one, the more the population allele frequencies differ from the values in the reference population.
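
The two-stage model described above can be sketched in base R. This is an illustration under stated assumptions, not forensim's implementation: rdirichlet_sketch is a hypothetical helper that builds a Dirichlet deviate by normalising independent gamma deviates, and the parameter vector for the population is assumed to be built from the reference frequencies. Under that assumption, a Dirichlet distribution with parameters p_i(1-alpha1)/alpha1 gives allele i a variance of alpha1 * p_i * (1 - p_i), which is why alpha1 plays the role of Fst. The last lines check this empirically.

```r
# Base-R sketch of the two-stage Dirichlet model (not forensim's own code).
# rdirichlet_sketch is a hypothetical helper: one Dirichlet deviate is obtained
# by normalising independent gamma deviates whose shapes are given by 'param'.
rdirichlet_sketch <- function(param) {
  g <- rgamma(length(param), shape = param, rate = 1)
  g / sum(g)
}

set.seed(1)
n      <- 4     # number of alleles at the locus
alpha2 <- 1     # Dirichlet parameter for the reference population
alpha1 <- 0.05  # variance parameter (Fst) for population 1

# Reference population: symmetric Dirichlet(alpha2, ..., alpha2)
p_ref <- rdirichlet_sketch(rep(alpha2, n))

# Population 1: Dirichlet with parameters p_ref * (1 - alpha1) / alpha1
pop_param <- p_ref * (1 - alpha1) / alpha1
p_pop <- rdirichlet_sketch(pop_param)

# Empirical check: the mean should be close to p_ref and the variance
# close to alpha1 * p_ref * (1 - p_ref)
draws    <- t(replicate(20000, rdirichlet_sketch(pop_param)))
emp_mean <- colMeans(draws)
emp_var  <- apply(draws, 2, var)
theo_var <- alpha1 * p_ref * (1 - p_ref)
print(round(rbind(p_ref, p_pop, emp_mean, emp_var, theo_var), 4))
```

Raising alpha1 towards one in this sketch makes the simulated frequencies scatter more widely around p_ref, while values close to zero keep them near the reference values.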

Value

The result is stored in a list with two elements:

globfreq    a tabfreq object giving the allele frequencies of the chosen reference population, with the chosen loci
popfreq     a tabfreq object giving the allele frequencies of the simulated populations
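
A call following the Usage signature might look like the sketch below. It assumes the forensim package is installed; the argument values are arbitrary illustrations, and the element names are those listed in this section.

```r
# Hedged usage sketch; requires the forensim package to be installed.
library(forensim)

set.seed(42)
# Two populations, three loci with 4, 5 and 6 alleles respectively.
# alpha1 holds one value per population. globalfreq is left at NULL, so the
# reference population is simulated with the Dirichlet parameter alpha2.
res <- simupopD(npop = 2, nloc = 3, na = c(4, 5, 6),
                alpha1 = c(0.01, 0.1), alpha2 = 1)

names(res)    # element names documented above: globfreq and popfreq
res$globfreq  # tabfreq object for the reference population
res$popfreq   # tabfreq object for the simulated populations
```

With these illustrative values, the second population (alpha1 = 0.1) is expected to deviate more from the reference frequencies than the first (alpha1 = 0.01).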

Note

The code used here for the generation of random Dirichlet deviates was previously implemented in the gtools library.

Author(s)

Hinda Haned h.haned@nfi.minvenj.nl

References

Nicholson G, Smith AV, Jonsson F, Gustafsson O, Stefansson K, Donnelly P. Assessing population differentiation and isolation from single-nucleotide polymorphism data. J Roy Stat Soc B 2002;64:695–715

Marchini J, Cardon LR. Discussion on the meeting on "Statistical modelling and analysis of genetic data". J Roy Stat Soc B 2002;64:740-741

Wright S. The genetical structure of populations. Ann Eugen 1951;15:323-354