simufreqD: Function to simulate allele frequencies for independent loci... In forensim: Statistical tools for the interpretation of forensic DNA mixtures

Description

The simufreqD function simulates single-population allele frequencies for independent loci. Allele frequencies are generated as random deviates from a Dirichlet distribution, whose parameters control the mean and the variance of the simulated allele frequencies.

Usage

simufreqD(nloc = 1, nal = 2, alpha = 1)

Arguments

nloc    the number of loci to simulate.

nal     the number of alleles per locus. Either an integer, if all loci have the same number of alleles, or an integer vector, if the number of alleles differs between loci.

alpha   the parameter used to simulate allele frequencies from the Dirichlet distribution. If the nloc loci have the same number of alleles, alpha can either be the same for all alleles, in which case alpha is an integer (the default is one, which gives a uniform distribution), or differ between alleles at a given locus, in which case alpha is a matrix of dimensions nal x nloc. When the number of alleles differs between loci, alpha can likewise be the same or differ between alleles at a given locus. In the first case alpha is a vector of length nloc; in the second case alpha is a matrix of dimensions nal x nloc, where NAs are introduced for alleles not seen at a given locus.
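The four shapes that alpha can take are easier to see when built explicitly. The following is a minimal sketch in base R that only constructs the arguments. The object names are illustrative, and the number of rows of the NA-padded matrix (the largest number of alleles) is an assumption, since the text above only says "nal x nloc".

```r
nloc <- 3

# Case 1: same number of alleles at every locus, one alpha for all alleles
nal_same <- 4
alpha_scalar <- 1

# Case 2: same number of alleles, allele-specific alpha:
# a nal x nloc matrix, one column per locus
alpha_matrix <- matrix(c(1, 2, 3, 4), nrow = nal_same, ncol = nloc)

# Case 3: number of alleles differs between loci, one alpha per locus:
# a vector of length nloc
nal_diff <- c(2, 3, 4)
alpha_vector <- c(1, 2, 5)

# Case 4: number of alleles differs, allele-specific alpha:
# a matrix padded with NA for alleles absent at a locus
# (assumed here to have max(nal_diff) rows)
alpha_list <- list(c(1, 1), c(2, 1, 1), c(1, 2, 3, 4))
alpha_padded <- sapply(alpha_list, function(a) {
  c(a, rep(NA, max(nal_diff) - length(a)))
})

# Following the argument description, these would then be passed as, e.g.:
#   simufreqD(nloc = nloc, nal = nal_same, alpha = alpha_matrix)
#   simufreqD(nloc = nloc, nal = nal_diff, alpha = alpha_padded)
```

In the padded matrix, the number of non-missing entries in each column equals the number of alleles at that locus.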

Details

Allele frequencies for independent loci are simulated using a Dirichlet distribution with parameter alpha. At a given locus L with n alleles, the allele frequencies are modeled as a vector of random variables p=(p1, ..., pn), following a Dirichlet distribution with parameters:
alpha = (alpha1, ..., alphan) where p1+...+pn=1 and alpha1,..., alphan > 0.
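To illustrate how alpha controls the mean and variance, the sketch below draws Dirichlet deviates by normalising independent gamma deviates, a standard construction. It is not the forensim source code: rdirichlet_sketch is a hypothetical helper written for this example. It compares empirical moments with the known Dirichlet moments, E[p_i] = alpha_i / alpha_0 and Var[p_i] = alpha_i (alpha_0 - alpha_i) / (alpha_0^2 (alpha_0 + 1)), where alpha_0 = alpha_1 + ... + alpha_n.

```r
# Hypothetical helper (not part of forensim): draw n Dirichlet vectors
# by normalising independent Gamma(alpha_i, 1) deviates.
rdirichlet_sketch <- function(n, alpha) {
  stopifnot(all(alpha > 0))
  k <- length(alpha)
  # rgamma recycles 'shape' over the n * k draws; filling by row
  # ensures that column i always uses shape alpha[i]
  g <- matrix(rgamma(n * k, shape = alpha), nrow = n, ncol = k, byrow = TRUE)
  # Divide each row by its sum so that the frequencies add up to 1
  g / rowSums(g)
}

set.seed(1)
alpha <- c(1, 2, 5)                # one locus with three alleles
p <- rdirichlet_sketch(10000, alpha)

# Theoretical moments of the Dirichlet distribution
a0 <- sum(alpha)
theo_mean <- alpha / a0
theo_var  <- alpha * (a0 - alpha) / (a0^2 * (a0 + 1))

# Empirical moments of the simulated frequencies
emp_mean <- colMeans(p)
emp_var  <- apply(p, 2, var)

round(rbind(theo_mean, emp_mean, theo_var, emp_var), 4)
```

Scaling all components of alpha by a common factor leaves the mean frequencies unchanged and reduces their variance, which is why larger alpha values give simulated frequencies that cluster more tightly around alpha_i / alpha_0.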

Value

A matrix containing the simulated allele frequencies. The data are presented in the format used by the Journal of Forensic Sciences for genetic data: allele names are given in the first column, and each subsequent column gives the frequencies for one marker, with one row per allele. When an allele is not observed at a given locus, the value is coded NA (instead of "-" in the original format).

Note

The code used here to generate random Dirichlet deviates was previously implemented in the gtools package.

Author(s)

Hinda Haned

References

Johnson NL, Kotz S, Balakrishnan N. Continuous Univariate Distributions, vol 2. John Wiley & Sons, 1995.

Wright S. The genetical structure of populations. Ann Eugen 1951;15:323-354.