Calculate the probability of genotypes based on the product of allele frequencies over all loci.
a genind or genclone object.
either a formula to set the population factor from the
When this is
a vector or matrix of allele frequencies. This defaults to
options for correcting rare alleles. The default is to correct allele frequencies to 1/n.
Pgen is the probability of a given genotype occurring in a population assuming HWE. Thus, the value for diploids is
pgen = prod(p_i)*(2^h)
where p_i are the allele frequencies and h is the number of heterozygous sites in the sample (Arnaud-Haond et al., 2007; Parks and Werth, 1993). By default, the allele frequencies are calculated using a round-robin approach, in which the frequencies at a particular locus are calculated on the genotypes clone-censored without that locus.
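As a worked illustration of the formula above, the following base R sketch computes Pgen by hand for a single diploid genotype. The allele frequencies and locus names are made up for the example and do not come from any data set; the sketch does not call pgen itself.

```r
# Hypothetical allele frequencies for one diploid genotype at three loci.
# Each row holds the frequencies of the two alleles carried at that locus.
allele_freqs <- rbind(
  locus1 = c(0.5, 0.5),  # homozygous: the same allele twice
  locus2 = c(0.2, 0.3),  # heterozygous
  locus3 = c(0.1, 0.4)   # heterozygous
)
heterozygous <- c(locus1 = FALSE, locus2 = TRUE, locus3 = TRUE)

# pgen = prod(p_i) * (2^h), with h the number of heterozygous loci
h <- sum(heterozygous)
pgen_value <- prod(allele_freqs) * 2^h

# The same value, built up one locus at a time:
# a heterozygous locus contributes 2 * p * q, a homozygous locus p * p
per_locus <- apply(allele_freqs, 1, prod) * ifelse(heterozygous, 2, 1)

pgen_value       # 0.0024
prod(per_locus)  # matches pgen_value
```

The per-locus form mirrors the layout of the value returned by this function, which reports one Pgen value per locus for each genotype.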
To avoid issues with numerical precision of small numbers, this function calculates pgen per locus by adding up log-transformed values of allele frequencies. These can easily be transformed to return the true value (see examples).
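The precision issue can be seen in a small base R sketch. The per-locus values below are hypothetical and chosen only to force underflow; they are not output from pgen.

```r
# Hypothetical per-locus Pgen values for one genotype across 400 loci
many_loci <- rep(0.1, 400)

naive_product <- prod(many_loci)      # 1e-400 is too small for a double: gives 0
log_total     <- sum(log(many_loci))  # stays finite on the log scale

naive_product
log_total

# With only a few loci, exponentiating the summed logs recovers the product
few_loci  <- c(0.25, 0.12, 0.08)
recovered <- exp(sum(log(few_loci)))
all.equal(recovered, prod(few_loci))
```

Working on the log scale keeps genotypes comparable even when the product over all loci is too small to represent directly.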
A vector containing Pgen values per locus for each genotype in the object.
For haploids, Pgen at a particular locus is the allele frequency. This
function cannot handle polyploids. Additionally, when the argument
pop is not
by_pop is automatically
Zhian N. Kamvar, Jonah Brooks, Stacy A. Krueger-Hadfield, Erik Sotka
Arnaud-Haond, S., Duarte, C. M., Alberto, F., & Serrão, E. A. 2007. Standardizing methods to address clonality in population studies. Molecular Ecology, 16(24), 5115-5139.
Parks, J. C., & Werth, C. R. 1993. A study of spatial features of clones in a population of bracken fern, Pteridium aquilinum (Dennstaedtiaceae). American Journal of Botany, 537-544.
correcting rare alleles
library(poppr)  # provides pgen(), the Pram data set, and the %>% pipe
data(Pram)
head(pgen(Pram, log = FALSE))

## Not run:
# You can also supply the observed allele frequencies
pramfreq <- Pram %>% genind2genpop() %>% tab(freq = TRUE)
head(pgen(Pram, log = FALSE, freq = pramfreq))

# You can get the Pgen values over all loci by summing over the logged results:
pgen(Pram, log = TRUE) %>%   # calculate pgen matrix
  rowSums(na.rm = TRUE) %>%  # take the sum of each row
  exp()                      # take the exponent of the results

# You can also take the product of the non-logged results:
apply(pgen(Pram, log = FALSE), 1, prod, na.rm = TRUE)

## Rare Allele Correction ---------------------------------------------------
##
# If you don't supply a table of frequencies, they are calculated with rraf
# with correction = TRUE. This is normally benign when analyzing large
# populations, but it can have a great effect on small populations. To help
# control this, you can supply arguments described in
# help("rare_allele_correction").

# Default is to correct by 1/n per population. Since the calculation is
# performed on a smaller sample size due to round robin clone correction, it
# would be more appropriate to correct by 1/rrmlg at each locus. This is
# achieved by setting d = "rrmlg". Since this is a diploid, we would want to
# account for the number of chromosomes, and so we set mul = 1/2
head(pgen(Pram, log = FALSE, d = "rrmlg", mul = 1/2)) # compare with the output above

# If you wanted to treat all alleles as equally rare, then you would set a
# specific value (let's say the rare alleles are 1/100):
head(pgen(Pram, log = FALSE, e = 1/100))

## End(Not run)