Correcting rare allele frequencies
The following is a set of arguments for use in rraf, pgen, and
psex to correct rare allele frequencies
that were lost in estimating round-robin allele frequencies.
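e: a numeric epsilon value to use for all missing allele frequencies.
d: the unit by which to take the reciprocal: "sample", "mlg", or "rrmlg". Overridden by e.
mul: a multiplier for d. Default is mul = 1. Overridden by e.
sum_to_one: when TRUE, the frequencies are rescaled to sum to one after correction. Default is FALSE.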
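By default (d = "sample", e = NULL, sum_to_one = FALSE, mul = 1), this will
assign 1/(n samples) to all zero-valued alleles. The basic formula is given
as 1/(d * m) unless e is specified, in which case e is used directly. Note
that the examples use mul = 1/2 to obtain 1/2n, so m appears to scale the
reciprocal 1/d rather than d itself. If sum_to_one = TRUE, then the
frequencies will be scaled as x/sum(x) AFTER correction, meaning that the
original non-zero allele frequencies will be reduced. See the examples for
details. The general pattern of correction is that the value of the MAF
(minor allele frequency) will be rrmlg > mlg > sample, because the number of
multilocus genotypes at a locus is no larger than the total number of
multilocus genotypes, which in turn is no larger than the number of samples.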
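The correction rule described above can be sketched in base R. This is only an illustration, not poppr's implementation: the function name correct_zero_freqs, its argument n (the count selected by d), and the toy numbers are all assumptions made for the example. It also assumes, following the examples (where mul = 1/2 is used to obtain 1/2n), that mul scales the reciprocal of the count.

```r
# Hypothetical helper (not part of poppr): replace zero-valued allele
# frequencies at one locus with a small value, then optionally rescale.
correct_zero_freqs <- function(freqs, n, e = NULL, mul = 1, sum_to_one = FALSE) {
  # e overrides the count-based value; otherwise assume (1/n) * mul
  replacement <- if (is.null(e)) (1 / n) * mul else e
  freqs[freqs == 0] <- replacement
  if (sum_to_one) {
    # scale AFTER correction, which shrinks the non-zero frequencies
    freqs <- freqs / sum(freqs)
  }
  freqs
}

# Toy locus with one allele lost by the round-robin estimate
locus <- c(a1 = 0.6, a2 = 0.4, a3 = 0)

# Made-up counts standing in for the three choices of d
counts <- c(sample = 100, mlg = 40, rrmlg = 25)

# Replacement value for the lost allele under each d, with mul = 1/2
maf <- sapply(counts, function(n) correct_zero_freqs(locus, n, mul = 1/2)[["a3"]])
print(maf)

# Fixed epsilon, then rescaled so the locus sums to one
print(correct_zero_freqs(locus, n = 100, e = 0.001, sum_to_one = TRUE))
```

Because the counts shrink from sample to mlg to rrmlg, the replacement value grows in that order, which is the rrmlg > mlg > sample pattern noted above.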
When calculating allele frequencies from a round-robin
approach, rare alleles are often lost, resulting in zero-valued allele
frequencies (Arnaud-Haond et al. 2007, Parks and Werth 1993). This can be
problematic when calculating values for pgen and
psex because frequencies of zero will result in undefined
values for samples that contain those rare alleles. The solution to this
problem is to give an estimate for the frequency of those rare alleles, but
the question of HOW to do that arises. These arguments provide a way to
define how rare alleles are to be estimated/corrected.
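To see why a zero frequency is a problem, consider a simplified genotype probability computed as the product of allele frequencies across loci, usually handled on the log scale. This is a sketch under that simplifying assumption only; the variable names and numbers are made up, and it is not how psex is implemented.

```r
# Made-up frequencies of the alleles carried by one sample, one value per
# locus; the third allele was lost by the round-robin estimate
sample_freqs <- c(0.25, 0.5, 0)

# The product of frequencies collapses to zero ...
prob <- prod(sample_freqs)

# ... and the log-scale sum is -Inf, which carries through to anything
# computed from it
log_prob <- sum(log(sample_freqs))

# After replacing the zero with a small value the result is finite again
corrected <- replace(sample_freqs, sample_freqs == 0, 1 / 100)
log_prob_corrected <- sum(log(corrected))

print(c(prob = prob, log_prob = log_prob, log_prob_corrected = log_prob_corrected))
```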
Using these arguments
These arguments are for use in the functions rraf, pgen, and
psex. They will replace the dots (...)
that appear at the end of the function call. For example, if you want to set
the minor allele frequencies to a specific value (let's say 0.001),
regardless of locus, you can insert
e = 0.001 along with any other
arguments. Because it is matched by name, its position in the call does not matter.
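The way a named argument travels through the dots can be sketched with two toy functions. Both toy_rraf and toy_correction are hypothetical stand-ins written for this illustration, not poppr functions; they only mimic the pattern of a user-facing function handing ... on to a correction step.

```r
# Hypothetical stand-in for the internal correction step (not poppr code)
toy_correction <- function(freqs, e = NULL, n = length(freqs)) {
  freqs[freqs == 0] <- if (is.null(e)) 1 / n else e
  freqs
}

# Hypothetical user-facing function: anything in ... is handed on untouched
toy_rraf <- function(freqs, verbose = FALSE, ...) {
  if (verbose) message("correcting zero-valued frequencies")
  toy_correction(freqs, ...)
}

freqs <- c(a1 = 0.7, a2 = 0.3, a3 = 0)

# e is matched by name, so it can sit before or after other arguments
first <- toy_rraf(freqs, e = 0.001, verbose = FALSE)
last  <- toy_rraf(freqs, verbose = FALSE, e = 0.001)
print(identical(first, last))

# With poppr loaded, the analogous call would be, e.g., rraf(Pram, e = 0.001)
```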
Zhian N. Kamvar
Arnaud-Haond, S., Duarte, C. M., Alberto, F., & Serrão, E. A. 2007. Standardizing methods to address clonality in population studies. Molecular Ecology, 16(24), 5115-5139.
Parks, J. C., & Werth, C. R. 1993. A study of spatial features of clones in a population of bracken fern, Pteridium aquilinum (Dennstaedtiaceae). American Journal of Botany, 537-544.
Examples
## Not run:
data(Pram)
#-------------------------------------

# If you set correction = FALSE, you'll notice the zero-valued alleles

rraf(Pram, correction = FALSE)

# By default, however, the data will be corrected by 1/n

rraf(Pram)

# Of course, this is a diploid organism, we might want to set 1/2n

rraf(Pram, mul = 1/2)

# To set MAF = 1/2mlg

rraf(Pram, d = "mlg", mul = 1/2)

# Another way to think about this is, since these allele frequencies were
# derived at each locus with different sample sizes, it's only appropriate to
# correct based on those sample sizes.

rraf(Pram, d = "rrmlg", mul = 1/2)

# If we were going to use these frequencies for simulations, we might want to
# ensure that they all sum to one.

rraf(Pram, d = "mlg", mul = 1/2, sum_to_one = TRUE)

#-------------------------------------
# When we calculate these frequencies based on population, they are heavily
# influenced by the number of observed mlgs.

rraf(Pram, by_pop = TRUE, d = "rrmlg", mul = 1/2)

# This can be fixed by specifying a specific value

rraf(Pram, by_pop = TRUE, e = 0.01)

## End(Not run)