# Correcting rare allele frequencies

### Description

The following is a set of arguments for use in `rraf`, `pgen`, and `psex` to
correct rare allele frequencies that were lost in estimating round-robin
allele frequencies.

### Arguments

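| Argument | Description |
|---|---|
| `e` | a numeric epsilon value to use for all missing allele frequencies. |
| `d` | the unit by which to take the reciprocal. One of `"sample"` (the default), `"mlg"`, or `"rrmlg"`. |
| `mul` | a multiplier for `d`. Default is `mul = 1`. |
| `sum_to_one` | when `TRUE`, the frequencies are rescaled as x/sum(x) after correction so that they sum to one. Default is `FALSE`. |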

### Details

By default (`d = "sample", e = NULL, sum_to_one = FALSE, mul = 1`),
zero-valued allele frequencies are replaced with 1/(n samples). The basic
formula is **(1/d) * m**, where **d** is the count selected by `d` and **m**
is `mul`, unless **e** is specified, in which case **e** is used directly.
If `sum_to_one = TRUE`, then the frequencies will be scaled as x/sum(x) AFTER
correction, meaning that the originally observed allele frequencies will be
reduced. See the examples for details. In general, the size of the corrected
minor allele frequency (MAF) follows *rrmlg > mlg > sample*.
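The correction rule can be sketched in base R. This is a minimal illustration, not the package's implementation: `correct_zero_freqs()` and its `n` argument (the count selected by `d`) are hypothetical names, and the sketch assumes the multiplier scales the reciprocal, so that `mul = 1/2` gives 1/(2n) as in the examples.

```r
# Hypothetical helper (not part of the package): fill zero-valued allele
# frequencies at a single locus.
#   freqs: numeric vector of allele frequencies for the locus
#   n:     the count selected by `d` (samples, MLGs, or MLGs at the locus)
correct_zero_freqs <- function(freqs, n, e = NULL, mul = 1, sum_to_one = FALSE) {
  # An explicit epsilon overrides the reciprocal-based value
  if (is.null(e)) {
    e <- (1 / n) * mul
  }
  freqs[freqs == 0] <- e
  if (sum_to_one) {
    # Rescale AFTER correction; this shrinks the originally observed frequencies
    freqs <- freqs / sum(freqs)
  }
  freqs
}

locus <- c(a1 = 0.75, a2 = 0.25, a3 = 0)
default_fix  <- correct_zero_freqs(locus, n = 20)                    # 1/n
diploid_fix  <- correct_zero_freqs(locus, n = 20, mul = 1/2)         # 1/(2n)
rescaled_fix <- correct_zero_freqs(locus, n = 20, sum_to_one = TRUE) # sums to one
```

Because the replacement value is a reciprocal, a smaller count (as with `"rrmlg"`) yields a larger corrected frequency than a larger count (as with `"sample"`).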

### Motivation

When calculating allele frequencies from a round-robin approach, rare alleles
are often lost, resulting in zero-valued allele frequencies (Arnaud-Haond et
al. 2007, Parks and Werth 1993). This can be problematic when calculating
values for `pgen` and `psex` because frequencies of zero will result in
undefined values for samples that contain those rare alleles. The solution to
this problem is to give an estimate for the frequency of those rare alleles,
but the question of how to do that arises. These arguments provide a way to
define how rare alleles are to be estimated or corrected.

### Using these arguments

These arguments are for use in the functions `rraf`, `pgen`, and `psex`. They
will replace the dots (...) that appear at the end of the function call. For
example, if you want to set the minor allele frequencies to a specific value
(let's say 0.001), regardless of locus, you can insert `e = 0.001` along with
any other arguments (note that the position of the argument does not matter).
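How arguments travel through the dots can be sketched with two stand-in functions. Both `outer_fun()` and `correct_sketch()` are hypothetical and only mimic the calling pattern; they are not the package's functions.

```r
# Hypothetical inner function that owns the correction arguments
correct_sketch <- function(freqs, e = NULL, d = "sample", mul = 1,
                           sum_to_one = FALSE) {
  # Return the settings it received so the forwarding can be inspected
  list(e = e, d = d, mul = mul, sum_to_one = sum_to_one)
}

# Hypothetical user-facing function: any named argument it does not define
# itself is collected by `...` and handed on unchanged
outer_fun <- function(freqs, log = TRUE, ...) {
  correct_sketch(freqs, ...)
}

freqs  <- c(0.5, 0.5, 0)
first  <- outer_fun(freqs, e = 0.001, log = FALSE)
second <- outer_fun(freqs, log = FALSE, e = 0.001)  # same arguments, other order
```

Since the arguments are matched by name, `first` and `second` hold the same settings, and every argument that was not supplied keeps its default.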

### Author(s)

Zhian N. Kamvar

### References

Arnaud-Haond, S., Duarte, C. M., Alberto, F., & Serrão, E. A. 2007.
Standardizing methods to address clonality in population studies.
*Molecular Ecology*, 16(24), 5115-5139.

Parks, J. C., & Werth, C. R. 1993. A study of spatial features of clones in a
population of bracken fern, *Pteridium aquilinum* (Dennstaedtiaceae).
*American Journal of Botany*, 537-544.

### See Also

`rraf`, `pgen`, `psex`, `rrmlg`

### Examples

```r
## Not run:
library(poppr)  # provides rraf() and the Pram data set
data(Pram)
#-------------------------------------
# If you set correction = FALSE, you'll notice the zero-valued alleles
rraf(Pram, correction = FALSE)
# By default, however, the data will be corrected by 1/n
rraf(Pram)
# Since this is a diploid organism, we might want to use 1/(2n) instead
rraf(Pram, mul = 1/2)
# To set MAF = 1/(2 * mlg)
rraf(Pram, d = "mlg", mul = 1/2)
# Another way to think about this: since these allele frequencies were
# derived at each locus with different sample sizes, it's only appropriate to
# correct based on those sample sizes.
rraf(Pram, d = "rrmlg", mul = 1/2)
# If we were going to use these frequencies for simulations, we might want to
# ensure that they all sum to one.
rraf(Pram, d = "mlg", mul = 1/2, sum_to_one = TRUE)
#-------------------------------------
# When we calculate these frequencies by population, they are heavily
# influenced by the number of observed MLGs.
rraf(Pram, by_pop = TRUE, d = "rrmlg", mul = 1/2)
# This can be avoided by supplying a fixed value for e
rraf(Pram, by_pop = TRUE, e = 0.01)
## End(Not run)
```