geno_freq_phased    R Documentation

Calculate phased genotype frequencies from allele frequencies, assuming Hardy-Weinberg equilibrium

Description

Calculates the population frequencies of the phased genotypes at a single autosomal genetic locus, given the allele frequencies at the locus and assuming Hardy-Weinberg equilibrium. Phased genotypes can be used to investigate parent-of-origin effects; for an example, see van Vliet et al. (2011).

Usage

geno_freq_phased(p_alleles, annotate = FALSE)

Arguments

p_alleles

A vector of strictly positive numbers that sum to 1, with p_alleles[i] interpreted as the allele frequency of the ith allele of the genetic locus. When annotate is TRUE, the names of the alleles will be taken to be names(p_alleles) or, if names(p_alleles) is NULL, to be 1:length(p_alleles).

annotate

A logical flag. When FALSE (the default), the function returns a vector suitable to be used as the geno_freq argument of pedigree_loglikelihood. When TRUE, the function adds a names attribute to this vector to indicate which element corresponds to which phased genotype.

Details

For a genetic locus that is at Hardy-Weinberg equilibrium in a particular population, the population allele frequencies at the locus determine the population genotype frequencies; see Sections 1.2 and 1.3 of Lange (2002) for the unphased version of this law. When a genetic locus is at Hardy-Weinberg equilibrium, the maternal and paternal alleles of a random person from the population are independent. A phased genotype at a genetic locus is an ordered pair consisting of a maternal allele and a paternal allele at the locus. Each heterozygous unphased genotype therefore corresponds to two phased genotypes, and these two phased genotypes have equal frequencies under Hardy-Weinberg equilibrium.
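
The independence argument above can be written out as a short worked derivation. Here p_a denotes the frequency of allele a, and a|b denotes the phased genotype with maternal allele a and paternal allele b; the symbol Pr is introduced for illustration and does not appear in the package.

```latex
% Independence of maternal and paternal alleles under Hardy-Weinberg equilibrium
\Pr(a \mid b) = p_a \, p_b

% A heterozygous unphased genotype {a, b} with a \neq b combines two phased genotypes
\Pr(\{a, b\}) = \Pr(a \mid b) + \Pr(b \mid a) = 2 \, p_a \, p_b

% The phased genotype frequencies sum to 1 because the allele frequencies do
\sum_{a} \sum_{b} p_a \, p_b = \Bigl( \sum_{a} p_a \Bigr)^{2} = 1
```

For instance, with allele frequencies 0.9 and 0.1, the four phased genotypes 1|1, 1|2, 2|1 and 2|2 have frequencies 0.81, 0.09, 0.09 and 0.01, and the unphased heterozygote has frequency 0.18.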

Given a vector p_alleles containing the allele frequencies, this function returns the frequencies of the possible phased genotypes, in a particular order that can be viewed by setting annotate to TRUE. If the alleles are named 1:length(p_alleles), so that p_alleles[i] is the frequency of allele i, then the phased genotypes are of the form 1|1, 1|2, ..., where a|b means the maternal allele is a and the paternal allele is b. Note that if the output of this function is to be used as the geno_freq argument of pedigree_loglikelihood then the annotate option must be set to FALSE.
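
As a rough illustration of the calculation described above, the following base R sketch computes the products of allele frequencies and labels them in the form a|b. The helper name phased_freq_sketch is hypothetical and is not part of clipp, and the ordering it produces (maternal allele varying slowest) is an assumption based on the 1|1, 1|2, ... pattern described here. Use geno_freq_phased itself, with annotate = TRUE, to confirm the ordering that pedigree_loglikelihood expects.

```r
# Hypothetical helper for illustration only; not the clipp implementation.
phased_freq_sketch <- function(p) {
  # Basic checks mirroring the documented requirements on p_alleles
  stopifnot(is.numeric(p), all(p > 0), isTRUE(all.equal(sum(p), 1)))

  n <- length(p)
  # Allele names default to 1, 2, ..., n when p has no names
  alleles <- if (is.null(names(p))) as.character(seq_len(n)) else names(p)

  # outer(p, p)[i, j] = p[i] * p[j]: maternal allele i, paternal allele j.
  # Transposing before flattening gives the order 1|1, 1|2, ..., 2|1, ...
  # (assumed ordering, see the lead-in above).
  freq <- as.vector(t(outer(p, p)))
  names(freq) <- as.vector(t(outer(alleles, alleles, paste, sep = "|")))
  freq
}

# Biallelic locus with a minor allele frequency of 10%
print(phased_freq_sketch(c(0.9, 0.1)))
```

The two heterozygous entries, 1|2 and 2|1, come out equal, as expected under Hardy-Weinberg equilibrium.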

Value

A vector of strictly positive numbers (the genotype frequencies) that sum to 1, named with the genotype names if annotate is TRUE.

References

Lange K. Mathematical and Statistical Methods for Genetic Analysis (second edition). Springer, New York. 2002.

van Vliet CM, Dowty JG, van Vliet JL, et al. Dependence of colorectal cancer risk on the parent-of-origin of mutations in DNA mismatch repair genes. Hum Mutat. 2011;32(2):207-212.

Examples

# Genotype frequencies for a biallelic locus at Hardy-Weinberg equilibrium
# and with a minor allele frequency of 10%
p_alleles <- c(0.9, 0.1)
geno_freq_phased(p_alleles, annotate = TRUE)

# Genotype frequencies for a triallelic locus at Hardy-Weinberg equilibrium
p_alleles <- c(0.85, 0.1, 0.05)
geno_freq_phased(p_alleles, annotate = TRUE)
# The phased genotype frequencies should sum to 1
sum(geno_freq_phased(p_alleles))


clipp documentation built on July 12, 2022, 9:05 a.m.