exp_and_obs_geno_freqs: Computed expected and observed genotype frequencies from a...


View source: R/exp_and_obs_geno_freqs.R

Description

Under the assumption of Hardy-Weinberg equilibrium, this function uses the allele frequencies estimated from the data set in v (or d012) to compute expected genotype frequencies, and then reports these alongside the observed genotype frequencies. Loci are named as CHROM–POS in the output.
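For reference, the textbook Hardy-Weinberg expectations for a biallelic locus can be written as below. The symbols are notation introduced here rather than taken from the package: n_0, n_1 and n_2 are the observed counts of genotypes 0, 1 and 2, and p-hat is the estimated frequency of the alternate allele.

```latex
\hat{p} = \frac{n_1 + 2\,n_2}{2\,(n_0 + n_1 + n_2)}, \qquad
P(0) = (1 - \hat{p})^2, \quad
P(1) = 2\,\hat{p}\,(1 - \hat{p}), \quad
P(2) = \hat{p}^2
```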

Usage

exp_and_obs_geno_freqs(
  v = NULL,
  d012 = NULL,
  prop_indv_required = 0.5,
  prop_loci_required = 0.5
)

Arguments

v

a ‘vcfR’ object. Supply exactly one of v or d012, not both.

d012

an integer matrix (or a numeric matrix, which will be coerced to integer type) with individuals in columns and markers in rows. 0 denotes a genotype homozygous for the reference allele, 1 a heterozygote, 2 a homozygote for the alternate allele, and -1 missing data. Column (sample) names are not required and are ignored if present. The matrix must, however, have rownames in the format CHROM–POS (i.e. the "chromosome" name (or the "contig" name) followed by a "–" followed by the position of the marker in the "chromosome"). Supply exactly one of v or d012, not both.
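As a rough illustration of the layout described above, a small 012 matrix could be built by hand as follows. The contig name, positions and genotype values are invented, and the two-hyphen separator "--" is an assumption about how the dash in CHROM–POS is spelled in plain text; check the package source before relying on it.

```r
# Toy 012 matrix: 3 markers (rows) by 4 individuals (columns).
# ASSUMPTION: "--" is the plain-text spelling of the separator in the
# CHROM-POS row names; the contig name and positions are made up.
d012_toy <- matrix(
  c(0,  1, 2, -1,
    1,  1, 0,  2,
    2, -1, 0,  0),
  nrow = 3, byrow = TRUE
)
rownames(d012_toy) <- paste("contig_1", c(101, 250, 777), sep = "--")

# A numeric matrix would be coerced anyway; do it explicitly here
storage.mode(d012_toy) <- "integer"

# Sanity checks on the layout: integer type, row names, allowed codes only
stopifnot(
  is.integer(d012_toy),
  !is.null(rownames(d012_toy)),
  all(d012_toy %in% c(-1L, 0L, 1L, 2L))
)
```

Column names are left unset on purpose, since they are not required.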

prop_indv_required

a locus will be dropped if the proportion of individuals with non-missing data at that locus is less than prop_indv_required. Default is 0.5.

prop_loci_required

an individual will be dropped if its proportion of non-missing loci is less than prop_loci_required. Default is 0.5.
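The two thresholds can be pictured with a base-R sketch on a toy matrix. This is not the package's code: the locus and sample names are invented, and it assumes both proportions are computed on the unfiltered matrix, which may differ from the order in which the function applies the filters.

```r
# Toy 012 matrix: markers in rows, individuals in columns, -1 = missing.
g <- matrix(
  c( 0,  1,  2, -1,
    -1, -1, -1,  2,
     2,  0,  1, -1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("locA", "locB", "locC"), paste0("ind", 1:4))
)

prop_indv_required <- 0.5
prop_loci_required <- 0.5

# Proportion of individuals with non-missing data at each locus (rows)
prop_indv_typed <- rowMeans(g != -1)
# Proportion of non-missing loci for each individual (columns)
prop_loci_typed <- colMeans(g != -1)

# ASSUMPTION: both filters use proportions from the unfiltered matrix
g_kept <- g[prop_indv_typed >= prop_indv_required,
            prop_loci_typed >= prop_loci_required,
            drop = FALSE]
```

Here locB (typed in 1 of 4 individuals) and ind4 (typed at 1 of 3 loci) fall below 0.5 and are dropped.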

Value

Returns a tibble with the following columns:
snp = the locus name as CHROM–POS
p = the frequency of the alternate (ALT) allele
ntot = the total number of individuals with no missing data at the locus
geno = which genotype (0, 1, or 2) the row refers to
p_exp = expected frequency of the genotype
p_obs = observed frequency of the genotype
n_exp = expected number of such genotypes
n_obs = observed number of such genotypes
z_score = simple statistic giving how far the observed genotype frequency is from that expected under Hardy-Weinberg equilibrium
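To make the columns concrete, the following base-R sketch computes them for a single hypothetical locus. It uses a plain data.frame rather than a tibble so that it needs no packages, and the z-score shown is a binomial-style standardisation chosen for illustration; the exact formula the package uses is not stated here and may differ.

```r
# Genotypes at one hypothetical locus (0/1/2 = copies of ALT, -1 = missing)
geno_calls <- c(0, 0, 1, 1, 1, 2, 0, 1, -1, 2)

typed <- geno_calls[geno_calls != -1]
ntot  <- length(typed)               # individuals with data at the locus
p     <- sum(typed) / (2 * ntot)     # estimated ALT allele frequency

# Observed counts and Hardy-Weinberg expected frequencies for 0, 1, 2
n_obs <- as.integer(table(factor(typed, levels = 0:2)))
p_exp <- c((1 - p)^2, 2 * p * (1 - p), p^2)

res <- data.frame(
  geno  = 0:2,
  p     = p,
  ntot  = ntot,
  p_exp = p_exp,
  p_obs = n_obs / ntot,
  n_exp = p_exp * ntot,
  n_obs = n_obs
)

# ASSUMPTION: binomial-style z; the package's own statistic may differ
res$z_score <- (res$n_obs - res$n_exp) / sqrt(ntot * p_exp * (1 - p_exp))
print(res)
```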

Examples

eao <- exp_and_obs_geno_freqs(v = lobster_buz_2000)

# if you wanted to run that on an 012 matrix,
# it would be like this:
eao012 <- exp_and_obs_geno_freqs(d012 = lobster_buz_2000_as_012_matrix)

whoa documentation built on Aug. 11, 2021, 9:06 a.m.