pop_het_obs: Compute the population observed heterozygosity

pop_het_obs    R Documentation

Compute the population observed heterozygosity

Description

This function computes the observed heterozygosity of each population, using the formula of Nei (1987).

Usage

pop_het_obs(
  .x,
  by_locus = FALSE,
  include_global = FALSE,
  n_cores = bigstatsr::nb_cores()
)

Arguments

.x

a gen_tibble, usually grouped with dplyr::group_by(). If the tibble is not grouped, all individuals are treated as belonging to a single population.

by_locus

boolean, determining whether Ho should be returned by locus (TRUE) or as a single genome-wide value (FALSE, the default).

include_global

boolean, determining whether a global estimate should be appended to the population-specific estimates (FALSE, the default, returns only the population-specific estimates). If TRUE, the function returns a vector of length n+1 (the n populations plus the global value), or a matrix with n+1 columns if by_locus=TRUE.

n_cores

number of cores to be used; defaults to bigstatsr::nb_cores().

Details

Within-population observed heterozygosity \hat{h}_o for a locus with m alleles is defined as:
\hat{h}_o = 1 - \sum_{k=1}^{s} \sum_{i=1}^{m} \hat{X}_{kii}/s
where
\hat{X}_{kii} is the proportion of individuals homozygous for allele i in the sample from the kth population, and
s is the number of populations,
following equation 7.38 in Nei (1987), p. 164. In our specific case, there are only two alleles, so m=2. For population-specific estimates, the sum is taken over a single value of k (so s=1). \hat{h}_o at the genome level is simply the mean of the locus estimates.
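
The definition above can be illustrated with a minimal base R sketch. It does not use tidypopgen and is not the package's implementation; the toy genotype matrix geno, the labels in pop, and the helper het_obs_locus() are hypothetical names introduced only for this illustration. It assumes biallelic loci coded as alternate-allele dosages (0, 1, 2) and that missing genotypes are simply excluded from the proportions.

```r
# Toy genotype matrix: rows are individuals, columns are biallelic loci.
# Values are allele dosages (0 or 2 = homozygote, 1 = heterozygote);
# NA marks a missing genotype.
geno <- matrix(
  c(0, 1, 2,
    1, 1, 0,
    2, 0, NA,
    1, 2, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4), paste0("locus", 1:3))
)
pop <- c("A", "A", "B", "B")  # hypothetical population labels

# Per-locus h_o for a single population (s = 1):
# 1 minus the proportion of homozygotes among non-missing genotypes.
# Dropping NAs is an assumption of this sketch.
het_obs_locus <- function(g) {
  1 - colMeans(g == 0 | g == 2, na.rm = TRUE)
}

# Matrix of locus estimates: loci in rows, populations in columns
ho_by_locus <- sapply(
  split(seq_len(nrow(geno)), pop),
  function(rows) het_obs_locus(geno[rows, , drop = FALSE])
)

# Genome-level value per population: mean of the locus estimates
ho_genome <- colMeans(ho_by_locus, na.rm = TRUE)

# Global per-locus value read directly off the formula: the homozygote
# proportions are averaged over the s populations
ho_global_by_locus <- rowMeans(ho_by_locus, na.rm = TRUE)
```

With m=2 the inner sum is just the proportion of the two homozygous genotypes, so each per-locus estimate equals the proportion of heterozygous individuals in that population.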

Value

a vector of mean observed heterozygosities, one per population (if by_locus=FALSE), or a matrix of estimates by locus (if by_locus=TRUE; rows are loci, columns are populations)

References

Nei, M. (1987). Molecular Evolutionary Genetics. Columbia University Press.

Examples



example_gt <- load_example_gt("grouped_gen_tbl")

# Compute observed heterozygosity
example_gt %>% pop_het_obs()

# To include the global observed heterozygosity, set include_global = TRUE
example_gt %>% pop_het_obs(include_global = TRUE)

# To return by locus, set by_locus = TRUE
example_gt %>% pop_het_obs(by_locus = TRUE)


tidypopgen documentation built on Aug. 28, 2025, 1:08 a.m.