calc_seg_sites: Calculate the number of segregating sites.

calc_seg_sites    R Documentation

Calculate the number of segregating sites.

Description

Calculates the number of segregating loci for each facet level, with or without rarefaction to account for differing sample sizes and levels of missing data across populations.

Usage

calc_seg_sites(x, facets = NULL, rarefaction = TRUE, g = 0)

Arguments

x

snpRdata object

facets

facets over which to calculate the number of segregating sites. See Facets_in_snpR for details.

rarefaction

logical, default TRUE. Should the number of segregating sites be estimated via rarefaction? See details.

g

numeric, default 0. If doing rarefaction, controls the number of genotypes to rarefy to. If 0, this will rarefy to the smallest sample size per locus. If g < 0, this will rarefy to the smallest sample size per locus minus the absolute value of g. If g > 0, this will rarefy to g, and any loci where the smallest sample size is less than g will be dropped from the calculation.

Details

Rarefaction is done by determining the probability of drawing at least one copy of each allele in a standardized sample of g genotypes, given the observed genotype frequencies at each locus. Using the observed genotype frequencies ensures that this is not biased by deviations from Hardy-Weinberg equilibrium. These probabilities are then summed across all loci to get the expected number of segregating sites for each population/subfacet.
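
As a rough illustration of the calculation described above, the sketch below assumes biallelic loci and that the g genotypes are drawn without replacement from those observed. Under those assumptions, a draw fails to contain both alleles only when every sampled genotype is the same homozygote. The function name p_seg_rarefied and the genotype counts are hypothetical, and snpR's internal implementation may differ.

```r
# Hypothetical helper, not part of snpR: probability that a biallelic locus
# appears segregating in a random draw of g genotypes (without replacement).
p_seg_rarefied <- function(n_AA, n_Aa, n_aa, g) {
  n <- n_AA + n_Aa + n_aa
  stopifnot(g >= 1, g <= n)
  # Both alleles are missing from the draw only if all g sampled genotypes
  # are the same homozygote; the two cases are mutually exclusive.
  p_fixed <- (choose(n_AA, g) + choose(n_aa, g)) / choose(n, g)
  1 - p_fixed
}

# Made-up genotype counts at three loci in one population
counts <- data.frame(n_AA = c(8, 10, 3),
                     n_Aa = c(2, 0, 4),
                     n_aa = c(0, 0, 3))
g <- 5

# Per-locus probabilities, then their sum: the expected number of
# segregating sites in a sample of g genotypes
p <- with(counts, p_seg_rarefied(n_AA, n_Aa, n_aa, g))
expected_seg_sites <- sum(p)
```

Here the second locus is fixed in the observed data, so it contributes zero, while the third locus has too few copies of either homozygote to ever look fixed in a draw of five genotypes, so it contributes one.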

Note that g will vary across loci due to differences in sequencing coverage: at each locus, it is equal to the smallest number of genotypes sequenced in any population at that locus minus one.

Note that if no sample-specific facet is requested, rarefaction will not be used.
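
The three cases for the g argument can be sketched as follows. This is only an illustration of the rules stated under Arguments, not snpR's own code: resolve_g is a hypothetical helper, min_n is an assumed vector holding the smallest per-population sample size at each locus, and dropped loci are represented here as NA.

```r
# Hypothetical helper, not part of snpR: per-locus rarefaction target
# implied by the g argument.
resolve_g <- function(min_n, g = 0) {
  if (g == 0) {
    # rarefy to the smallest sample size at each locus
    return(min_n)
  }
  if (g < 0) {
    # rarefy to the smallest sample size minus |g|
    return(min_n - abs(g))
  }
  # g > 0: fixed target; loci with too few genotypes are dropped (NA here)
  ifelse(min_n < g, NA_real_, g)
}

# Made-up smallest sample sizes at three loci
min_n <- c(10, 4, 7)
g_default  <- resolve_g(min_n)        # g = 0
g_negative <- resolve_g(min_n, -1)    # g < 0
g_positive <- resolve_g(min_n, 5)     # g > 0
```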

Value

A snpRdata object with seg_sites merged into the weighted.means slot.

Author(s)

William Hemstrom

Examples

# base facet
x <- calc_seg_sites(stickSNPs, rarefaction = FALSE)
get.snpR.stats(x, stats = "seg_sites")$weighted.means

# multiple facets
x <- calc_seg_sites(stickSNPs, c("pop", "fam"))
get.snpR.stats(x, c("pop", "fam"), 
               stats = "seg_sites")$weighted.means

hemstrow/snpR documentation built on March 20, 2024, 7:03 a.m.