calc_fst: FST from SNP data.
In hemstrow/snpR: Whole-Genome Analysis Tools for Use with Single Nucleotide Polymorphism Data

calc_fst

R Documentation

FST from SNP data.

Description

calc_pairwise_fst calculates pairwise FST for each SNP for each possible pairwise combination of populations. calc_global_fst calculates FST for each facet globally across all subfacet levels.

Usage

calc_pairwise_fst(
  x,
  facets,
  method = "wc",
  boot = FALSE,
  boot_par = FALSE,
  zfst = FALSE,
  fst_over_one_minus_fst = FALSE,
  keep_components = FALSE,
  cleanup = TRUE,
  verbose = FALSE
)

calc_global_fst(
  x,
  facets,
  boot = FALSE,
  boot_par = FALSE,
  zfst = FALSE,
  fst_over_one_minus_fst = FALSE,
  keep_components = FALSE,
  verbose = FALSE
)

Arguments

`x`	snpRdata. Input SNP data.
`facets`	character. Categorical metadata variables by which to break up analysis. See `Facets_in_snpR` for more details.
`method`	character, default "wc". Defines the FST estimator to use. Options are "wc" (Weir and Cockerham 1984) and "genepop" (Rousset 2008, which uses the genepop package).
`boot`	numeric or FALSE, default FALSE. The number of bootstraps to do. See details.
`boot_par`	numeric or FALSE, default FALSE. If a number, bootstraps will be processed in parallel using the supplied number of cores.
`zfst`	logical, default FALSE. If TRUE, z-distributed Fst scores (zFST) will be calculated, equal to (fst - mean(fst))/sd(fst) within each group. The resulting values will be in the column "zfst", accessible using the usual `get.snpR.stats` method.
`fst_over_one_minus_fst`	logical, default FALSE. If TRUE, fst/(1-fst) will be calculated, and will be in the column "fst_id" accessible using the usual `get.snpR.stats` method.
`keep_components`	logical, default FALSE. If TRUE, the variance components "a", "b", and "c" will be kept and made accessible from the `$pairwise` element (named "var_comp_a", "var_comp_b", and "var_comp_c", respectively) using the usual `get.snpR.stats` method. This may be useful when working with very large datasets that must be run as separate objects (one per chromosome, for example) to limit memory use. Weighted averages identical to those from snpR can then be generated within each population comparison by dividing the weighted mean of "a" by the sum of the weighted means of "a", "b", and "c" (each computed with `weighted.mean`), using the number of SNPs called in a comparison (returned in the "nk" column from `get.snpR.stats`) as weights. Note that this is different from taking the weighted mean of a/(a + b + c)!
`cleanup`	logical, default TRUE. If TRUE, any new files created during FST calculation will be automatically removed.
`verbose`	logical, default FALSE. If TRUE, some progress updates will be reported.
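
The derived quantities described for `zfst`, `fst_over_one_minus_fst`, and `keep_components` can be sketched in base R as below. This is only an illustration under stated assumptions, not snpR's internal code: the data frame `comps`, its values, and the "comparison" column name are invented for the example, while the "var_comp_a", "var_comp_b", "var_comp_c", "nk", "zfst", and "fst_id" names follow the argument descriptions above.

```r
# Toy per-SNP results for two population comparisons (all values are made up).
comps <- data.frame(
  comparison = rep(c("A~B", "A~C"), each = 3),  # hypothetical column name
  var_comp_a = c(0.020, 0.010, 0.030, 0.050, 0.040, 0.060),
  var_comp_b = c(0.010, 0.020, 0.010, 0.020, 0.010, 0.020),
  var_comp_c = c(0.170, 0.220, 0.160, 0.130, 0.200, 0.120),
  nk         = c(40, 38, 20, 36, 40, 18)
)

# Per-SNP FST from the variance components: a / (a + b + c).
comps$fst <- with(comps, var_comp_a / (var_comp_a + var_comp_b + var_comp_c))

# zFST: (fst - mean(fst)) / sd(fst), computed within each comparison.
comps$zfst <- ave(comps$fst, comps$comparison,
                  FUN = function(f) (f - mean(f)) / sd(f))

# FST / (1 - FST).
comps$fst_id <- comps$fst / (1 - comps$fst)

# Weighted average FST per comparison: the weighted mean of "a" divided by
# the sum of the weighted means of "a", "b", and "c", with "nk" as weights.
ratio_of_means <- sapply(split(comps, comps$comparison), function(d) {
  wa <- weighted.mean(d$var_comp_a, d$nk)
  wb <- weighted.mean(d$var_comp_b, d$nk)
  wc <- weighted.mean(d$var_comp_c, d$nk)
  wa / (wa + wb + wc)
})

# For contrast only: the weighted mean of the per-SNP ratios a / (a + b + c).
mean_of_ratios <- sapply(split(comps, comps$comparison), function(d) {
  weighted.mean(d$fst, d$nk)
})
```

Comparing `ratio_of_means` with `mean_of_ratios` shows that the two averages differ, which is why the components should be averaged before the ratio is taken.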

Details

Calculates FST according to either Weir and Cockerham 1984 or using the Fst function from the genepop package (see references). Genepop is not supported for global FST.

If the genepop option is used, several intermediate files will be created in the default temporary directory (see tempfile).

The Weir and Cockerham (1984) and genepop methods tend to produce very similar results both per SNP and per population. Generally, the former option may be preferred for computational efficiency.
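
For readers unfamiliar with the Weir and Cockerham (1984) estimator, the following is a minimal base R sketch of its variance components for a single allele at one locus. It is meant to show where "a", "b", and "c" come from, and is not the code snpR runs; the function names `wc_components` and `wc_fst` and the input values are hypothetical.

```r
# Weir and Cockerham (1984) variance components for one allele at one locus.
# n: number of diploid individuals sampled in each population
# p: frequency of the focal allele in each population
# h: observed proportion of heterozygotes for that allele in each population
wc_components <- function(n, p, h) {
  r     <- length(n)                                   # number of populations
  n_bar <- mean(n)                                     # average sample size
  n_c   <- (r * n_bar - sum(n^2) / (r * n_bar)) / (r - 1)
  p_bar <- sum(n * p) / (r * n_bar)                    # mean allele frequency
  s2    <- sum(n * (p - p_bar)^2) / ((r - 1) * n_bar)  # variance among populations
  h_bar <- sum(n * h) / (r * n_bar)                    # mean heterozygote frequency

  a <- (n_bar / n_c) *
    (s2 - (p_bar * (1 - p_bar) - (r - 1) / r * s2 - h_bar / 4) / (n_bar - 1))
  b <- (n_bar / (n_bar - 1)) *
    (p_bar * (1 - p_bar) - (r - 1) / r * s2 -
       (2 * n_bar - 1) / (4 * n_bar) * h_bar)
  c_comp <- h_bar / 2

  c(a = a, b = b, c = c_comp)
}

# FST estimate (theta) for a single locus: a / (a + b + c).
wc_fst <- function(n, p, h) {
  comp <- wc_components(n, p, h)
  unname(comp["a"] / sum(comp))
}

# Two populations fixed for alternate alleles: the estimate is 1.
fst_fixed <- wc_fst(n = c(20, 20), p = c(1, 0), h = c(0, 0))

# Two populations with similar allele frequencies: the estimate is close to 0
# (and can be slightly negative).
fst_similar <- wc_fst(n = c(20, 20), p = c(0.50, 0.55), h = c(0.50, 0.495))
```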

P-values for group-level comparisons can be calculated via bootstrapping using the boot argument. Each bootstrap randomly shuffles individuals among the levels of the supplied facet, so the null hypothesis is that all groups are panmictic. P-values are calculated as in randtest, although that function is not directly called.
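
The shuffling procedure described above can be sketched in base R as follows. This is an illustration under assumptions rather than snpR's implementation: the data are simulated, the statistic `group_diff` is a simple stand-in for FST, and the p-value formula is the one-sided "greater" form, which is assumed here to match what randtest reports.

```r
set.seed(1)

# Toy data: allele dosage (0, 1, or 2) at one SNP for 40 individuals
# split evenly between two groups with different allele frequencies.
group  <- rep(c("A", "B"), each = 20)
dosage <- c(rbinom(20, 2, 0.2), rbinom(20, 2, 0.7))

# Stand-in statistic (not FST): absolute difference in mean dosage
# between the two groups.
group_diff <- function(dosage, group) {
  abs(diff(tapply(dosage, group, mean)))
}

observed <- group_diff(dosage, group)

# Shuffle individuals among groups to build the null distribution
# expected if all groups were panmictic.
n_boot <- 999
null <- replicate(n_boot, group_diff(dosage, sample(group)))

# One-sided p-value: the proportion of values (null plus the observed one)
# that are at least as large as the observed statistic.
p_value <- (sum(null >= observed) + 1) / (n_boot + 1)
```

Adding one to both the numerator and denominator counts the observed value as part of the null distribution, so the smallest possible p-value is 1/(n_boot + 1) rather than zero.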

The data can be broken up categorically by SNP and/or sample metadata, as described in Facets_in_snpR. Since this is a pairwise statistic, at least one sample-level facet must be provided.

Method Options:

"wc": Weir and Cockerham 1984.
"genepop": As used in genepop, Rousset 2008.

Value

A snpRdata object with pairwise FST, as well as the total number of observations at each SNP in each comparison, merged into the pairwise.stats slot.

Functions

calc_pairwise_fst(): Calculate FST for each pairwise comparison of subfacet levels.
calc_global_fst(): Calculate FST globally across all subfacet levels.

Author(s)

William Hemstrom

References

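Weir, B. S. and Cockerham, C. C. (1984). Estimating F-statistics for the analysis of population structure. Evolution, 38(6), 1358-1370.

Weir, B. S. (1990). Genetic data analysis. Sinauer, Sunderland, MA.

Rousset, F. (2008). genepop'007: a complete re-implementation of the genepop software for Windows and Linux. Molecular Ecology Resources, 8(1), 103-106.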

Examples

# Using Weir and Cockerham 1984's method
x <- calc_pairwise_fst(stickSNPs, "pop")
get.snpR.stats(x, "pop", "fst")

## Not run: 
# Using genepop
x <- calc_pairwise_fst(stickSNPs, "pop", method = "genepop")
get.snpR.stats(x, "pop", "fst")

# bootstrap p-values for overall pairwise-Fst values
x <- calc_pairwise_fst(stickSNPs, "pop", boot = 5)
get.snpR.stats(x, "pop", "fst")

## End(Not run)

hemstrow/snpR documentation built on July 5, 2025, 4:38 a.m.