calc_fst: FST from SNP data.

calc_fst R Documentation

FST from SNP data.

Description

calc_pairwise_fst calculates FST at each SNP for every possible pairwise combination of populations. calc_global_fst calculates FST for each facet globally across all subfacet levels.

Usage

calc_pairwise_fst(
  x,
  facets,
  method = "wc",
  boot = FALSE,
  boot_par = FALSE,
  zfst = FALSE,
  fst_over_one_minus_fst = FALSE,
  keep_components = FALSE,
  cleanup = TRUE,
  verbose = FALSE
)

calc_global_fst(
  x,
  facets,
  boot = FALSE,
  boot_par = FALSE,
  zfst = FALSE,
  fst_over_one_minus_fst = FALSE,
  keep_components = FALSE,
  verbose = FALSE
)

Arguments

x

snpRdata. Input SNP data.

facets

character. Categorical metadata variables by which to break up analysis. See Facets_in_snpR for more details.

method

character, default "wc". Defines the FST estimator to use. Options:

  • wc: Weir and Cockerham (1984).

  • genepop: Rousset (2008), uses the genepop package.

boot

numeric or FALSE, default FALSE. The number of bootstraps to do. See details.

boot_par

numeric or FALSE, default FALSE. If a number, bootstraps will be processed in parallel using the supplied number of cores.

zfst

logical, default FALSE. If TRUE, z-distributed FST scores (zFST) will be calculated, equal to (fst - mean(fst))/sd(fst) within each group. The resulting values will be in the column "zfst", accessible using the usual get.snpR.stats method.

fst_over_one_minus_fst

logical, default FALSE. If TRUE, fst/(1-fst) will be calculated, and will be in the column "fst_id" accessible using the usual get.snpR.stats method.
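The two transformations described for zfst and fst_over_one_minus_fst can be sketched in base R as follows. This is only an illustration on a made-up table, not snpR's internal code: the data frame, its "comparison" column, and the FST values are hypothetical, and "within each group" is assumed here to mean within each pairwise comparison.

```r
# Toy per-SNP FST values for two hypothetical comparisons (not snpR output).
fst_tab <- data.frame(
  comparison = rep(c("A~B", "A~C"), each = 4),
  fst = c(0.02, 0.10, 0.05, 0.31, 0.00, 0.04, 0.12, 0.08)
)

# zFST: centre and scale FST within each comparison.
fst_tab$zfst <- ave(fst_tab$fst, fst_tab$comparison,
                    FUN = function(v) (v - mean(v)) / sd(v))

# FST / (1 - FST), stored under the column name used by snpR ("fst_id").
fst_tab$fst_id <- fst_tab$fst / (1 - fst_tab$fst)

print(fst_tab)
```

Within each comparison the zfst column should then have a mean of zero and a standard deviation of one.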

keep_components

logical, default FALSE. If TRUE, the variance components "a", "b", and "c" will be kept and accessible from the $pairwise element (named "var_comp_a", "var_comp_b", and "var_comp_c", respectively) using the usual get.snpR.stats method. This may be useful when working with very large datasets that must be run as separate objects (one per chromosome, for example) to conserve memory. Weighted averages identical to those from snpR can be generated within each population comparison by dividing the weighted mean (via weighted.mean) of "a" by the sum of the weighted means of "a", "b", and "c", using the number of SNPs called in a comparison (returned in the "nk" column from get.snpR.stats) as weights. Note that this is different from taking the weighted mean of a/(a + b + c)!
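As a rough sketch of the recombination described for keep_components, the base R code below pools variance components from two separately processed chromosomes and computes the ratio of weighted means. The numbers are invented, the "comparison" column name is an assumption, and the tables merely stand in for what would be pulled from the $pairwise element of each object.

```r
# Hypothetical per-SNP variance components for one comparison, split across
# two chromosomes that were processed as separate objects.
chr1 <- data.frame(comparison = "A~B",
                   var_comp_a = c(0.010, 0.002),
                   var_comp_b = c(0.020, 0.030),
                   var_comp_c = c(0.170, 0.210),
                   nk = c(40, 36))
chr2 <- data.frame(comparison = "A~B",
                   var_comp_a = 0.030,
                   var_comp_b = 0.010,
                   var_comp_c = 0.160,
                   nk = 20)
comps <- rbind(chr1, chr2)

# Ratio of weighted means, computed within each comparison.
weighted_fst <- sapply(split(comps, comps$comparison), function(d) {
  wa <- weighted.mean(d$var_comp_a, d$nk)
  wb <- weighted.mean(d$var_comp_b, d$nk)
  wc <- weighted.mean(d$var_comp_c, d$nk)
  wa / (wa + wb + wc)
})

# For contrast: the weighted mean of per-SNP ratios, which is NOT equivalent.
mean_of_ratios <- with(comps, weighted.mean(
  var_comp_a / (var_comp_a + var_comp_b + var_comp_c), nk))

print(weighted_fst)
print(mean_of_ratios)
```

The two printed values differ, which is the distinction the note about a/(a + b + c) warns about.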

cleanup

logical, default TRUE. If TRUE, any new files created during FST calculation will be automatically removed.

verbose

logical, default FALSE. If TRUE, some progress updates will be reported.

Details

Calculates FST according to either Weir and Cockerham 1984 or using the Fst function from the genepop package (see references). Genepop is not supported for global FST.

If the genepop option is used, several intermediate files will be created in the default temporary directory (see tempfile).

The Weir and Cockerham (1984) and genepop methods tend to produce very similar results both per SNP and per population. Generally, the former option may be preferred for computational efficiency.

P-values for group-level comparisons can be calculated via bootstrapping using the boot option. Bootstraps are performed by randomly shuffling individuals among the levels of the supplied facet, so the null hypothesis is that all groups are panmictic. P-values are calculated according to randtest, although that function is not directly called.
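The shuffling scheme can be illustrated with a small base R sketch. It is not snpR's implementation: the genotypes are invented, the statistic is a simple allele-frequency difference standing in for FST, and the p-value formula assumes the one-sided "greater" form in which the observed value is counted among the permuted values.

```r
set.seed(42)

# Toy data: alternate-allele counts (0, 1, or 2) at one SNP for 20 individuals.
geno <- c(rep(2, 6), rep(1, 4), rep(1, 3), rep(0, 7))
pop  <- rep(c("A", "B"), each = 10)

# Stand-in statistic (NOT FST): absolute allele-frequency difference.
freq_diff <- function(g, p) abs(mean(g[p == "A"]) - mean(g[p == "B"])) / 2

observed <- freq_diff(geno, pop)

# Null distribution: shuffle population labels among individuals.
n_boot <- 99
null_vals <- replicate(n_boot, freq_diff(geno, sample(pop)))

# One-sided p-value, counting the observed value as one member of the null set.
p_value <- (sum(null_vals >= observed) + 1) / (n_boot + 1)

print(observed)
print(p_value)
```

Under this formula the smallest attainable p-value is 1/(n_boot + 1), so the number of bootstraps sets the resolution of the test.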

The data can be broken up categorically by SNP and/or sample metadata, as described in Facets_in_snpR. Since this is a pairwise statistic, at least one sample-level facet must be provided.

Method Options:

  • "wc": Weir and Cockerham 1984.

  • "genepop": As used in genepop, Rousset 2008.

Value

A snpRdata object with pairwise FST, as well as the total number of observations at each SNP in each comparison, merged into the pairwise.stats slot.

Functions

  • calc_pairwise_fst(): Calculate FST for each pairwise combination of subfacet levels.

  • calc_global_fst(): Calculate FST globally across all subfacet levels.

Author(s)

William Hemstrom

References

Weir and Cockerham (1984). Evolution

Weir (1990). Genetic data analysis. Sinauer, Sunderland, MA

Rousset (2008). Molecular Ecology Resources

Examples

# Using Weir and Cockerham 1984's method
x <- calc_pairwise_fst(stickSNPs, "pop")
get.snpR.stats(x, "pop", "fst")

## Not run: 
# Using genepop
x <- calc_pairwise_fst(stickSNPs, "pop", "genepop")
get.snpR.stats(x, "pop", "fst")

# bootstrap p-values for overall pairwise-Fst values
x <- calc_pairwise_fst(stickSNPs, "pop", boot = 5)
get.snpR.stats(x, "pop", "fst")

## End(Not run)

hemstrow/snpR documentation built on March 20, 2024, 7:03 a.m.