EBFST: Empirical Bayes estimator of Fst.

EBFST R Documentation

Description

This function estimates global and pairwise Fst among subpopulations using the empirical Bayes method (Kitada et al. 2007, 2017). The precision of the estimated pairwise Fst is evaluated by the bootstrap method. The function accepts two types of data object: GENEPOP data (Rousset 2008) and allele (haplotype) frequency data (Kitada et al. 2007). Missing genotype values in the GENEPOP file ("0000" or "000000") are ignored.

Usage

EBFST(popdata, num.iter = 100, locus = F)

Arguments

popdata

Genotype data object of populations created by the read.genepop function from a GENEPOP file. An allele (haplotype) frequency data object created by the read.frequency function from a frequency format file is also accepted.

num.iter

A positive integer specifying the number of iterations in the empirical Bayes simulation.

locus

A logical value indicating whether locus-specific Fst values should be calculated.

Details

The frequency format file is a plain text file containing allele (haplotype) count data. This format is intended mainly for mitochondrial DNA (mtDNA) haplotype frequency data, but nuclear DNA (nDNA) data can also be used. In the data object created by the read.frequency function, "number of samples" means the haplotype count. It therefore equals the number of individuals for mtDNA data, but twice the number of individuals for nDNA data. The first part of the frequency format file is the number of subpopulations, the second part is the number of loci, and the remaining parts are [population x allele] matrices of the observed allele (haplotype) counts at each locus. Two example frequency format files are included in this package; see jsmackerel.
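
As a small illustration of the sample-size convention described above, the following base R sketch builds a made-up [population x allele] count matrix and derives the haplotype count and the implied number of individuals. The matrix values and the names counts, n_haplotypes, and individuals_from_haplotypes are hypothetical and are not part of FinePop.

```r
# Hypothetical [population x allele] count matrix for one locus
# (values are invented for illustration only).
counts <- matrix(
  c(12,  8,  0,
     5, 10,  5),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("pop1", "pop2"), c("A1", "A2", "A3"))
)

# "Number of samples" in the sense used above: haplotype count per population.
n_haplotypes <- rowSums(counts)

# Convert a haplotype count to a number of individuals.
# ploidy = 1 for mtDNA (haploid), ploidy = 2 for nDNA (diploid).
individuals_from_haplotypes <- function(n, ploidy = 1) {
  stopifnot(ploidy %in% c(1, 2))
  n / ploidy
}

n_ind_mtdna <- individuals_from_haplotypes(n_haplotypes, ploidy = 1)
n_ind_ndna  <- individuals_from_haplotypes(n_haplotypes, ploidy = 2)
```

With these invented counts, the same matrix corresponds to half as many individuals when read as nDNA data as when read as mtDNA data.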

Value

global:

theta

Estimated gene flow rate.

fst

Estimated genome-wide global Fst.

fst.locus

Estimated locus-specific global Fst. (If locus = TRUE)

pairwise:

fst

Estimated genome-wide pairwise Fst.

fst.boot

Bootstrap mean of estimated Fst.

fst.boot.sd

Bootstrap standard deviation of estimated Fst.

fst.locus

Estimated locus-specific pairwise Fst. (If locus = TRUE)
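
For reporting, the pairwise element can be flattened into one row per population pair. The sketch below is only an illustration under stated assumptions: it assumes that fst, fst.boot, and fst.boot.sd are square population-by-population matrices (the Examples section passes fst to as.dist, which is consistent with this), and it runs on a hand-built mock list with invented values rather than on real EBFST output. The helper name pairwise_table is hypothetical.

```r
# Mock object mimicking the documented `pairwise` element.
# All numbers are invented; real values come from EBFST().
pops <- c("popA", "popB", "popC")
make_sym <- function(v) {
  m <- matrix(0, 3, 3, dimnames = list(pops, pops))
  m[lower.tri(m)] <- v   # fill the lower triangle column by column
  m + t(m)               # mirror to obtain a symmetric matrix
}
mock_pairwise <- list(
  fst         = make_sym(c(0.010, 0.020, 0.015)),
  fst.boot    = make_sym(c(0.011, 0.021, 0.014)),
  fst.boot.sd = make_sym(c(0.002, 0.003, 0.002))
)

# Hypothetical helper: one row per unordered population pair.
pairwise_table <- function(pw) {
  idx <- which(lower.tri(pw$fst), arr.ind = TRUE)
  data.frame(
    pop1        = rownames(pw$fst)[idx[, "col"]],
    pop2        = rownames(pw$fst)[idx[, "row"]],
    fst         = pw$fst[idx],
    fst.boot    = pw$fst.boot[idx],
    fst.boot.sd = pw$fst.boot.sd[idx],
    stringsAsFactors = FALSE
  )
}

tab <- pairwise_table(mock_pairwise)
print(tab)
```

If the assumption about the matrix layout holds, passing result.eb$pairwise from a real run in place of mock_pairwise should give the same kind of table.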

Author(s)

Reiichiro Nakamichi, Hirohisa Kishino, Shuichi Kitada

References

Kitada S, Kitakado T, Kishino H (2007) Empirical Bayes inference of pairwise FST and its distribution in the genome. Genetics, 177, 861-873.

Kitada S, Nakamichi R, Kishino H (2017) The empirical Bayes estimators of fine-scale population structure in high gene flow species. Mol. Ecol. Resources, DOI: 10.1111/1755-0998.12663

Rousset F (2008) Genepop'007: a complete reimplementation of the Genepop software for Windows and Linux. Mol. Ecol. Resources, 8, 103-106.

See Also

read.genepop, read.frequency, as.dist, as.dendrogram, hclust, cmdscale, nj

Examples

# Example of GENEPOP file
library(FinePop)
data(jsmackerel)
jsm.ms.genepop.file <- tempfile()
jsm.popname.file <- tempfile()
cat(jsmackerel$MS.genepop, file=jsm.ms.genepop.file, sep="\n")
cat(jsmackerel$popname, file=jsm.popname.file, sep=" ")

# Data load
# To use your own data, place your GENEPOP file and population name file
# in the working directory and replace jsm.ms.genepop.file and
# jsm.popname.file with your file names.
popdata <- read.genepop(genepop=jsm.ms.genepop.file, popname=jsm.popname.file)

# Fst estimation
result.eb <- EBFST(popdata)
ebfst <- result.eb$pairwise$fst
ebfst.d <- as.dist(ebfst)
print(ebfst.d)

# dendrogram
ebfst.hc <- hclust(ebfst.d,method="average")
plot(as.dendrogram(ebfst.hc), xlab="",ylab="",main="", las=1)

# MDS plot
mds <- cmdscale(ebfst.d)
plot(mds, type="n", xlab="",ylab="")
text(mds[,1],mds[,2], popdata$pop_names)

# NJ tree
library(ape)
ebfst.nj <- nj(ebfst.d)
plot(ebfst.nj,type="u",main="",sub="")

FinePop documentation built on May 29, 2024, 4:28 a.m.