snpgdsFst: F-statistics (fixation indices)


View source: R/IBD.R

Description

Calculate relatedness measures, F-statistics (also known as fixation indices), for the given populations.

Usage

snpgdsFst(gdsobj, population, method=c("W&C84", "W&H02"), sample.id=NULL,
    snp.id=NULL, autosome.only=TRUE, remove.monosnp=TRUE, maf=NaN,
    missing.rate=NaN, with.id=FALSE, verbose=TRUE)

Arguments

gdsobj

an object of class SNPGDSFileClass, a SNP GDS file

population

a factor, indicating population information for each individual

method

"W&C84" – Fst estimator in Weir & Cockerham 1984 (by default), "W&H02" – relative beta estimator in Weir & Hill 2002, see details

sample.id

a vector of sample id specifying selected samples; if NULL, all samples are used

snp.id

a vector of snp id specifying selected SNPs; if NULL, all SNPs are used

autosome.only

if TRUE, use autosomal SNPs only; if it is a numeric or character value, keep SNPs according to the specified chromosome

remove.monosnp

if TRUE, remove monomorphic SNPs

maf

use only the SNPs with minor allele frequency ">= maf"; if NaN, no MAF threshold is applied

missing.rate

use only the SNPs with missing rate "<= missing.rate"; if NaN, no missing-rate threshold is applied

with.id

if TRUE, the returned value includes sample.id and snp.id

verbose

if TRUE, show information
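The arguments above can be combined to restrict the analysis to a subset of samples and SNPs. The following is a minimal sketch, not a recommended protocol: the two populations chosen and the 0.05 thresholds are illustrative assumptions, and the population factor is built after subsetting so that it has one entry per selected sample and no empty levels.

```r
library(SNPRelate)

genofile <- snpgdsOpen(snpgdsExampleFileName())

# sample IDs and population labels, both in GDS file order
samp <- read.gdsn(index.gdsn(genofile, "sample.id"))
pop  <- read.gdsn(index.gdsn(genofile, "sample.annot/pop.group"))

# keep two populations (illustrative choice); the same logical flag
# subsets the IDs and the labels so they stay aligned
flag <- pop %in% c("HCB", "JPT")
samp.sel <- samp[flag]
pop.sel  <- as.factor(pop[flag])

# thresholds of 0.05 are assumptions chosen for illustration only
v <- snpgdsFst(genofile, population=pop.sel, sample.id=samp.sel,
    method="W&C84", maf=0.05, missing.rate=0.05, with.id=TRUE,
    verbose=FALSE)

v$Fst
length(v$sample.id)   # number of samples used
length(v$snp.id)      # number of SNPs left after filtering

snpgdsClose(genofile)
```

With with.id=TRUE, the returned sample.id and snp.id make it possible to check exactly which samples and SNPs survived the filters.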

Details

The minor allele frequency and missing rate for each SNP passed in snp.id are calculated over all the samples in sample.id.
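To make the filtering step concrete, here is a small base R sketch of how a per-SNP minor allele frequency and missing rate can be computed over a set of samples. The genotype matrix is hypothetical toy data with NA for missing calls; it does not use SNPRelate and is not claimed to mirror its internal code.

```r
# toy genotype matrix (hypothetical data): rows = samples, columns = SNPs,
# entries = number of copies of one allele (0, 1, 2), NA = missing call
geno <- matrix(c(
    0, 1,  2,
    1, 1,  NA,
    2, 0,  2,
    1, NA, 2),
    nrow=4, byrow=TRUE,
    dimnames=list(paste0("s", 1:4), paste0("snp", 1:3)))

# missing rate per SNP: fraction of samples without a genotype call
miss <- colMeans(is.na(geno))

# allele frequency per SNP over the non-missing samples, folded to the MAF
af  <- colMeans(geno, na.rm=TRUE) / 2
maf <- pmin(af, 1 - af)

# SNPs passing illustrative thresholds (maf >= 0.05, missing rate <= 0.25)
keep <- maf >= 0.05 & miss <= 0.25
keep
```

Because both quantities are computed over the selected samples only, changing sample.id can change which SNPs pass a given threshold.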

The "W&H02" option implements the calculation in Buckleton et al. 2016.

Value

Return a list:

sample.id

the sample ids used in the analysis

snp.id

the SNP ids used in the analysis

Fst

weighted Fst estimate

MeanFst

the average of Fst estimates across SNPs

FstSNP

a vector of Fst for each SNP

Beta

Beta matrix
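The difference between Fst and MeanFst can be illustrated with a small base R sketch. It assumes that the weighted estimate is a ratio of sums of per-SNP components while MeanFst is the plain average of the per-SNP ratios; the first part is an assumption about the estimator, not something stated on this page, and the numbers are hypothetical.

```r
# hypothetical per-SNP components (not produced by SNPRelate)
num <- c(0.02, 0.10, 0.00, 0.05)   # between-population component per SNP
den <- c(0.40, 0.50, 0.10, 0.25)   # total component per SNP

# per-SNP estimates, analogous to FstSNP
fst_snp <- num / den

# unweighted average across SNPs, analogous to MeanFst
fst_mean <- mean(fst_snp, na.rm=TRUE)

# ratio of sums, the assumed form of the weighted estimate
fst_weighted <- sum(num) / sum(den)

c(mean=fst_mean, weighted=fst_weighted)
```

Under this assumption, SNPs with a larger denominator contribute more to the weighted estimate, which is why the two summaries generally differ.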

Author(s)

Xiuwen Zheng

References

Weir, BS. & Cockerham, CC. Estimating F-statistics for the analysis of population structure. Evolution 38, 1358-1370 (1984).

Weir, BS. & Hill, WG. Estimating F-statistics. Annual review of genetics 36, 721-50 (2002).

Buckleton, J., Curran, J., Goudet, J., Taylor, D., Thiery, A. & Weir, BS. Population-specific FST values for forensic STR markers: A worldwide survey. Forensic Sci Int Genet 23, 91-100 (2016). doi: 10.1016/j.fsigen.2016.03.004.

Examples

# load the package and open an example dataset (HapMap)
library(SNPRelate)
genofile <- snpgdsOpen(snpgdsExampleFileName())

# population label of each sample, as a factor
group <- as.factor(read.gdsn(index.gdsn(
    genofile, "sample.annot/pop.group")))

# Fst estimation
v <- snpgdsFst(genofile, population=group, method="W&C84")
v$Fst
v$MeanFst
summary(v$FstSNP)

# or, with the Weir & Hill 2002 estimator
v <- snpgdsFst(genofile, population=group, method="W&H02")
v$Fst
v$MeanFst
v$Beta
summary(v$FstSNP)

# close the genotype file
snpgdsClose(genofile)

Example output

Loading required package: gdsfmt
SNPRelate -- supported by Streaming SIMD Extensions 2 (SSE2)
Fst estimation on genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1 SNP (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
Working space: 279 samples, 8,722 SNPs
Method: Weir & Cockerham, 1984
# of Populations: 4
    CEU (92), HCB (47), JPT (47), YRI (93)
[1] 0.1443145
[1] 0.1245116
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's 
-0.007271  0.043952  0.095479  0.124512  0.172969  0.791881         1 
Fst estimation on genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1 SNP (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
Working space: 279 samples, 8,722 SNPs
Method: Weir & Hill, 2002
# of Populations: 4
    CEU (92), HCB (47), JPT (47), YRI (93)
[1] 0.133837
[1] 0.1205067
            CEU         HCB         JPT         YRI
CEU  0.11405722  0.03754189  0.03604618 -0.08262898
HCB  0.03754189  0.17998819  0.17562234 -0.08375578
JPT  0.03604618  0.17562234  0.18375568 -0.08282565
YRI -0.08262898 -0.08375578 -0.08282565  0.05754685
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's 
-0.007787  0.043225  0.092317  0.120507  0.166544  0.805405         1 
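The Beta matrix printed above can be inspected with ordinary matrix tools. The sketch below rebuilds it from the printed values and separates the diagonal from the off-diagonal entries. How these entries should be interpreted is not stated on this page, so the code only shows the mechanics.

```r
# Beta matrix copied from the example output above (method="W&H02")
pops <- c("CEU", "HCB", "JPT", "YRI")
beta <- matrix(c(
     0.11405722,  0.03754189,  0.03604618, -0.08262898,
     0.03754189,  0.17998819,  0.17562234, -0.08375578,
     0.03604618,  0.17562234,  0.18375568, -0.08282565,
    -0.08262898, -0.08375578, -0.08282565,  0.05754685),
    nrow=4, byrow=TRUE, dimnames=list(pops, pops))

# the printed matrix is symmetric
isSymmetric(beta)

# diagonal entries: one value per population
within <- diag(beta)

# off-diagonal entries: one value per pair of populations
idx <- which(upper.tri(beta), arr.ind=TRUE)
between <- data.frame(
    pop1=pops[idx[, "row"]],
    pop2=pops[idx[, "col"]],
    beta=beta[upper.tri(beta)])

# pair with the largest off-diagonal value
between[which.max(between$beta), ]
```

In a live session the same steps apply directly to v$Beta, without retyping the values.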

SNPRelate documentation built on Nov. 8, 2020, 5:31 p.m.