snpgdsIBDKING: KING method of moment for the identity-by-descent (IBD)...

View source: R/IBD.R

Description

Calculate IBD coefficients using the KING method of moments.

Usage

snpgdsIBDKING(gdsobj, sample.id=NULL, snp.id=NULL, autosome.only=TRUE,
    remove.monosnp=TRUE, maf=NaN, missing.rate=NaN,
    type=c("KING-robust", "KING-homo"), family.id=NULL, num.thread=1L,
    useMatrix=FALSE, verbose=TRUE)

Arguments

gdsobj

an object of class SNPGDSFileClass, a SNP GDS file

sample.id

a vector of sample id specifying selected samples; if NULL, all samples are used

snp.id

a vector of snp id specifying selected SNPs; if NULL, all SNPs are used

autosome.only

if TRUE, use autosomal SNPs only; if it is a numeric or character value, keep SNPs according to the specified chromosome

remove.monosnp

if TRUE, remove monomorphic SNPs

maf

use only the SNPs with minor allele frequency ">= maf"; if NaN, no MAF threshold is applied

missing.rate

use only the SNPs with missing rate "<= missing.rate"; if NaN, no missing-rate threshold is applied

type

"KING-robust" – relationship inference in the presence of population stratification; "KING-homo" – relationship inference in a homogeneous population

family.id

if NULL, all individuals are treated as singletons; if family IDs are given, within- and between-family relationships are estimated differently. If sample.id=NULL, family.id should have the same length as "sample.id" in the GDS file; otherwise, family.id should have the same length and order as the argument sample.id

num.thread

the number of (CPU) cores used; if NA, detect the number of cores automatically

useMatrix

if TRUE, use Matrix::dspMatrix to store the output square matrix to save memory

verbose

if TRUE, show information
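
To give a sense of the saving behind useMatrix=TRUE: a packed symmetric matrix such as dspMatrix keeps one triangle plus the diagonal instead of every cell. The short base R sketch below only counts stored values for the 92 samples of the example run on this page; it does not call the Matrix package, and the variable names are illustrative.

```r
# Number of stored values for an n x n symmetric result matrix.
n <- 92                          # samples in the example run on this page
full_cells   <- n * n            # ordinary dense matrix: every cell stored
packed_cells <- n * (n + 1) / 2  # packed storage: one triangle plus the diagonal

c(full = full_cells, packed = packed_cells,
  ratio = packed_cells / full_cells)
```

The ratio approaches one half as the number of samples grows, so the saving matters mainly for large sample sets.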

Details

The KING IBD estimator is a method-of-moments estimator and is computationally efficient relative to the MLE method. Two approaches are available: "KING-robust", for robust relationship inference within or across families in the presence of population substructure, and "KING-homo", for relationship inference in a homogeneous population.
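
As an illustration of what a moment estimator of this kind looks like, the following base R sketch computes a pairwise between-family KING-robust style estimate from two genotype vectors, following the estimator described in Manichaikul et al. (2010). It is a sketch under stated assumptions, not the package's internal code: the function name king_robust_pair and the toy genotypes are hypothetical, genotypes are assumed to be coded 0/1/2 with NA for missing, and IBS0 is assumed to be taken over the SNPs typed in both samples.

```r
# Sketch of a between-family KING-robust style estimate for one pair.
# gi, gj: genotype vectors coded 0/1/2 (allele counts), NA = missing.
king_robust_pair <- function(gi, gj) {
    ok <- !is.na(gi) & !is.na(gj)         # keep SNPs typed in both samples
    gi <- gi[ok]
    gj <- gj[ok]
    het_i    <- sum(gi == 1)              # heterozygous SNPs in sample i
    het_j    <- sum(gj == 1)              # heterozygous SNPs in sample j
    both_het <- sum(gi == 1 & gj == 1)    # both samples heterozygous
    opp_hom  <- sum(abs(gi - gj) == 2)    # opposite homozygotes (zero IBS)
    min_het  <- min(het_i, het_j)         # NaN results if this is zero
    kinship <- (both_het - 2 * opp_hom) / (2 * min_het) +
        0.5 - (het_i + het_j) / (4 * min_het)
    list(IBS0 = opp_hom / length(gi), kinship = kinship)
}

# hypothetical toy genotypes
g1 <- c(0, 1, 2, 1, 0, 2, 1, NA)
g2 <- c(0, 1, 2, 1, 2, 2, 0, 1)

king_robust_pair(g1, g1)$kinship   # a sample compared with itself
king_robust_pair(g1, g2)
```

Only per-pair genotype counts enter the estimate and no population allele frequency is needed, which is the reason this form tolerates population substructure.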

With "KING-robust", the function returns the proportion of SNPs with zero IBS (IBS0) and the kinship coefficient (kinship). With "KING-homo", it returns the probability of sharing one allele IBD (k1) and the probability of sharing zero alleles IBD (k0).
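
The kinship column that snpgdsIBDSelection shows for "KING-homo" results in the example output on this page is consistent with the standard relation kinship = k1/4 + k2/2, where k2 = 1 - k0 - k1. The base R check below uses the k0 and k1 values printed for the first pair (NA07034, NA07048); the variable names are illustrative.

```r
# k0 and k1 for the first pair in the KING-homo example output on this page
k0 <- 0.002253744
k1 <- 1.0058457

k2 <- 1 - k0 - k1           # implied probability of sharing two alleles IBD
kinship <- k1 / 4 + k2 / 2  # standard kinship relation for (k0, k1, k2)

round(kinship, 7)           # compare with the kinship column for that pair
```

The k1 value in that output is slightly above 1, so the implied k2 is slightly negative: the moment estimates shown there are not constrained to the interval [0, 1].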

The minor allele frequency and missing rate for each SNP passed in snp.id are calculated over all the samples in sample.id.
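
A small base R sketch of this point: per-SNP MAF and missing rate change with the set of samples they are computed over, so the SNPs that pass maf and missing.rate thresholds depend on sample.id. The helper snp_stats, the toy genotype matrix, and the thresholds are hypothetical and only mimic the filtering semantics; they are not the package's implementation.

```r
# Toy genotype matrix: rows = samples, columns = SNPs, coded 0/1/2, NA = missing
geno <- matrix(c(0, 1, 2, NA,     # snp1
                 1, 1, 1, 1,      # snp2
                 0, 0, 0, 0,      # snp3 (monomorphic)
                 2, NA, NA, 1),   # snp4
               nrow = 4,
               dimnames = list(paste0("s", 1:4), paste0("snp", 1:4)))

# Per-SNP statistics over the selected samples only
snp_stats <- function(geno, sample.sel = rownames(geno)) {
    g <- geno[sample.sel, , drop = FALSE]
    afreq <- colMeans(g, na.rm = TRUE) / 2      # allele frequency
    data.frame(snp = colnames(g),
               maf = pmin(afreq, 1 - afreq),    # minor allele frequency
               missing.rate = colMeans(is.na(g)),
               stringsAsFactors = FALSE)
}

st <- snp_stats(geno)
# keep polymorphic SNPs with MAF >= 0.05 and missing rate <= 0.3
keep <- st$maf > 0 & st$maf >= 0.05 & st$missing.rate <= 0.3
st$snp[keep]

# the same SNPs evaluated over a subset of samples give different statistics
snp_stats(geno, c("s1", "s2"))
```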

Value

Return a list:

sample.id

the sample ids used in the analysis

snp.id

the SNP ids used in the analysis

k0

IBD coefficient, the probability of sharing zero alleles IBD (returned with type="KING-homo")

k1

IBD coefficient, the probability of sharing one allele IBD (returned with type="KING-homo")

IBS0

proportion of SNPs with zero IBS (returned with type="KING-robust")

kinship

the estimated kinship coefficients (returned with type="KING-robust")

Author(s)

Xiuwen Zheng

References

Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010 Nov 15;26(22):2867-73.

See Also

snpgdsIBDMLE, snpgdsIBDMoM

Examples

# load the package (gdsfmt is loaded as a dependency)
library(SNPRelate)

# open an example dataset (HapMap)
genofile <- snpgdsOpen(snpgdsExampleFileName())

# CEU population
samp.id <- read.gdsn(index.gdsn(genofile, "sample.id"))
CEU.id <- samp.id[
    read.gdsn(index.gdsn(genofile, "sample.annot/pop.group"))=="CEU"]



####  KING-robust:
####  relationship inference in the presence of population stratification
####               robust relationship inference across family

ibd.robust <- snpgdsIBDKING(genofile, sample.id=CEU.id)
names(ibd.robust)
# [1] "sample.id" "snp.id"    "afreq"     "IBS0"      "kinship"

# select a set of pairs of individuals
dat <- snpgdsIBDSelection(ibd.robust, 1/32)
head(dat)

plot(dat$IBS0, dat$kinship, xlab="Proportion of Zero IBS",
    ylab="Estimated Kinship Coefficient (KING-robust)")


# using Matrix
ibd.robust <- snpgdsIBDKING(genofile, sample.id=CEU.id, useMatrix=TRUE)
is(ibd.robust$IBS0)  # dspMatrix
is(ibd.robust$kinship)  # dspMatrix



####  KING-robust:
####  relationship inference in the presence of population stratification
####               within- and between-family relationship inference

# incorporate pedigree information
family.id <- read.gdsn(index.gdsn(genofile, "sample.annot/family.id"))
family.id <- family.id[match(CEU.id, samp.id)]

ibd.robust2 <- snpgdsIBDKING(genofile, sample.id=CEU.id, family.id=family.id)
names(ibd.robust2)

# select a set of pairs of individuals
dat <- snpgdsIBDSelection(ibd.robust2, 1/32)
head(dat)

plot(dat$IBS0, dat$kinship, xlab="Proportion of Zero IBS",
    ylab="Estimated Kinship Coefficient (KING-robust)")



####  KING-homo: relationship inference in a homogeneous population

ibd.homo <- snpgdsIBDKING(genofile, sample.id=CEU.id, type="KING-homo")
names(ibd.homo)
# [1] "sample.id" "snp.id"    "afreq"     "k0"        "k1"

# select a subset of pairs of individuals
dat <- snpgdsIBDSelection(ibd.homo, 1/32)
head(dat)

plot(dat$k0, dat$kinship, xlab="Pr(IBD=0)",
    ylab="Estimated Kinship Coefficient (KING-homo)")


# using Matrix
ibd.homo <- snpgdsIBDKING(genofile, sample.id=CEU.id, type="KING-homo",
    useMatrix=TRUE)
is(ibd.homo$k0)  # dspMatrix
is(ibd.homo$k1)  # dspMatrix


# close the genotype file
snpgdsClose(genofile)

Example output

Loading required package: gdsfmt
SNPRelate -- supported by Streaming SIMD Extensions 2 (SSE2)
IBD analysis (KING method of moment) on genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1,217 SNPs (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
    # of samples: 92
    # of SNPs: 7,506
    using 1 thread
No family is specified, and all individuals are treated as singletons.
Relationship inference in the presence of population stratification.
KING IBD:    the sum of all selected genotypes (0,1,2) = 702139
CPU capabilities: Double-Precision SSE2
Fri Apr 23 09:38:25 2021    (internal increment: 39808)

[..................................................]  0%, ETC: ---        
[==================================================] 100%, completed, 0s
Fri Apr 23 09:38:25 2021    Done.
[1] "sample.id" "snp.id"    "afreq"     "IBS0"      "kinship"  
      ID1     ID2         IBS0    kinship
1 NA07034 NA07048 0.0001342102 0.24495171
2 NA07034 NA12873 0.0520145533 0.03284983
3 NA07055 NA07048 0.0000000000 0.25153644
4 NA12814 NA12802 0.0002680606 0.25291074
5 NA10847 NA12239 0.0000000000 0.23510079
6 NA10847 NA12146 0.0000000000 0.24659640
IBD analysis (KING method of moment) on genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1,217 SNPs (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
    # of samples: 92
    # of SNPs: 7,506
    using 1 thread
No family is specified, and all individuals are treated as singletons.
Relationship inference in the presence of population stratification.
KING IBD:    the sum of all selected genotypes (0,1,2) = 702139
CPU capabilities: Double-Precision SSE2
Fri Apr 23 09:38:25 2021    (internal increment: 39808)

[..................................................]  0%, ETC: ---        
[==================================================] 100%, completed, 0s
Fri Apr 23 09:38:25 2021    Done.
 [1] "dspMatrix"       "ddenseMatrix"    "symmetricMatrix" "dMatrix"        
 [5] "denseMatrix"     "compMatrix"      "Matrix"          "xMatrix"        
 [9] "mMatrix"         "Mnumeric"        "replValueSp"    
 [1] "dspMatrix"       "ddenseMatrix"    "symmetricMatrix" "dMatrix"        
 [5] "denseMatrix"     "compMatrix"      "Matrix"          "xMatrix"        
 [9] "mMatrix"         "Mnumeric"        "replValueSp"    
IBD analysis (KING method of moment) on genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1,217 SNPs (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
    # of samples: 92
    # of SNPs: 7,506
    using 1 thread
# of families: 20, and within- and between-family relationship are estimated differently.
Relationship inference in the presence of population stratification.
KING IBD:    the sum of all selected genotypes (0,1,2) = 702139
CPU capabilities: Double-Precision SSE2
Fri Apr 23 09:38:26 2021    (internal increment: 39808)

[..................................................]  0%, ETC: ---        
[==================================================] 100%, completed, 0s
Fri Apr 23 09:38:26 2021    Done.
[1] "sample.id" "snp.id"    "afreq"     "IBS0"      "kinship"  
      ID1     ID2         IBS0    kinship
1 NA07034 NA07048 0.0001342102 0.24891962
2 NA07034 NA12873 0.0520145533 0.03284983
3 NA07055 NA07048 0.0000000000 0.25305410
4 NA12814 NA12802 0.0002680606 0.25344234
5 NA10847 NA12239 0.0000000000 0.23966408
6 NA10847 NA12146 0.0000000000 0.25000000
IBD analysis (KING method of moment) on genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1,217 SNPs (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
    # of samples: 92
    # of SNPs: 7,506
    using 1 thread
Relationship inference in a homogeneous population.
KING IBD:    the sum of all selected genotypes (0,1,2) = 702139
Fri Apr 23 09:38:26 2021    (internal increment: 39808)

[..................................................]  0%, ETC: ---        
[==================================================] 100%, completed, 0s
Fri Apr 23 09:38:26 2021    Done.
[1] "sample.id" "snp.id"    "afreq"     "k0"        "k1"       
      ID1     ID2          k0        k1   kinship
1 NA07034 NA07048 0.002253744 1.0058457 0.2474117
2 NA07055 NA07048 0.000000000 0.9828815 0.2542796
3 NA12814 NA12802 0.004503063 0.9863919 0.2511505
4 NA10847 NA12239 0.000000000 1.0529144 0.2367714
5 NA10847 NA12146 0.000000000 1.0054915 0.2486271
6 NA12056 NA10851 0.002247289 0.9991049 0.2491001
IBD analysis (KING method of moment) on genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1,217 SNPs (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
    # of samples: 92
    # of SNPs: 7,506
    using 1 thread
Relationship inference in a homogeneous population.
KING IBD:    the sum of all selected genotypes (0,1,2) = 702139
Fri Apr 23 09:38:26 2021    (internal increment: 39808)

[..................................................]  0%, ETC: ---        
[==================================================] 100%, completed, 1s
Fri Apr 23 09:38:27 2021    Done.
 [1] "dspMatrix"       "ddenseMatrix"    "symmetricMatrix" "dMatrix"        
 [5] "denseMatrix"     "compMatrix"      "Matrix"          "xMatrix"        
 [9] "mMatrix"         "Mnumeric"        "replValueSp"    
 [1] "dspMatrix"       "ddenseMatrix"    "symmetricMatrix" "dMatrix"        
 [5] "denseMatrix"     "compMatrix"      "Matrix"          "xMatrix"        
 [9] "mMatrix"         "Mnumeric"        "replValueSp"    

SNPRelate documentation built on Nov. 8, 2020, 5:31 p.m.