snpgdsIBS: Identity-By-State (IBS) proportion

View source: R/IBS.R

snpgdsIBSR Documentation

Identity-By-State (IBS) proportion

Description

Calculate the fraction of identity by state for each pair of samples

Usage

snpgdsIBS(gdsobj, sample.id=NULL, snp.id=NULL, autosome.only=TRUE,
    remove.monosnp=TRUE, maf=NaN, missing.rate=NaN, num.thread=1L,
    useMatrix=FALSE, verbose=TRUE)

Arguments

gdsobj

an object of class SNPGDSFileClass, a SNP GDS file

sample.id

a vector of sample id specifying selected samples; if NULL, all samples are used

snp.id

a vector of snp id specifying selected SNPs; if NULL, all SNPs are used

autosome.only

if TRUE, use autosomal SNPs only; if it is a numeric or character value, keep SNPs according to the specified chromosome

remove.monosnp

if TRUE, remove monomorphic SNPs

maf

to use the SNPs with ">= maf" only; if NaN, no MAF threshold

missing.rate

to use the SNPs with "<= missing.rate" only; if NaN, no missing threshold

num.thread

the number of (CPU) cores used; if NA, detect the number of cores automatically

useMatrix

if TRUE, use Matrix::dspMatrix to store the output square matrix to save memory

verbose

if TRUE, show information

Details

The minor allele frequency and missing rate for each SNP passed in snp.id are calculated over all the samples in sample.id.

The values of the IBS matrix range from ZERO to ONE, and it is defined as the average of 1 - | g_{1,i} - g_{2,i} | / 2 across the genome for the first and second individuals and SNP i.

Value

Return a list (class "snpgdsIBSClass"):

sample.id

the sample ids used in the analysis

snp.id

the SNP ids used in the analysis

ibs

a matrix of IBS proportion, "# of samples" x "# of samples"

Author(s)

Xiuwen Zheng

See Also

snpgdsIBSNum

Examples

# open an example dataset (HapMap)
genofile <- snpgdsOpen(snpgdsExampleFileName())

# perform identity-by-state calculations
ibs <- snpgdsIBS(genofile)

# perform multidimensional scaling analysis on
# the genome-wide IBS pairwise distances:
loc <- cmdscale(1 - ibs$ibs, k = 2)
x <- loc[, 1]; y <- loc[, 2]
race <- as.factor(read.gdsn(index.gdsn(genofile, "sample.annot/pop.group")))
plot(x, y, col=race, xlab = "", ylab = "", main = "cmdscale(IBS Distance)")
legend("topleft", legend=levels(race), text.col=1:nlevels(race))

# close the file
snpgdsClose(genofile)

zhengxwen/SNPRelate documentation built on Nov. 19, 2024, 1:02 p.m.