SNPRelate: Parallel Computing Toolset for Relatedness and Principal Component Analysis of SNP Data

Genome-wide association studies (GWAS) are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed the R package SNPRelate to provide a binary format for single-nucleotide polymorphism (SNP) data in GWAS, built on CoreArray Genomic Data Structure (GDS) data files. The GDS format offers efficient operations designed specifically for two-bit integers, since a SNP genotype can be stored in only two bits. SNPRelate is also designed to accelerate two key computations on SNP data using parallel computing on multi-core symmetric multiprocessing architectures: Principal Component Analysis (PCA) and relatedness analysis using Identity-By-Descent (IBD) measures. The SNP GDS format is also used by the GWASTools package, which adds support for S4 classes and generic functions. An extended GDS format is implemented in the SeqArray package to support the storage of single-nucleotide variations (SNVs), insertion/deletion polymorphisms (indels) and structural variation calls in whole-genome and whole-exome variant data.
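To make the two-bit idea concrete, the following base R sketch packs four genotypes into each byte and unpacks them again. It is an illustration only and not the GDS implementation: the helper names `pack2bit` and `unpack2bit` are hypothetical, and the coding (0, 1 or 2 copies of the reference allele, 3 for missing) is assumed here to mirror the convention used for SNP GDS genotypes.

```r
# Pack genotype codes 0..3 into raw bytes, four genotypes per byte.
# Hypothetical helper for illustration; not part of SNPRelate or gdsfmt.
pack2bit <- function(geno) {
  stopifnot(all(geno %in% 0:3))
  # Pad with the missing code (3) so the length is a multiple of four
  pad <- (4 - length(geno) %% 4) %% 4
  g <- matrix(c(as.integer(geno), rep(3L, pad)), nrow = 4)
  # Row i of each column is shifted into bit positions 2*(i-1) and 2*(i-1)+1
  bytes <- g[1, ] + bitwShiftL(g[2, ], 2L) +
    bitwShiftL(g[3, ], 4L) + bitwShiftL(g[4, ], 6L)
  as.raw(bytes)
}

# Recover the first n genotype codes from the packed bytes.
unpack2bit <- function(bytes, n) {
  b <- as.integer(bytes)
  g <- rbind(bitwAnd(b, 3L),
             bitwAnd(bitwShiftR(b, 2L), 3L),
             bitwAnd(bitwShiftR(b, 4L), 3L),
             bitwAnd(bitwShiftR(b, 6L), 3L))
  as.vector(g)[seq_len(n)]
}

geno <- c(2L, 1L, 0L, 3L, 2L, 2L)
packed <- pack2bit(geno)                  # six genotypes fit in two bytes
restored <- unpack2bit(packed, length(geno))
```

Stored this way, a genotype needs a quarter of a byte, compared with the four bytes R uses for each element of an ordinary integer vector.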

Package details

Author: Xiuwen Zheng [aut, cre, cph] (<https://orcid.org/0000-0002-1390-0708>), Stephanie Gogarten [ctb], Cathy Laurie [ctb], Bruce Weir [ctb, ths] (<https://orcid.org/0000-0002-4883-1247>)
Bioconductor views: Genetics, Infrastructure, PrincipalComponent, StatisticalMethod
Maintainer: Xiuwen Zheng <zhengx@u.washington.edu>
License: GPL-3
Version: 1.24.0
URL: http://github.com/zhengxwen/SNPRelate
Package repository: Bioconductor
Installation: Install the latest version of this package by entering the following in R:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("SNPRelate")
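Once the package is installed, a first session might look like the sketch below, which runs the two computations named in the description (PCA and an IBD-based relatedness estimate) on the small example GDS file bundled with SNPRelate. The thread count and the `maf` and `missing.rate` thresholds are arbitrary illustrative choices, not recommendations, and the method-of-moments estimator is only one of the IBD approaches the package provides.

```r
library(SNPRelate)

# Open the example SNP GDS file shipped with the package
genofile <- snpgdsOpen(snpgdsExampleFileName())

# Principal component analysis, using two threads (illustrative setting)
pca <- snpgdsPCA(genofile, num.thread = 2)
head(pca$eigenvect[, 1:2])   # first two principal components per sample

# IBD coefficients by the method of moments; thresholds chosen for illustration
ibd <- snpgdsIBDMoM(genofile, maf = 0.05, missing.rate = 0.05,
                    num.thread = 2)
ibd.coeff <- snpgdsIBDSelection(ibd)   # one row per pair of samples
head(ibd.coeff)

# Always close the GDS file when finished
snpgdsClose(genofile)
```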


SNPRelate documentation built on Nov. 8, 2020, 5:31 p.m.