zhengxwen/SNPRelate: Parallel Computing Toolset for Relatedness and Principal Component Analysis of SNP Data

Genome-wide association studies (GWAS) are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed an R package SNPRelate to provide a binary format for single-nucleotide polymorphism (SNP) data in GWAS utilizing CoreArray Genomic Data Structure (GDS) data files. The GDS format offers the efficient operations specifically designed for integers with two bits, since a SNP could occupy only two bits. SNPRelate is also designed to accelerate two key computations on SNP data using parallel computing for multi-core symmetric multiprocessing computer architectures: Principal Component Analysis (PCA) and relatedness analysis using Identity-By-Descent measures. The SNP GDS format is also used by the GWASTools package with the support of S4 classes and generic functions. The extended GDS format is implemented in the SeqArray package to support the storage of single nucleotide variations (SNVs), insertion/deletion polymorphism (indel) and structural variation calls in whole-genome and whole-exome variant data.

Getting started

Package details

Bioconductor views Genetics Infrastructure PrincipalComponent StatisticalMethod
Maintainer
LicenseGPL-3
Version1.40.0
URL https://github.com/zhengxwen/SNPRelate
Package repositoryView on GitHub
Installation Install the latest version of this package by entering the following in R:
install.packages("remotes")
remotes::install_github("zhengxwen/SNPRelate")
zhengxwen/SNPRelate documentation built on Nov. 19, 2024, 1:02 p.m.