gtx-package: Genetics ToolboX

Description

This package implements assorted tools for genetic association analysis, which is viewed here as being entirely an exercise in regressing a (possibly multivariate) phenotypic “response variable” onto one or more “explanatory variables” that include genetic variables.

Currently, this package does not provide computationally efficient functions for genetic association analyses at a genome-wide scale (genome-wide association studies; GWAS). These are already provided by other R packages and by standalone software such as PLINK. Rather, this package focuses on functions for analysing and manipulating phenotype data before conducting a GWAS (“pre-GWAS”), and on functions for analysing summary statistics resulting from a GWAS (“post-GWAS”). Many of the “post-GWAS” functions implement regression analyses using summary statistics; these analyses are intended to closely approximate the results that would be obtained by directly analysing the subject-specific genotype and phenotype data.

Functions for “pre-GWAS” analyses include functions useful for deriving response variables from phenotype data, especially response variables for pharmacogenetic analyses derived from clinical trial phenotype data; functions for power analyses; and functions for annotating and plotting results.

Functions for “post-GWAS” analyses currently support calculation of approximate Bayes factors; multi-SNP risk score analyses; multi-SNP conditional regression analyses; and multi-phenotype analyses.

Approximate Bayes factors can be calculated using abf.Wakefield, abf.normal and abf.t.
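To make the idea concrete, the following is a minimal base R sketch of a Wakefield-style approximate Bayes factor computed from a single-SNP effect estimate and its standard error. The function name wakefield_abf_sketch, its arguments, and the example numbers are assumptions made for illustration; this is not the interface or implementation of abf.Wakefield. The value returned here favours the alternative (true effect drawn from a normal prior with standard deviation prior_sd) over the null (no effect), so values above 1 support association.

```r
# Hypothetical helper, not part of gtx: approximate Bayes factor for the
# alternative (effect ~ N(0, prior_sd^2)) relative to the null (effect = 0).
wakefield_abf_sketch <- function(beta, se, prior_sd) {
  v <- se^2            # sampling variance of the effect estimate
  w <- prior_sd^2      # prior variance of the true effect
  z2 <- (beta / se)^2  # squared Wald statistic
  sqrt(v / (v + w)) * exp(0.5 * z2 * w / (v + w))
}

# Made-up summary statistics for three SNPs (illustrative only)
beta <- c(0.10, 0.02, -0.15)
se <- c(0.02, 0.02, 0.05)
abf <- wakefield_abf_sketch(beta, se, prior_sd = 0.2)
print(abf)
```

The same quantity is the ratio of two normal densities for the estimate, one with variance se^2 + prior_sd^2 and one with variance se^2, which gives a simple check on the algebra.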

For multi-SNP risk score analyses, the main functions for analysing summary statistics are grs.summary, grs.plot and grs.filter.Qrs. The summary statistics necessary for these analyses are single SNP association statistics, which can be calculated using a wide variety of existing tools for GWAS analysis and meta-analysis.
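As an illustration of how a risk score analysis can work from single SNP summary statistics alone, the sketch below uses an inverse-variance weighted estimator that assumes the SNPs are uncorrelated. The function grs_sketch and its arguments (w for the risk score coefficients, beta and se for the single SNP associations with the outcome) are hypothetical names introduced here; the code is one common formulation and is not taken from grs.summary.

```r
# Hypothetical helper, not part of gtx: risk score effect estimated from
# per-SNP summary statistics, assuming uncorrelated SNPs.
grs_sketch <- function(w, beta, se) {
  precision <- 1 / se^2
  # Weighted regression of beta on w through the origin
  ahat <- sum(w * beta * precision) / sum(w^2 * precision)
  ase <- sqrt(1 / sum(w^2 * precision))
  # Heterogeneity statistic: do individual SNPs deviate from beta = ahat * w?
  q <- sum((beta - ahat * w)^2 * precision)
  list(
    ahat = ahat,
    ase = ase,
    pval = pchisq((ahat / ase)^2, df = 1, lower.tail = FALSE),
    q = q,
    q_pval = pchisq(q, df = length(w) - 1, lower.tail = FALSE)
  )
}

# Made-up inputs (illustrative only)
w <- c(0.10, 0.20, 0.30, 0.15)
beta <- c(0.21, 0.35, 0.66, 0.28)
se <- c(0.05, 0.05, 0.10, 0.08)
str(grs_sketch(w, beta, se))
```

A large heterogeneity statistic suggests that some SNPs affect the outcome out of proportion to their risk score coefficients, which is the kind of signal a filtering step would act on.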

For multi-SNP conditional or multiple regression analyses, the main functions for performing multiple regression using summary statistics are combine.moments2, est.moments2, lm.moments2 and stepup.moments2. The summary statistics necessary for these analyses can be calculated from subject-specific genotype and phenotype data, using the function make.moments2.
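The reason multiple regression can be carried out from summary statistics is that ordinary least squares depends on the data only through second moments (sums of cross-products). The base R sketch below demonstrates this on simulated data and compares the result with lm. All variable names and the simulated values are assumptions for illustration; it does not use, or reproduce the interfaces of, make.moments2 or lm.moments2.

```r
set.seed(1)
n <- 200
g1 <- rbinom(n, 2, 0.3)   # simulated genotype dosages at two SNPs
g2 <- rbinom(n, 2, 0.5)
y <- 0.3 * g1 - 0.2 * g2 + rnorm(n)

# Second moments: the only quantities needed from subject-level data
x <- cbind(intercept = 1, g1 = g1, g2 = g2)
xtx <- crossprod(x)      # X'X
xty <- crossprod(x, y)   # X'y
yty <- sum(y^2)          # y'y

# Multiple regression using the moments alone
b <- solve(xtx, xty)
rss <- yty - sum(b * xty)            # residual sum of squares
sigma2 <- rss / (n - ncol(x))        # residual variance estimate
se <- sqrt(diag(solve(xtx)) * sigma2)

# Compare with a direct analysis of the subject-level data
fit <- lm(y ~ g1 + g2)
print(cbind(moments = drop(b), lm = unname(coef(fit))))
```

Because moment matrices from separate studies can be added together, this formulation also suggests how analyses across studies can be combined without sharing subject-level data.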

Multi-phenotype analyses can be performed using multipheno.T2 and multipheno.OBrien.

In addition, there are “helper” functions for reading and manipulating subject-specific genotype and phenotype data. These provide a convenient interface from R to genotype data exported from PLINK and to imputed genotype data generated by MACH, minimac, or IMPUTE. They also provide a platform for calculating the necessary summary statistics, and for performing “exact” analyses to validate some of the approximate methods based on summary statistics. The main functions provided are read.snpdata.plink, read.snpdata.mach, read.snpdata.minimac, and read.snpdata.impute.

Author(s)

Toby Johnson <Toby.x.Johnson@gsk.com>


tobyjohnson/gtx documentation built on Aug. 30, 2019, 8:07 p.m.