GenABEL: an R package for Genome Wide Association Analysis
Genome-wide association (GWA) analysis is a tool of choice for identifying genes underlying complex traits. Effective storage, handling and analysis of GWA data represent a challenge to modern computational genetics. GWA studies generate large amounts of data: hundreds of thousands of single nucleotide polymorphisms (SNPs) are genotyped in hundreds or thousands of patients and controls. Data on each SNP undergo several types of analysis: characterization of the frequency distribution, testing of Hardy-Weinberg equilibrium, analysis of association of single SNPs and haplotypes with different traits, and so on. Because SNP genotypes in dense marker sets are correlated, significance testing in GWA analysis is preferably performed using computationally intensive permutation test procedures, which further increases the computational burden.
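The permutation idea mentioned above can be sketched in base R. This is only an illustration on simulated data, not GenABEL's own procedure: the objects geno and trait and the helper max_stat are hypothetical names, and the maximum absolute correlation stands in for whatever association statistic is actually used. Permuting the trait while leaving the genotype matrix intact preserves the correlation between SNPs, which is why the resulting empirical p-value accounts for it.

```r
set.seed(1)
n <- 200   # individuals (simulated)
m <- 20    # SNPs (simulated)

# Hypothetical data: genotypes coded 0/1/2 and a trait with no true signal
geno  <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n)
trait <- rnorm(n)

# Test statistic: strongest single-SNP association across all SNPs
max_stat <- function(y, G) max(abs(cor(y, G)), na.rm = TRUE)

observed <- max_stat(trait, geno)

# Null distribution: shuffle the trait, keep the SNP correlation structure
perm <- replicate(500, max_stat(sample(trait), geno))

# Empirical p-value for the best SNP, corrected for testing all m SNPs
p_emp <- (1 + sum(perm >= observed)) / (1 + length(perm))
p_emp
```

Each permutation repeats the whole scan, so the cost grows with the number of permutations times the number of SNPs; this is the computational burden the paragraph refers to.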
To make GWA analysis possible on standard desktop computers, we developed the GenABEL library, which addresses the following objectives:
(1) Minimization of the amount of random access memory (RAM) used and the time required for data transactions. For this, we developed an effective data storage and manipulation model.
(2) Maximization of the throughput of GWA analysis. For this, we designed optimal fast procedures for specific genetic tests.
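To give a feel for objective (1), the sketch below packs genotypes four to a byte using two bits each. It is a minimal base R illustration under stated assumptions, not GenABEL's internal code: the coding (0 = missing, 1 = AA, 2 = AB, 3 = BB), the bit order, and the helper names pack_genotypes and unpack_genotypes are all hypothetical.

```r
# Pack genotype codes (0 = missing, 1 = AA, 2 = AB, 3 = BB) four to a byte
pack_genotypes <- function(g) {
  stopifnot(all(g %in% 0:3))
  # Pad with 0 (missing) so the length is a multiple of four
  pad <- (4 - length(g) %% 4) %% 4
  m <- matrix(c(g, rep(0L, pad)), nrow = 4)
  # Each column becomes one byte; the first genotype uses the two lowest bits
  as.raw(colSums(m * c(1, 4, 16, 64)))
}

# Recover the first n genotype codes from the packed bytes
unpack_genotypes <- function(bytes, n) {
  b <- as.integer(bytes)
  # Extract the four 2-bit fields of every byte, then drop the padding
  g <- rbind(b %% 4, (b %/% 4) %% 4, (b %/% 16) %% 4, (b %/% 64) %% 4)
  as.integer(g)[seq_len(n)]
}

g <- c(1L, 2L, 3L, 0L, 2L, 2L, 1L)
packed <- pack_genotypes(g)
length(packed)                                      # bytes used for 7 genotypes
identical(unpack_genotypes(packed, length(g)), g)   # round trip check
```

Compared with storing each genotype as a 4-byte R integer, a layout of this kind needs roughly one sixteenth of the memory.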
Embedding GenABEL into the R environment allows for easy data characterization, exploration and presentation of the results, and gives access to a wide range of standard and special statistical analysis functions available in base R and in specific R packages, such as "haplo.stats", "genetics", etc.
To see (more or less complete) functionality of GenABEL, try running
Another demo of interest can be run with demo(srdta). Depending on your user privileges in Windows, it may not run; in that case, try demo(srdtawin).
The most important functions and classes are:
For converting data from other formats, see
(Illumina/Affymetrix-like format). This is our preferred
converting function, very extensively tested. Other
conversion functions include:
convert.snp.text (conversion from
human-readable GenABEL format),
convert.snp.ped (Linkage, Merlin, Mach, and
convert.snp.tped (from PLINK
For converting GenABEL data to other formats, see
export.merlin (MERLIN and MACH formats),
export.impute (IMPUTE, SNPTEST and CHIAMO
export.plink (PLINK format, also
exports phenotypic data).
To load the data, see
For conversion to DatABEL format (used by ProbABEL and
some other GenABEL suite packages), see
For data management and manipulation, see
For merging extra data to the phenotypic part of
gwaa.data-class object, see
For trait manipulation, see
(transformation to standard Normal),
rntransform (rank-transformation to
routine to "impute" trait values in those medicated).
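As an illustration of what a rank transformation to Normality does, here is a minimal base R sketch. It is not the code of rntransform: the helper name rank_to_normal and the offset of 0.5 used to keep the ranks strictly inside (0, 1) are assumptions made for this example, and the package may use a different convention.

```r
# Rank-based transformation of a trait to standard Normal quantiles
rank_to_normal <- function(x) {
  out <- rep(NA_real_, length(x))
  ok <- !is.na(x)                      # missing values stay missing
  # Map ranks to (0, 1), then to standard Normal quantiles
  out[ok] <- qnorm((rank(x[ok]) - 0.5) / sum(ok))
  out
}

set.seed(42)
skewed <- rexp(1000)                   # strongly right-skewed simulated trait
z <- rank_to_normal(skewed)

c(mean = mean(z), sd = sd(z))          # close to 0 and 1
```

The transformation keeps the ordering of individuals and discards the original scale, so effect sizes estimated afterwards are expressed in standard deviation units of the transformed trait.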
For quality control, see
For fast analysis functions, see
r2fast (estimate linkage disequilibrium
dprfast (estimate linkage
disequilibrium using D'),
linkage disequilibrium using 'rho')
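For orientation, the r2 measure of linkage disequilibrium can be approximated in base R as the squared correlation between genotype dosages. The sketch below uses simulated, hypothetical SNPs (snp1, snp2, snp3) and is not the algorithm used by r2fast, which is designed to be much faster on packed genotype data.

```r
set.seed(7)
n <- 500

snp1 <- rbinom(n, size = 2, prob = 0.4)
# snp2 copies snp1 for about 80% of individuals, so the two are correlated
snp2 <- ifelse(runif(n) < 0.8, snp1, rbinom(n, size = 2, prob = 0.4))
# snp3 is simulated independently of the other two
snp3 <- rbinom(n, size = 2, prob = 0.2)

G <- cbind(snp1, snp2, snp3)

# Pairwise r2 as the squared Pearson correlation of 0/1/2 genotype codes
r2 <- cor(G, use = "pairwise.complete.obs")^2
round(r2, 2)
```

Because the calculation uses unphased genotypes, it gives a composite measure that can differ from a haplotype-based r2.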
For specific tools facilitating analysis of the data with
stratification (population stratification or (possibly
unknown) pedigree structure), see
(implements basic Genomic Control),
(computations of IBS / genomic IBD),
egscore (stratification adjustment
following Price et al.),
(another function for heritability analysis),
mmscore (score test of Chen and Abecasis),
grammar (grammar, grammar-gc, and
grammar-gamma tests of Aulchenko et al., Amin et al., and
Svishcheva et al.).
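The basic Genomic Control idea mentioned in this list can be written in a few lines of base R. This is a sketch on simulated test statistics, assuming 1 d.f. chi-squared tests and the median-based estimate of the inflation factor; the variable names are hypothetical and the package functions may estimate the factor differently.

```r
set.seed(3)

# Simulated 1 d.f. association statistics, inflated by a known factor of 1.2
chi2 <- rchisq(10000, df = 1) * 1.2

# Inflation factor: observed median over the expected median under the null
lambda <- median(chi2) / qchisq(0.5, df = 1)
lambda <- max(lambda, 1)   # by convention, statistics are never inflated

# Corrected statistics and p-values
chi2_gc <- chi2 / lambda
p_gc <- pchisq(chi2_gc, df = 1, lower.tail = FALSE)

lambda
```

Dividing every statistic by the same factor corrects for uniform inflation only; strong family structure is better handled by the mixed-model based approaches listed above.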
For functions facilitating construction of tables for
your manuscript, see
For functions reconstructing relationships from genomic
For meta-analysis and related, see help on
For links to Web databases, see
For interfaces to other packages and standard R
functions, also for 2D scans, see
For graphical facilities, see
Yurii Aulchenko et al. (see help pages for specific functions)
If you use the GenABEL package in your analysis, please cite the following work:
Aulchenko Y.S., Ripke S., Isaacs A., van Duijn C.M. GenABEL: an R package for genome-wide association analysis. Bioinformatics. 2007 23(10):1294-6.
If you used
polygenic, please cite
Thompson EA, Shaw RG (1990) Pedigree analysis for quantitative traits: variance components without matrix inversion. Biometrics 46, 399-413.
If you used environmental residuals from
grammar, please cite
for original GRAMMAR
Aulchenko YS, de Koning DJ, Haley C. Genomewide rapid association using mixed model and regression: a fast and simple method for genome-wide pedigree-based quantitative trait loci association analysis. Genetics. 2007 177(1):577-85.
Amin N, van Duijn CM, Aulchenko YS. A genomic background based method for association analysis in related individuals. PLoS ONE. 2007 Dec 5;2(12):e1274.
Svishcheva GR, Axenovich TI, Belonogova NM, van Duijn CM, Aulchenko YS. Rapid variance components-based method for whole-genome association analysis. Nature Genetics. 2012 44:1166-1170. doi:10.1038/ng.2410
for GRAMMAR+ transformation
Belonogova NM, Svishcheva GR, van Duijn CM, Aulchenko YS, Axenovich TI (2013) Region-Based Association Analysis of Human Quantitative Traits in Related Individuals. PLoS ONE 8(6): e65395. doi:10.1371/journal.pone.0065395
If you used
mmscore, please cite
Chen WM, Abecasis GR. Family-based association tests for genome-wide association scans. Am J Hum Genet. 2007 Nov;81(5):913-26.
For exact HWE (used in
Wigginton J.E., Cutler D.J., Abecasis G.R. A note on exact tests of Hardy-Weinberg equilibrium. Am J Hum Genet. 2005 76: 887-893.
For haplo.stats (
scan.haplo.2D), please cite:
Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland GA. Score tests for association between traits and haplotypes when linkage phase is ambiguous. Am J Hum Genet. 2002 70:425-434.
For fast LD computations (function
r2fast), please cite:
Hao K, Di X, Cawley S. LdCompare: rapid computation of single- and multiple-marker r2 and genetic coverage. Bioinformatics. 2006 23:252-254.
If you used
npsubtreated, please cite
Levy D, DeStefano AL, Larson MG, O'Donnell CJ, Lifton RP, Gavras H, Cupples LA, Myers RH. Evidence for a gene influencing blood pressure on chromosome 17. Genome scan linkage results for longitudinal blood pressure phenotypes in subjects from the Framingham Heart Study. Hypertension. 2000 Oct;36(4):477-83.