library(knitr) opts_chunk$set(fig.width=7, fig.height=4.5, dev.args=list(pointsize=16)) options(width=95)
The GWASapi package provides access to the NHGRI-EBI catalog of GWAS summary statistics. For details on the API, see its documentation, as well as Pjotr Prins's documentation at github.
You can install GWASapi from GitHub.
You first need to install the devtools.
install.packages("devtools")
Then use devtools::install_github()
to install GWASapi.
library(devtools) install_github("rqtl/GWASapi")
Load the package with library()
.
library(GWASapi)
The purpose of the GWASapi package is to provide access to summary statistics for human GWAS. First, you can get lists of studies and traits that are available.
To get lists of studies, use list_studies()
. The default is to
return just 20 studies. You can control that limit with the argument
size
. You can also use start
to step through the full set.
list_studies(size=5)
To retrieve all studies, set a higher limit
all_studies <- list_studies(size=1000) length(all_studies)
To get a list of traits, use list_traits()
. Again the default is to
return just 20 values. To get all traits, use the size
argument.
all_traits <- list_traits(size=1000) length(all_traits)
The traits are returned as identifiers like r all_traits[1]
. To get
a description of a trait, you can use the ontology lookup service, for
example https://www.ebi.ac.uk/efo/EFO_0001360
Note that the traits returned are not all distinct.
table(table(all_traits))
Chromosomes are stored as integers 1-24.
list_chr()
To get associations for a specific variant by its rs-number, use
get_variant()
. If you know the chromosome it is on, you'll get
faster results by providing the chromosome. And again, the default is
to return just 20 values, so use the size
and start
arguments if
you want a comprehensive list.
result <- get_variant("rs2228603", 19, size=5) result[,c("p_value", "study_accession", "trait")]
Use the arguments p_lower
and p_upper
to focus on associations
with p-value in a specified range. For example, to get all of the
associations with p-value < 10^-10^, you would do:
result <- get_variant("rs2228603", 19, p_upper=1e-10) result[,c("p_value", "study_accession", "trait")]
To get associations for a specific region, use get_asso()
.
For example, to get the region from 19.2 Mbp to 19.3 Mbp on chr 19:
result <- get_asso(chr=19, bp_lower=19200000, bp_upper=19300000) result[,c("chromosome", "base_pair_location", "p_value", "study_accession", "trait")]
You can restrict those results to a particular study.
result <- get_asso(chr=19, bp_lower=19200000, bp_upper=19300000, study="GCST000392") result[,c("chromosome", "base_pair_location", "p_value", "study_accession", "trait")]
To get associations for a given trait, use get_trait_asso()
. You
can't restrict this to a given chromosome region.
result <- get_trait_asso("EFO_0001360", p_upper=1e-100, size=1000) nrow(result) result[1:5, c("chromosome", "base_pair_location", "p_value", "study_accession", "trait")]
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.