knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = (Sys.getenv("RUN_VIGNETTES") != "") )
The GTEx Portal API V2 enables programmatic access to data available from the Genotype-Tissue Expression Portal. The gtexr package wraps this API, providing R functions that correspond to each API endpoint:
get_service_info()
.get_maintenance_message()
(corresponding to the endpoint "Get Maintenance Message") are page
and itemsPerPage
. For query parameters that accept an array of values however, the corresponding function argument is pluralised to indicate this e.g. for endpoint "Get Eqtl Genes" the query parameter 'tissueSiteDetailId' is pluralised to argument name tissueSiteDetailIds
in get_eqtl_genes()
.?get_eqtl_genes
provides example valid values for the required argument tissueSiteDetailIds
.tibble::tibble
by default. Alternatively, the raw JSON from an API call may be retrieved by setting argument .return_raw
to TRUE
e.g. get_service_info(.return_raw = TRUE)
.[^1]: With the exception of get_sample_biobank_data()
and get_sample_datasets()
, for which 'get_sample' is additionally appended with their respective category titles 'biobank_data' and 'datasets'.
Users can try out all functions interatively with the ⭐gtexr shiny app⭐, which pre-populates query parameters with those for the first working example from each function's documentation. To run the app locally:
shiny::runApp(system.file("app", package = "gtexr"))
library(gtexr) library(dplyr) library(purrr)
Many API endpoints return only the first 250 available items by default. A warning is raised if the number of available items exceeds the selected maximum page size e.g.
get_eqtl_genes("Whole_Blood")
For most cases, the simplest solution is to increase the value of itemsPerPage
e.g. get_eqtl_genes("Whole_Blood", itemsPerPage = 100000)
. This limit can be set globally by setting the "gtexr.itemsPerPage" option with options(list(gtexr.itemsPerPage = 100000))
.
Alternatively, multiple pages can be retrieved sequentially e.g.
# to retrieve the first 3 pages, with default setting of 250 items per page 1:3 |> map(\(page) get_eqtl_genes("Whole_Blood", page = page, .verbose = FALSE) |> suppressWarnings()) |> bind_rows()
Note that paging information is printed to the R console by default. Set argument .verbose
to FALSE
to silence these messages, or disable globally with options(list(gtexr.verbose = FALSE))
.
The rest of this vignette outlines some example applications of gtexr.
get_variant(snpId = "rs1410858") |> tidyr::separate( col = b37VariantId, into = c( "chromosome", "position", "reference_allele", "alternative_allele", "genome_build" ), sep = "_", remove = FALSE ) |> select(snpId:genome_build)
Use get_gene()
or get_genes()
get_genes("CRP") |> select(geneSymbol, gencodeId)
Use get_variant()
get_variant(snpId = "rs1410858") |> select(snpId, variantId)
Use get_significant_single_tissue_eqtls()
(note this requires versioned GENCODE IDs)
gene_symbol_of_interest <- "CRP" gene_gencodeId_of_interest <- get_genes(gene_symbol_of_interest) |> pull(gencodeId) |> suppressMessages() gene_gencodeId_of_interest |> get_significant_single_tissue_eqtls() |> distinct(geneSymbol, gencodeId, tissueSiteDetailId)
Some analyses (e.g. Mendelian randomisation) require data for variants which may or may not be significant eQTLs. Use calculate_expression_quantitative_trait_loci()
with purrr::map()
to retrieve data for multiple variants
variants_of_interest <- c("rs12119111", "rs6605071", "rs1053870") variants_of_interest |> set_names() |> map( \(x) calculate_expression_quantitative_trait_loci( tissueSiteDetailId = "Liver", gencodeId = "ENSG00000237973.1", variantId = x ) ) |> bind_rows(.id = "rsid") |> # optionally, reformat output - first extract genomic coordinates and alleles tidyr::separate( col = "variantId", into = c( "chromosome", "position", "reference_allele", "alternative_allele", "genome_build" ), sep = "_" ) |> # ...then ascertain alternative_allele frequency mutate( alt_allele_count = (2 * homoAltCount) + hetCount, total_allele_count = 2 * (homoAltCount + hetCount + homoRefCount), alternative_allele_frequency = alt_allele_count / total_allele_count ) |> select( rsid, beta = nes, se = error, pValue, minor_allele_frequency = maf, alternative_allele_frequency, chromosome:genome_build, tissueSiteDetailId )
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.