Introduction to gtexr

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = (Sys.getenv("RUN_VIGNETTES") != "")
)

The GTEx Portal API V2 enables programmatic access to data available from the Genotype-Tissue Expression Portal. The gtexr package wraps this API, providing R functions that correspond to each API endpoint:

[^1]: With the exception of get_sample_biobank_data() and get_sample_datasets(), for which 'get_sample' is additionally appended with their respective category titles 'biobank_data' and 'datasets'.

Shiny app

Users can try out all functions interatively with the ⭐gtexr shiny app⭐, which pre-populates query parameters with those for the first working example from each function's documentation. To run the app locally:

shiny::runApp(system.file("app", package = "gtexr"))

Paginated responses

library(gtexr)
library(dplyr)
library(purrr)

Many API endpoints return only the first 250 available items by default. A warning is raised if the number of available items exceeds the selected maximum page size e.g.

get_eqtl_genes("Whole_Blood")

For most cases, the simplest solution is to increase the value of itemsPerPage e.g. get_eqtl_genes("Whole_Blood", itemsPerPage = 100000). This limit can be set globally by setting the "gtexr.itemsPerPage" option with options(list(gtexr.itemsPerPage = 100000)).

Alternatively, multiple pages can be retrieved sequentially e.g.

# to retrieve the first 3 pages, with default setting of 250 items per page
1:3 |>
  map(\(page) get_eqtl_genes("Whole_Blood", page = page, .verbose = FALSE) |>
        suppressWarnings()) |>
  bind_rows()

Note that paging information is printed to the R console by default. Set argument .verbose to FALSE to silence these messages, or disable globally with options(list(gtexr.verbose = FALSE)).

Examples

The rest of this vignette outlines some example applications of gtexr.

Get build 37 coordinates for a variant

get_variant(snpId = "rs1410858") |>
  tidyr::separate(
    col = b37VariantId,
    into = c(
      "chromosome",
      "position",
      "reference_allele",
      "alternative_allele",
      "genome_build"
    ),
    sep = "_",
    remove = FALSE
  ) |>
  select(snpId:genome_build)

Convert gene symbol to versioned GENCODE ID

Use get_gene() or get_genes()

get_genes("CRP") |>
  select(geneSymbol, gencodeId)

Convert rsID to GTEx variant ID

Use get_variant()

get_variant(snpId = "rs1410858") |>
  select(snpId, variantId)

For a gene of interest, which tissues have significant cis-eQTLs?

Use get_significant_single_tissue_eqtls() (note this requires versioned GENCODE IDs)

gene_symbol_of_interest <- "CRP"

gene_gencodeId_of_interest <- get_genes(gene_symbol_of_interest) |>
  pull(gencodeId) |>
  suppressMessages()

gene_gencodeId_of_interest |>
  get_significant_single_tissue_eqtls() |>
  distinct(geneSymbol, gencodeId, tissueSiteDetailId)

Get data for non-eQTL variants

Some analyses (e.g. Mendelian randomisation) require data for variants which may or may not be significant eQTLs. Use calculate_expression_quantitative_trait_loci() with purrr::map() to retrieve data for multiple variants

variants_of_interest <- c("rs12119111", "rs6605071", "rs1053870")

variants_of_interest |>
  set_names() |>
  map(
    \(x) calculate_expression_quantitative_trait_loci(
      tissueSiteDetailId = "Liver",
      gencodeId = "ENSG00000237973.1",
      variantId = x
    )
  ) |>
  bind_rows(.id = "rsid") |>
  # optionally, reformat output - first extract genomic coordinates and alleles
  tidyr::separate(
    col = "variantId",
    into = c(
      "chromosome",
      "position",
      "reference_allele",
      "alternative_allele",
      "genome_build"
    ),
    sep = "_"
  ) |>
  # ...then ascertain alternative_allele frequency
  mutate(
    alt_allele_count = (2 * homoAltCount) + hetCount,
    total_allele_count = 2 * (homoAltCount + hetCount + homoRefCount),
    alternative_allele_frequency = alt_allele_count / total_allele_count
  ) |>
  select(
    rsid,
    beta = nes,
    se = error,
    pValue,
    minor_allele_frequency = maf,
    alternative_allele_frequency,
    chromosome:genome_build,
    tissueSiteDetailId
  )


Try the gtexr package in your browser

Any scripts or data that you put into this service are public.

gtexr documentation built on June 8, 2025, 10:26 a.m.