Introduction to gtexr

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>", 
  eval = (Sys.getenv("RUN_VIGNETTES") != "")
)

The GTEx Portal API V2 enables programmatic access to data available from the Genotype-Tissue Expression Portal. The gtexr package wraps this API, providing R functions that correspond to each API endpoint.

Shiny app

Users can try out all functions interactively with the ⭐gtexr shiny app⭐, which pre-populates query parameters with those for the first working example from each function's documentation.

Examples

The rest of this vignette outlines some example applications of gtexr.

library(gtexr)
library(dplyr)
library(purrr)

Get build 37 coordinates for a variant

get_variant(snpId = "rs1410858") |>
  tidyr::separate(
    col = b37VariantId,
    into = c(
      "chromosome",
      "position",
      "reference_allele",
      "alternative_allele",
      "genome_build"
    ),
    sep = "_",
    remove = FALSE
  ) |>
  select(snpId:genome_build)
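
To make the splitting step concrete without calling the API, here is a minimal base R sketch. The identifier is a made-up placeholder that follows the underscore-delimited layout assumed by the code above (chromosome, position, reference allele, alternative allele, genome build); it is not a real GTEx record.

```r
# Hypothetical variant ID, for illustration only (not returned by the API)
example_id <- "1_12345_A_G_b37"

# Split on underscores and label the five fields
fields <- strsplit(example_id, "_", fixed = TRUE)[[1]]
names(fields) <- c(
  "chromosome",
  "position",
  "reference_allele",
  "alternative_allele",
  "genome_build"
)

# The position arrives as text, so convert it before doing arithmetic
position <- as.integer(fields[["position"]])

fields
position
```

tidyr::separate() likewise leaves the new columns as character unless convert = TRUE is set, so the same conversion may be needed before filtering on position.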

Convert gene symbol to versioned GENCODE ID

Use get_gene() or get_genes().

get_genes("CRP") |>
  select(geneSymbol, gencodeId)

Convert rsID to GTEx variant ID

Use get_variant().

get_variant(snpId = "rs1410858") |>
  select(snpId, variantId)

For a gene of interest, which tissues have significant cis-eQTLs?

Use get_significant_single_tissue_eqtls(). Note that this requires versioned GENCODE IDs.

gene_symbol_of_interest <- "CRP"

gene_gencodeId_of_interest <- get_genes(gene_symbol_of_interest) |>
  pull(gencodeId) |>
  suppressMessages()

gene_gencodeId_of_interest |>
  get_significant_single_tissue_eqtls() |>
  distinct(geneSymbol, gencodeId, tissueSiteDetailId)
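
Because the function expects versioned IDs, it can help to check the format before sending a query. The sketch below is one possible check, not part of gtexr: is_versioned_gencode_id() is a hypothetical helper, and its pattern assumes IDs of the form "ENSG", digits, a dot, then a version number, as in the ID used elsewhere in this vignette. IDs with any additional suffix would not match.

```r
# Hypothetical helper (not a gtexr function): TRUE when an ID looks like
# "ENSG" + digits + "." + version number
is_versioned_gencode_id <- function(x) {
  grepl("^ENSG[0-9]+\\.[0-9]+$", x)
}

ids <- c("ENSG00000237973.1", "ENSG00000237973")

# The first ID carries a version suffix, the second does not
is_versioned_gencode_id(ids)

# Dropping the suffix recovers the unversioned gene ID
sub("\\.[0-9]+$", "", ids)
```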

Get data for non-eQTL variants

Some analyses (e.g. Mendelian randomisation) require data for variants that may or may not be significant eQTLs. Use calculate_expression_quantitative_trait_loci() with purrr::map() to retrieve data for multiple variants.

variants_of_interest <- c("rs12119111", "rs6605071", "rs1053870")

variants_of_interest |>
  set_names() |>
  map(
    \(x) calculate_expression_quantitative_trait_loci(
      tissueSiteDetailId = "Liver",
      gencodeId = "ENSG00000237973.1",
      variantId = x
    )
  ) |>
  bind_rows(.id = "rsid") |>

  # optionally, reformat output - first extract genomic coordinates and alleles
  tidyr::separate(
    col = "variantId",
    into = c(
      "chromosome",
      "position",
      "reference_allele",
      "alternative_allele",
      "genome_build"
    ),
    sep = "_"
  ) |>

  # ...then ascertain alternative_allele frequency
  mutate(
    alt_allele_count = (2 * homoAltCount) + hetCount,
    total_allele_count = 2 * (homoAltCount + hetCount + homoRefCount),
    alternative_allele_frequency = alt_allele_count / total_allele_count
  ) |>

  select(
    rsid,
    beta = nes,
    se = error,
    pValue,
    minor_allele_frequency = maf,
    alternative_allele_frequency,
    chromosome:genome_build,
    tissueSiteDetailId
  )
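
The allele frequency arithmetic in the mutate() step can be checked by hand. The following sketch uses made-up genotype counts (not GTEx data) and assumes a biallelic variant in diploid samples, so every sample contributes exactly two alleles.

```r
# Hypothetical genotype counts for one variant (illustrative, not GTEx data)
homoRefCount <- 50
hetCount <- 40
homoAltCount <- 10

# Each homozygous-alternative sample carries two alternative alleles,
# each heterozygous sample carries one
alt_allele_count <- (2 * homoAltCount) + hetCount

# Every sample contributes two alleles in total
total_allele_count <- 2 * (homoAltCount + hetCount + homoRefCount)

# 60 alternative alleles out of 200
alternative_allele_frequency <- alt_allele_count / total_allele_count
alternative_allele_frequency

# For a biallelic variant, the minor allele frequency is the smaller of the
# alternative and reference allele frequencies
minor_allele_frequency <- min(
  alternative_allele_frequency,
  1 - alternative_allele_frequency
)
minor_allele_frequency
```

With these counts the alternative allele is also the minor allele, so the two values coincide. When the alternative allele frequency exceeds 0.5 they differ, which is why the output above reports both columns.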



gtexr documentation built on Sept. 19, 2024, 5:06 p.m.