GWAS catalog: Phenotypes systematized by the experimental factor ontology

suppressPackageStartupMessages({
library(gwascat)
})

Introduction

The EMBL-EBI curation of the GWAS catalog (originated at NHGRI) includes labelings of GWAS hit records with terms from the EBI Experimental Factor Ontology (EFO). The Bioconductor r Biocpkg("gwascat") package includes a r Biocpkg("graph") representation of the ontology and records the EFO assignments of GWAS results in its basic representations of the catalog.

Views of the EFO

Term names are regimented.

library(gwascat)
requireNamespace("graph")
data(efo.obo.g)
efo.obo.g
sn = head(graph::nodes(efo.obo.g))
sn

The nodeData of the graph includes a def field. We will process that to create a data.frame.

nd = graph::nodeData(efo.obo.g)
alldef = sapply(nd, function(x) unlist(x[["def"]]))
allnames = sapply(nd, function(x) unlist(x[["name"]]))
alld2 = sapply(alldef, function(x) if(is.null(x)) return(" ") else x[1])
mydf = data.frame(id = names(allnames), concept=as.character(allnames), def=unlist(alld2))

We can create an interactive data table for all terms, but for performance we limit the table size to terms involving the string 'autoimm'.

limdf = mydf[ grep("autoimm", mydf$def, ignore.case=TRUE), ]
requireNamespace("DT")
suppressWarnings({
DT::datatable(limdf, rownames=FALSE, options=list(pageLength=5))
})

Graph operations

The use of the graph representation allows various approaches to traversal and selection. Here we examine metadata for a term of interest, transform to an undirected graph, and obtain the adjacency list for that term.

graph::nodeData(efo.obo.g, "EFO:0000540")
ue = graph::ugraph(efo.obo.g)
neighISD = graph::adj(ue, "EFO:0000540")[[1]]
sapply(graph::nodeData(graph::subGraph(neighISD, efo.obo.g)), "[[", "name")

With RBGL we can compute paths to terms from root.

requireNamespace("RBGL")
p = RBGL::sp.between( efo.obo.g, "EFO:0000685", "EFO:0000001")
sapply(graph::nodeData(graph::subGraph(p[[1]]$path_detail, efo.obo.g)), "[[", "name")

Connections to the GWAS catalog

The mcols element of the GRanges instances provided by gwascat include mapped EFO terms and EFO URIs.

data(ebicat_2020_04_30)
names(S4Vectors::mcols(ebicat_2020_04_30))
adinds = grep("autoimmu", ebicat_2020_04_30$MAPPED_TRAIT)
adgr = ebicat_2020_04_30[adinds]
adgr
S4Vectors::mcols(adgr)[, c("MAPPED_TRAIT", "MAPPED_TRAIT_URI")]


Try the gwascat package in your browser

Any scripts or data that you put into this service are public.

gwascat documentation built on Nov. 8, 2020, 11:08 p.m.