In EricEdwardBryant/biogridr: BioGRID Interaction Data

Welcome to the biogrid package!

This package contains a SQLite database of BioGRID interactions. This vignette will demonstrate how to access this data.

Database access

There are four tables

library(biogrid)
# interactions table is returned by default
interactions <- biogrid()             # Interactions
organisms    <- biogrid('organisms')  # NCBI organism IDs
systems      <- biogrid('systems')    # Experimental systems
log          <- biogrid('log')        # Record of all downloads

At this point you're set! All you need is a description of the table variables and you're on your way to selecting the interactions relevant to your experiment!

But... the interactions table is huge! What if I don't want to load the whole thing into memory?! We can do this by setting collect = FALSE and using dplyr to query the database. Don't be scared by all those %>%s (pronounced 'pipe')! Using x %>% y is equivalent to f(x, y) and just makes the syntax easier to read as a procedure, or 'pipeline', instead of a bunch of nested function calls, or a bunch of temporary variable assignments.

library(dplyr)

# I want the NCBI ID for S. cerevisiae!
yeastId <- organisms %>% 
  filter(grepl('cerevisiae', organism)) %>% 
  select(id) %>%
  as.integer

# Now I want all yeast interactions where a or b is 'CTF4'
example1 <- biogrid(collect = FALSE) %>%
  filter(a == 'CTF4' | b == 'CTF4',
         a_organism == yeastId,
         b_organism == yeastId) %>%
  select(id, a, b) %>%
  collect
example1