requireNamespace("knitr")
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

In this example, we demonstrate the result persistence feature of the Query API. Query results can be saved to a file on the server, usually under the user_data/ directory, instead of being loaded into memory on the client. This is useful, for example, when preparing case-control files or covariates as input to regression workflows such as PLINK.

Load packages

First load the gorr package. The tidyverse package is recommended, but for the sake of simplicity we load only the packages we use:

library(gorr)
library(magrittr) # pipe
library(dplyr)

Next we create a conn object that holds information about the API we're connecting to. platform_connect takes two parameters, api_key and project; if either is left out, it tries to read the environment variable GOR_API_KEY or GOR_API_PROJECT, respectively. Here the GOR_API_KEY environment variable is already defined, so supplying only the target project suffices. We then create a query function (a closure) so we don't have to reference conn again:

conn <- platform_connect(project = "ukbb_hg38")
query <- gor_create(conn = conn)
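Because platform_connect falls back to environment variables, it can help to check that they are set before connecting. The following is a minimal sketch using only base R. The helper name missing_gor_env is hypothetical and not part of gorr; only the two variable names come from the text above.

```r
# Hypothetical helper (not part of gorr): report which of the environment
# variables named in the text are unset or empty.
missing_gor_env <- function(vars = c("GOR_API_KEY", "GOR_API_PROJECT")) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

missing <- missing_gor_env()
if (length(missing) > 0) {
  message("Not set: ", paste(missing, collapse = ", "))
}
```

Any variable reported here has to be supplied as an explicit argument to platform_connect instead.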

Now we can call that function with our query, along with a persist parameter that stores the results on the server. Let's save the first 1000 SNPs in dbSNP to a file:

query("gor #dbsnp# | top 1000", persist = "user_data/doc/dbsnp_1000.gorz")
query("gor #dbsnp# | top 1000", persist = "user_data/doc/dbsnp_1000.tsv")

Note that we make two calls here only to demonstrate two different file types. One creates a GORz file (compressed, in genomic order), the other a plain tsv file. Now we can read the file contents directly:

query("gor user_data/doc/dbsnp_1000.gorz | top 10")

To read the tsv file we use NOR, which is suitable for reading any tab-separated data file and does not assume genomic order:

query("nor user_data/doc/dbsnp_1000.tsv | top 10")
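The two persist calls earlier differ only in the file extension, so they can be wrapped in a small loop. This is a sketch under that assumption; persist_as is a hypothetical helper, not a gorr function, and it relies only on the query(..., persist = ...) call shape shown in this example.

```r
# Hypothetical helper (not part of gorr): run the same query once per file
# extension, passing a different `persist` path each time.
# `query` is any function accepting (query_string, persist = path), such as
# the closure returned by gor_create() in this example.
persist_as <- function(query, gor, stem, exts = c("gorz", "tsv")) {
  paths <- paste0(stem, ".", exts)
  for (path in paths) {
    query(gor, persist = path)
  }
  invisible(paths)
}

# Usage with the `query` closure created earlier (requires a live connection):
# persist_as(query, "gor #dbsnp# | top 1000", "user_data/doc/dbsnp_1000")
```

Passing the query function as an argument keeps the helper independent of any particular connection.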

It is sometimes useful to list the files in a given directory. We can use nor to do that:

query("nor user_data/doc/") %>%
  transmute(Filename, Filesize = fs::as_fs_bytes(Filesize), Filetype)

The table above also shows the size difference between the gorz and the tsv file.
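To put a number on that difference, the ratio of the two sizes can be computed from the listing. The sketch below uses base R only. size_ratio is a hypothetical helper, the sizes in the sample data frame are made up for illustration, and it assumes the Filename column holds bare file names, which may not match the actual listing.

```r
# Hypothetical helper (not part of gorr): ratio of two file sizes taken from a
# listing with `Filename` and `Filesize` columns, like the one queried above.
size_ratio <- function(listing, numerator, denominator) {
  size_of <- function(name) {
    size <- as.numeric(listing$Filesize[listing$Filename == name])
    stopifnot(length(size) == 1)  # exactly one matching file expected
    size
  }
  size_of(numerator) / size_of(denominator)
}

# Made-up sizes for illustration only; real values come from the listing.
listing <- data.frame(
  Filename = c("dbsnp_1000.gorz", "dbsnp_1000.tsv"),
  Filesize = c(2000, 8000)
)
size_ratio(listing, "dbsnp_1000.tsv", "dbsnp_1000.gorz")
```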



wuxi-nextcode/gorr documentation built on Jan. 1, 2023, 7:54 a.m.