hca-api: HCA API methods

Description

Methods to access the Human Cell Atlas's Data Coordination Platform (HCA DCP) using the platform's REST API.

checkoutBundle() initiates the 'checkout' process from the HCA DCP DSS.

getBundleCheckout() queries the status of a bundle checkout request.

getFile() retrieves a file given its UUID.

headFile() retrieves metadata about a file given its UUID.

Usage

checkoutBundle(x, uuid, version = NULL, replica = "aws")

getBundleCheckout(x, checkout_job_id, replica = "aws")

getFile(x, uuid, version = NULL, replica = "aws", destination = tempfile())

headFile(x, uuid, version = NULL, replica = "aws")

Arguments

x

An HCABrowser object that is the subject of the request.

uuid

character(1). An RFC 4122-compliant ID for the bundle.

version

character(1). Timestamp of bundle creation in RFC 3339 format.

replica

character(1). The replica to fetch from. One of "aws", "gcp", or "azure". Default is "aws".

checkout_job_id

character(1). An RFC 4122-compliant ID for the checkout job request.

destination

character(1). Path where the downloaded file is written. The path must not already exist.
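
The uuid and version arguments correspond to the two halves of a bundle_fqid, split at the first '.'. The following is a minimal, offline sketch of that split using the same regular expression as in the Examples section. The bundle_fqid value here is made up for illustration and does not refer to a real bundle.

```r
# Hypothetical bundle_fqid (an assumption for illustration only); real values
# come from searchBundles(..., output_format = "summary").
bundle_fqid <- "ffffaf55-f19c-40e3-aa81-a6c69d357265.2019-05-14T122736.345000Z"

# uuid is everything before the first '.', version is everything after it
re <- "^([^\\.]+)\\.(.*)$"
uuid <- sub(re, "\\1", bundle_fqid)
version <- sub(re, "\\2", bundle_fqid)

uuid
version
```

Because the first group excludes '.', any further dots (such as the one in the fractional seconds) stay in the version part.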

Value

checkoutBundle() returns a character(1) identifier. Pass it as checkout_job_id= to getBundleCheckout() to determine the status of the checkout.

getBundleCheckout() returns a list. One component of the list is status=. If its value is SUCCEEDED, the list contains a second element, location=, with a URL to the location of the checkout, e.g., an S3 bucket.

getFile() returns the path to the downloaded file.

headFile() returns a tibble of technical metadata about a file, including file size and content type (e.g., 'application/gzip').
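
Since a checkout is asynchronous, a caller may want to re-query the status until it reports SUCCEEDED. The following is a rough sketch of such a loop, not part of the package: poll_checkout() is a hypothetical helper, the "RUNNING" status and the S3 URL are invented placeholders, and a stand-in function replaces the real getBundleCheckout() call so the sketch runs offline.

```r
# Hypothetical helper (not part of HCABrowser): call `check()` repeatedly
# until the returned list has status "SUCCEEDED", then return its location.
poll_checkout <- function(check, interval = 5, max_tries = 12L) {
    for (i in seq_len(max_tries)) {
        res <- check()
        if (identical(res$status, "SUCCEEDED"))
            return(res$location)
        if (i < max_tries)
            Sys.sleep(interval)   # wait before asking again
    }
    stop("checkout did not succeed after ", max_tries, " attempts")
}

# Stand-in for getBundleCheckout(): reports an assumed "RUNNING" status twice,
# then "SUCCEEDED" with a placeholder location.
n_calls <- 0L
fake_check <- function() {
    n_calls <<- n_calls + 1L
    if (n_calls < 3L)
        list(status = "RUNNING")
    else
        list(status = "SUCCEEDED", location = "s3://example-bucket/checkout")
}

location <- poll_checkout(fake_check, interval = 0)
location
```

With a live service, the stand-in would presumably be replaced by something like poll_checkout(function() getBundleCheckout(hca, checkout_job_id)). The sketch treats every status other than SUCCEEDED as "keep waiting", because this page documents no other status values.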

Author(s)

Daniel Van Twisk

Examples

library(HCABrowser)

## create a browser object restricted to bundles from brain specimens
hca <-
    HCABrowser() %>%
    filter('files.specimen_from_organism_json.organ.text' == "brain")
hca

result <-
    hca %>%
    searchBundles(per_page = 10L, output_format = "raw")
result

tbl <-
    results(result)[[1]]$metadata$manifest$files %>%
    bind_rows() %>%
    mutate(`content-type` = noquote(`content-type`))
tbl


re <- "^([^\\.]+)\\.(.*)$" # uuid / version as before / after the first '.'

uuid <-
    hca %>%
    searchBundles(per_page = 10L, output_format = "summary") %>%
    as_tibble() %>%
    mutate(
        uuid = sub(re, "\\1", bundle_fqid),
        version = sub(re, "\\2", bundle_fqid)
    ) %>%
    pull(uuid)
uuid

checkout_job_id <- checkoutBundle(hca, uuid[1])
checkout_job_id


getBundleCheckout(hca, checkout_job_id)


fastq <-
    tbl %>%
    filter(endsWith(name, "fastq.gz")) %>%
    select(name, uuid)
fastq

uuid <- fastq %>% pull(uuid) %>% tail(1)   # file UUID of the last fastq.gz file
uuid

## Not run: 
destination <- getFile(hca, uuid)
readLines(destination, 4L)

## End(Not run)


headFile(hca, uuid)

Bioconductor/HCABrowser documentation built on Feb. 10, 2021, 12:51 p.m.