ID-translation: Translate study identifiers from barcode to UUID and vice...

Description

These functions take a character vector of identifiers and use the GDC API to translate TCGA barcodes to Universally Unique Identifiers (UUIDs) and vice versa. Because these relationships are not one-to-one, a data.frame is returned for all inputs. The UUID to TCGA barcode translation applies only to file and case UUIDs. Two-way UUID translation is available between 'file_id' and 'case_id'. Double-check any results before using them in an analysis. Case / submitter identifiers are translated by default; see the from_type argument for details. All identifiers are converted to lower case.
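Since identifiers are lower-cased and the UUID functions expect UUIDs rather than barcodes, it can help to screen a mixed input vector before calling them. The following is a minimal base R sketch, not part of TCGAutils: the `uuid_pattern` regular expression and the variable names are assumptions made for illustration, and the first UUID from the Examples section has been upper-cased on purpose to show the normalization step.

```r
# Mixed input: two UUIDs from the Examples section (the first upper-cased
# deliberately) and one participant barcode.
ids <- c("0001801B-54B0-4551-8D7A-D66FB59429BF",
         "002c67f2-ff52-4246-9d65-a3f69df6789e",
         "TCGA-CK-4948")

# Assumed pattern for the 8-4-4-4-12 hexadecimal UUID layout;
# this is an illustrative helper, not a TCGAutils object.
uuid_pattern <- "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"

# Normalize case first, then flag which entries look like UUIDs.
ids_lower <- tolower(ids)
is_uuid <- grepl(uuid_pattern, ids_lower)

# Entries that could be passed to the UUID functions, and the remainder.
uuid_input <- ids_lower[is_uuid]
other_input <- ids[!is_uuid]
```

Here `uuid_input` holds the two lower-cased UUIDs and `other_input` holds the barcode, which would go to barcodeToUUID instead.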

Usage

UUIDtoBarcode(
  id_vector,
  from_type = c("case_id", "file_id", "aliquot_ids"),
  legacy = FALSE
)

UUIDtoUUID(id_vector, to_type = c("case_id", "file_id"), legacy = FALSE)

barcodeToUUID(barcodes, legacy = FALSE)

filenameToBarcode(filenames, legacy = FALSE)

Arguments

id_vector

A character vector of UUIDs corresponding to either files or cases (default assumes case_ids)

from_type

One of "case_id", "file_id", or "aliquot_ids" (as listed under Usage), indicating the type of identifier supplied in id_vector (default "case_id")

legacy

A logical (default FALSE) indicating whether to search the legacy archives

to_type

The desired UUID type to obtain; either "case_id" or "file_id"

barcodes

A character vector of TCGA barcodes

filenames

A character vector of filenames usually obtained from the GenomicDataCommons

Details

Based on the file UUID supplied, the appropriate entity_id (TCGA barcode) is returned. In previous versions of the package, the 'end_point' parameter required the user to specify the type of barcode needed. This is no longer supported because 'entity_id' returns the appropriate one.

Value

A data.frame of TCGA barcode identifiers and UUIDs
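Because the mapping is not one-to-one, the returned data.frame can contain several rows for the same input identifier. The sketch below shows one way to spot such cases in base R. It uses a hand-built data.frame with placeholder values rather than a real API result, and the column names `file_id` and `barcode` are assumptions; the actual column names depend on the function called and should be checked on the returned object.

```r
# Hypothetical translation table with placeholder values (not real GDC data).
res <- data.frame(
  file_id = c("uuid-1", "uuid-1", "uuid-2"),
  barcode = c("BARCODE-A", "BARCODE-B", "BARCODE-C"),
  stringsAsFactors = FALSE
)

# Count how many rows each input identifier produced.
n_per_id <- table(res$file_id)

# Identifiers that map to more than one barcode.
multi <- names(n_per_id)[as.vector(n_per_id) > 1]

# Group the barcodes by identifier for inspection.
by_id <- split(res$barcode, res$file_id)
```

In this placeholder table, `multi` contains only "uuid-1", and `by_id[["uuid-1"]]` lists both of its barcodes.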

Author(s)

Sean Davis, M. Ramos

Examples

library(TCGAutils)

## Translate UUIDs >> TCGA Barcode
uuids <- c("0001801b-54b0-4551-8d7a-d66fb59429bf",
    "002c67f2-ff52-4246-9d65-a3f69df6789e",
    "003143c8-bbbf-46b9-a96f-f58530f4bb82")

UUIDtoBarcode(uuids, from_type = "file_id")

UUIDtoBarcode("ae55b2d3-62a1-419e-9f9a-5ddfac356db4", from_type = "case_id")

UUIDtoBarcode("d85d8a17-8aea-49d3-8a03-8f13141c163b", from_type = "aliquot_ids")

## Translate file UUIDs >> case UUIDs
uuids <- c("0001801b-54b0-4551-8d7a-d66fb59429bf",
    "002c67f2-ff52-4246-9d65-a3f69df6789e",
    "003143c8-bbbf-46b9-a96f-f58530f4bb82")

## to_type defaults to "case_id"; stated explicitly here
UUIDtoUUID(uuids, to_type = "case_id")

## Translate TCGA Barcode >> UUIDs
fullBarcodes <- c("TCGA-B0-5117-11A-01D-1421-08",
    "TCGA-B0-5094-11A-01D-1421-08",
    "TCGA-E9-A295-10A-01D-A16D-09")

## Reduce the full barcodes to sample-level barcodes
sample_ids <- TCGAbarcode(fullBarcodes, sample = TRUE)

barcodeToUUID(sample_ids)

participant_ids <- c("TCGA-CK-4948", "TCGA-D1-A17N",
    "TCGA-4V-A9QX", "TCGA-4V-A9QM")

barcodeToUUID(participant_ids)

## Translate file names >> TCGA Barcode
library(GenomicDataCommons)

## Query copy number segment files for the TCGA-COAD project
fquery <- files() %>%
    filter(~ cases.project.project_id == "TCGA-COAD" &
        data_category == "Copy Number Variation" &
        data_type == "Copy Number Segment")

## Keep the first six file names from the query results
fnames <- results(fquery)$file_name[1:6]

filenameToBarcode(fnames)

TCGAutils documentation built on April 17, 2021, 6:04 p.m.