ID-translation | R Documentation |
These functions allow the user to enter a character vector of
identifiers and use the GDC API to translate from TCGA barcodes to
Universally Unique Identifiers (UUID) and vice versa. These relationships
are not one-to-one. Therefore, a data.frame is returned for all
inputs. The UUID to TCGA barcode translation only applies to file and case
UUIDs. Two-way UUID translation is available from 'file_id' to 'case_id'
and vice versa. Please double-check any results before using these
features for analysis. Case / submitter identifiers are translated by
default; see the from_type argument for details. All identifiers are
converted to lower case.
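Because the mappings are not one-to-one, it can help to check how many rows each input identifier produced before joining a result to other tables. The sketch below does not call the GDC API: `mapping` is a hand-made data.frame standing in for a translation result, and its column names and values are assumptions made up for illustration.

```r
# Hypothetical stand-in for a translation result; the column names and
# values are invented for illustration and do not come from the GDC API.
mapping <- data.frame(
    input_id = c("id-a", "id-a", "id-b", "id-c"),
    output_id = c("out-1", "out-2", "out-3", "out-4"),
    stringsAsFactors = FALSE
)

# Count the rows returned for each input identifier
hits <- table(mapping$input_id)

# Input identifiers that map to more than one output
multi <- names(hits)[hits > 1]

# Collect all outputs per input identifier as a named list
by_input <- split(mapping$output_id, mapping$input_id)
```

Inspecting `multi` before a merge makes it easier to see where row counts will grow.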
UUIDtoBarcode(id_vector, from_type = c("case_id", "file_id", "aliquot_ids"))
UUIDtoUUID(id_vector, to_type = c("case_id", "file_id"))
barcodeToUUID(barcodes)
filenameToBarcode(filenames, slides = FALSE)
UUIDhistory(id, endpoint = .HISTORY_ENDPOINT)
id_vector |
character() A vector of UUIDs corresponding to either files or cases (default assumes case_ids) |
from_type |
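character(1) Either "case_id", "file_id", or "aliquot_ids", indicating the type of id_vector entered (default "case_id") |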
to_type |
character(1) The desired UUID type to obtain, either "case_id" (default) or "file_id" |
barcodes |
character() A vector of TCGA barcodes |
filenames |
|
slides |
|
id |
character(1) A UUID whose history of versions is sought |
endpoint |
character(1) Generally a constant pertaining to the location of the history api endpoint. This argument rarely needs to change. |
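Since id_vector is expected to hold UUIDs, a quick format check before calling the translation functions can catch barcodes passed in by mistake. The sketch below is not part of the package: `is_uuid` is a hypothetical helper that tests for the canonical 8-4-4-4-12 hexadecimal layout after lower-casing the input.

```r
# Hypothetical helper (not provided by the package): TRUE where the input
# looks like a UUID in the canonical 8-4-4-4-12 hexadecimal layout.
is_uuid <- function(x) {
    pattern <- "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
    grepl(pattern, tolower(x))
}

ids <- c(
    "B4BCE3FF-7FDC-4849-880B-56F2B348CEAC",  # upper case still matches
    "b4bce3ff-7fdc-4849-880b-56f2b348ceac",
    "TCGA-B0-5117"                           # a barcode, not a UUID
)

valid <- is_uuid(ids)
```

Entries where `valid` is FALSE are likely barcodes and may belong in barcodeToUUID instead.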
Based on the file UUID supplied, the appropriate entity_id (TCGA barcode) is
returned. In previous versions of the package, the 'end_point' parameter
required the user to specify which type of barcode was needed. This is no
longer supported, as entity_id returns the appropriate one.
When providing slide file names, the function will only work if
all the provided files are slide files with an .svs extension.
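One way to respect this restriction is to check the extensions up front and only use slides = TRUE when every name passes. The following is a minimal sketch with made-up file names; it does not call filenameToBarcode itself.

```r
# Example file names, made up for illustration
filenames <- c(
    "slide_one.svs",
    "slide_two.svs",
    "segments.txt"
)

# TRUE for names ending in ".svs" (case-sensitive match)
is_svs <- grepl("\\.svs$", filenames)

# Only consider slides = TRUE when every file is a slide image
all_slides <- all(is_svs)

# Keep the slide files so they can be handled in a separate call
slide_files <- filenames[is_svs]
```

Here `all_slides` is FALSE, so the mixed vector would need to be split before translation.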
Generally, a data.frame of identifier mappings.
UUIDhistory: A data.frame containing a list of associated UUIDs
for the given input along with file_change status, data_release
versions, etc.
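A history table of this kind can be filtered to find the version that has not been replaced. The sketch below uses a hand-made data.frame rather than a real API response. The uuid column and the "superseded" / "released" values of file_change are assumptions for illustration and should be checked against actual UUIDhistory output.

```r
# Hypothetical stand-in for a UUIDhistory() result; the uuid column and the
# file_change values are assumptions, not confirmed API output.
history <- data.frame(
    uuid = c("uuid-old", "uuid-new"),
    file_change = c("superseded", "released"),
    data_release = c("12.0", "32.0"),
    stringsAsFactors = FALSE
)

# Keep only the rows that have not been superseded
current <- history[history$file_change != "superseded", ]

# UUID(s) of the version still in use
current_uuid <- current$uuid
```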
Sean Davis, M. Ramos
## Translate UUIDs >> TCGA Barcode
uuids <- c("b4bce3ff-7fdc-4849-880b-56f2b348ceac",
    "5ca9fa79-53bc-4e91-82cd-5715038ee23e",
    "b7c3e5ad-4ffc-4fc4-acbf-1dfcbd2e5382")
UUIDtoBarcode(uuids, from_type = "file_id")
UUIDtoBarcode("ae55b2d3-62a1-419e-9f9a-5ddfac356db4", from_type = "case_id")
UUIDtoBarcode("d85d8a17-8aea-49d3-8a03-8f13141c163b", "aliquot_ids")
## Translate file UUIDs >> case UUIDs
uuids <- c("b4bce3ff-7fdc-4849-880b-56f2b348ceac",
    "5ca9fa79-53bc-4e91-82cd-5715038ee23e",
    "b7c3e5ad-4ffc-4fc4-acbf-1dfcbd2e5382")
UUIDtoUUID(uuids)
## Translate TCGA Barcode >> UUIDs
fullBarcodes <- c("TCGA-B0-5117-11A-01D-1421-08",
    "TCGA-B0-5094-11A-01D-1421-08",
    "TCGA-E9-A295-10A-01D-A16D-09")
sample_ids <- TCGAbarcode(fullBarcodes, sample = TRUE)
barcodeToUUID(sample_ids)
participant_ids <- c("TCGA-CK-4948", "TCGA-D1-A17N",
    "TCGA-4V-A9QX", "TCGA-4V-A9QM")
barcodeToUUID(participant_ids)
library(GenomicDataCommons)
### Query CNV data and get file names
cnv <- files() |>
    filter(
        ~ cases.project.project_id == "TCGA-COAD" &
            data_category == "Copy Number Variation" &
            data_type == "Copy Number Segment"
    ) |>
    results(size = 6)
filenameToBarcode(cnv$file_name)
### Query slides data and get file names
slides <- files() |>
    filter(
        ~ cases.project.project_id == "TCGA-BRCA" &
            cases.samples.sample_type == "Primary Tumor" &
            data_type == "Slide Image" &
            experimental_strategy == "Diagnostic Slide"
    ) |>
    results(size = 3)
filenameToBarcode(slides$file_name, slides = TRUE)
## Get the version history of a BAM file in TCGA-KIRC
UUIDhistory("0001801b-54b0-4551-8d7a-d66fb59429bf")