GDCquery: Query GDC data

View source: R/query.R

Description

Uses the GDC API to search for data; it covers both controlled and open-access data. For GDC (harmonized) data, the arguments project, data.category, data.type and workflow.type should be used. For legacy data, the arguments project, data.category, platform and/or file.type should be used. Please see the vignette for a table of the possible values.
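The argument rule above can be sketched as a small pre-flight check. The helper missing_query_args() below is hypothetical and not part of TCGAbiolinks; it only reports which of the recommended arguments are absent from a planned call, and it assumes the legacy file filter is the file.type argument listed in the usage signature. GDCquery() itself may accept fewer arguments than the description recommends.

```r
# Hypothetical helper (not part of TCGAbiolinks): list the arguments that the
# description recommends for a query but that are missing from `args`.
missing_query_args <- function(args, legacy = FALSE) {
  if (legacy) {
    # Legacy archive: project and data.category, plus platform and/or file.type
    required <- c("project", "data.category")
    any_of <- c("platform", "file.type")
  } else {
    # Harmonized data: all four arguments are recommended
    required <- c("project", "data.category", "data.type", "workflow.type")
    any_of <- character(0)
  }
  missing <- setdiff(required, names(args))
  if (length(any_of) > 0 && !any(any_of %in% names(args))) {
    missing <- c(missing, paste(any_of, collapse = " or "))
  }
  missing
}

harmonized <- list(project = "TCGA-ACC",
                   data.category = "Copy Number Variation",
                   data.type = "Copy Number Segment")
missing_query_args(harmonized)  # "workflow.type"

legacy_args <- list(project = "TCGA-GBM",
                    data.category = "DNA methylation",
                    platform = "Illumina Human Methylation 450")
missing_query_args(legacy_args, legacy = TRUE)  # character(0)
```

A complete argument list could then be passed on with do.call(GDCquery, c(args, list(legacy = legacy))), which needs TCGAbiolinks loaded and network access.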

Usage

GDCquery(
  project,
  data.category,
  data.type,
  workflow.type,
  legacy = FALSE,
  access,
  platform,
  file.type,
  barcode,
  data.format,
  experimental.strategy,
  sample.type
)

Arguments

project

A list of valid projects (see the list with TCGAbiolinks:::getGDCprojects()$project_id):

  • BEATAML1.0-COHORT

  • BEATAML1.0-CRENOLANIB

  • CGCI-BLGSP

  • CPTAC-2

  • CPTAC-3

  • CTSP-DLBCL1

  • FM-AD

  • HCMI-CMDC

  • MMRF-COMMPASS

  • NCICCR-DLBCL

  • OHSU-CNL

  • ORGANOID-PANCREATIC

  • TARGET-ALL-P1

  • TARGET-ALL-P2

  • TARGET-ALL-P3

  • TARGET-AML

  • TARGET-CCSK

  • TARGET-NBL

  • TARGET-OS

  • TARGET-RT

  • TARGET-WT

  • TCGA-ACC

  • TCGA-BLCA

  • TCGA-BRCA

  • TCGA-CESC

  • TCGA-CHOL

  • TCGA-COAD

  • TCGA-DLBC

  • TCGA-ESCA

  • TCGA-GBM

  • TCGA-HNSC

  • TCGA-KICH

  • TCGA-KIRC

  • TCGA-KIRP

  • TCGA-LAML

  • TCGA-LGG

  • TCGA-LIHC

  • TCGA-LUAD

  • TCGA-LUSC

  • TCGA-MESO

  • TCGA-OV

  • TCGA-PAAD

  • TCGA-PCPG

  • TCGA-PRAD

  • TCGA-READ

  • TCGA-SARC

  • TCGA-SKCM

  • TCGA-STAD

  • TCGA-TGCT

  • TCGA-THCA

  • TCGA-THYM

  • TCGA-UCEC

  • TCGA-UCS

  • TCGA-UVM

  • VAREPOP-APOLLO
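Because the set of projects can change over time, it may help to check requested IDs against the live list before querying. The sketch below wraps the call quoted in the text, TCGAbiolinks:::getGDCprojects()$project_id, in a function so that nothing touches the network until it is called. The helper names fetch_project_ids() and unknown_projects() are hypothetical, chosen for illustration only.

```r
# Fetch the current project IDs from GDC.
# Needs TCGAbiolinks installed and network access; not run at definition time.
fetch_project_ids <- function() {
  TCGAbiolinks:::getGDCprojects()$project_id
}

# Hypothetical helper: return the requested projects that are not in `valid`.
unknown_projects <- function(requested, valid) {
  setdiff(requested, valid)
}

# Offline illustration with a few IDs copied from the list above.
valid <- c("TCGA-ACC", "TCGA-GBM", "TCGA-LGG", "TARGET-AML")
unknown_projects(c("TCGA-GBM", "TCGA-XYZ"), valid)  # "TCGA-XYZ"

# With network access:
# unknown_projects(c("TCGA-GBM", "TCGA-LGG"), fetch_project_ids())
```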

data.category

A valid data category for the project (see the list with TCGAbiolinks:::getProjectSummary(project)). For the complete list, please check the vignette. List for the harmonized database:

  • Biospecimen

  • Clinical

  • Copy Number Variation

  • DNA Methylation

  • Sequencing Reads

  • Simple Nucleotide Variation

  • Transcriptome Profiling

List for the legacy archive:

  • Biospecimen

  • Clinical

  • Copy number variation

  • DNA methylation

  • Gene expression

  • Protein expression

  • Raw microarray data

  • Raw sequencing data

  • Simple nucleotide variation

data.type

A data type to filter the files to download. For the complete list, please check the vignette.

workflow.type

GDC workflow type

legacy

Search in the legacy repository (default: FALSE)

access

Filter by access type. Possible values: controlled, open

platform

Example values:

  CGH-1x1M_G4447A                    IlluminaGA_RNASeqV2
  AgilentG4502A_07                   IlluminaGA_mRNA_DGE
  Human1MDuo                         HumanMethylation450
  HG-CGH-415K_G4124A                 IlluminaGA_miRNASeq
  HumanHap550                        IlluminaHiSeq_miRNASeq
  ABI                                H-miRNA_8x15K
  HG-CGH-244A                        SOLiD_DNASeq
  IlluminaDNAMethylation_OMA003_CPI  IlluminaGA_DNASeq_automated
  IlluminaDNAMethylation_OMA002_CPI  HG-U133_Plus_2
  HuEx-1_0-st-v2                     Mixed_DNASeq
  H-miRNA_8x15Kv2                    IlluminaGA_DNASeq_curated
  MDA_RPPA_Core                      IlluminaHiSeq_TotalRNASeqV2
  HT_HG-U133A                        IlluminaHiSeq_DNASeq_automated
  diagnostic_images                  microsat_i
  IlluminaHiSeq_RNASeq               SOLiD_DNASeq_curated
  IlluminaHiSeq_DNASeqC              Mixed_DNASeq_curated
  IlluminaGA_RNASeq                  IlluminaGA_DNASeq_Cont_automated
  IlluminaGA_DNASeq                  IlluminaHiSeq_WGBS
  pathology_reports                  IlluminaHiSeq_DNASeq_Cont_automated
  Genome_Wide_SNP_6                  bio
  tissue_images                      Mixed_DNASeq_automated
  HumanMethylation27                 Mixed_DNASeq_Cont_curated
  IlluminaHiSeq_RNASeqV2             Mixed_DNASeq_Cont

file.type

Used with the legacy database for some platforms, to define which file types should be retrieved.

barcode

A list of barcodes to filter the files to download

data.format

Data format filter ("VCF", "TXT", "BAM", "SVS", "BCR XML", "BCR SSF XML", "TSV", "BCR Auxiliary XML", "BCR OMF XML", "BCR Biotab", "MAF", "BCR PPS XML", "XLSX")

experimental.strategy

Filter by experimental strategy. Harmonized: WXS, RNA-Seq, miRNA-Seq, Genotyping Array. Legacy: WXS, RNA-Seq, miRNA-Seq, Genotyping Array, DNA-Seq, Methylation array, Protein expression array, CGH array, VALIDATION, Gene expression array, WGS, MSI-Mono-Dinucleotide Assay, miRNA expression array, Mixed strategies, AMPLICON, Exon array, Total RNA-Seq, Capillary sequencing, Bisulfite-Seq

sample.type

A sample type to filter the files to download

Value

A data frame with the results and the parameters used
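As a rough sketch of how the returned object might be inspected, the function below prints the top-level structure of a query and extracts the table of matched files with getResults() from TCGAbiolinks. The wrapper name inspect_query() is hypothetical, and the exact columns of the result are not described on this page, so the sketch only reports the row count.

```r
# Sketch: summarise a query object returned by GDCquery().
# Needs TCGAbiolinks; `query` comes from a call such as those in the Examples.
inspect_query <- function(query) {
  # Top-level structure: the parameters used plus the results
  str(query, max.level = 1)
  # getResults() extracts the table of matching files from the query
  files <- TCGAbiolinks::getResults(query)
  cat("Files matched:", nrow(files), "\n")
  invisible(files)
}

# Usage (requires network access):
# query <- TCGAbiolinks::GDCquery(project = "TCGA-ACC",
#                                 data.category = "Copy Number Variation",
#                                 data.type = "Copy Number Segment")
# files <- inspect_query(query)
```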

Examples

library(TCGAbiolinks)

# Harmonized data: copy number segments for TCGA-ACC
query <- GDCquery(project = "TCGA-ACC",
                  data.category = "Copy Number Variation",
                  data.type = "Copy Number Segment")
## Not run: 
# Harmonized miRNA quantification, restricted to two barcodes
query <- GDCquery(project = "TARGET-AML",
                  data.category = "Transcriptome Profiling",
                  data.type = "miRNA Expression Quantification",
                  workflow.type = "BCGSC miRNA Profiling",
                  barcode = c("TARGET-20-PARUDL-03A-01R","TARGET-20-PASRRB-03A-01R"))
# Harmonized gene expression counts, restricted to two barcodes
query <- GDCquery(project = "TARGET-AML",
                  data.category = "Transcriptome Profiling",
                  data.type = "Gene Expression Quantification",
                  workflow.type = "HTSeq - Counts",
                  barcode = c("TARGET-20-PADZCG-04A-01R","TARGET-20-PARJCR-09A-01R"))
# Filter by sample type
query <- GDCquery(project = "TCGA-ACC",
                  data.category = "Copy Number Variation",
                  data.type = "Masked Copy Number Segment",
                  sample.type = c("Primary Tumor"))
# Legacy archive: several projects at once, filtered by platform
query.met <- GDCquery(project = c("TCGA-GBM","TCGA-LGG"),
                      legacy = TRUE,
                      data.category = "DNA methylation",
                      platform = "Illumina Human Methylation 450")
# Legacy archive: filter by file type and barcode
query <- GDCquery(project = "TCGA-ACC",
                  data.category = "Copy number variation",
                  legacy = TRUE,
                  file.type = "hg19.seg",
                  barcode = c("TCGA-OR-A5LR-01A-11D-A29H-01"))

## End(Not run)

TCGAbiolinks documentation built on Nov. 8, 2020, 5:37 p.m.