gdc_rnaseq: Get RNA-seq quantification from the NCI GDC.

View source: R/gdc_rnaseq.R

Description

gdc_rnaseq is a high-level function that retrieves RNA-seq quantification data from the NCI GDC and returns it as a SummarizedExperiment.

Usage

available_rnaseq_workflows()

gdc_rnaseq(project_id, workflow_type)

Arguments

project_id

character() vector with one or more project ids. Available project_ids can be found using ids(projects()). Note that not all projects contain RNA-seq data.

workflow_type

character(1) with the workflow type. Possible values are returned by available_rnaseq_workflows().

Details

The RNA-seq data are downloaded using gdcdata, with cached files reused when available. The resulting files are read and combined without any transformation. It is up to the user to perform further normalization or transformation if needed.
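
Because the values are returned untransformed, a normalization step is often applied afterwards. The following is a minimal sketch of one common choice, log2 counts per million, and not something gdc_rnaseq does itself. It assumes a counts workflow and uses a small made-up matrix (counts, with hypothetical gene and sample names) in place of the real data; with an actual result the matrix would presumably be extracted with SummarizedExperiment::assay(tcga_se).

```r
# Toy count matrix standing in for the assay of the returned object
# (hypothetical genes and samples, for illustration only).
counts <- matrix(
  c(10, 0, 5, 85,
    20, 4, 6, 170),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), c("sampleA", "sampleB"))
)

# Library size: total counts per sample (one value per column).
lib_size <- colSums(counts)

# Counts per million: divide each column by its library size, scale by 1e6.
cpm <- sweep(counts, 2, lib_size, "/") * 1e6

# log2 transform with a pseudocount of 1 so that zero counts map to 0.
log_cpm <- log2(cpm + 1)
```

Other normalizations (for example those provided by dedicated differential-expression packages) may be more appropriate depending on the analysis; the choice is left to the user.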

Clinical information for each file (see gdc_clinical for details) is loaded into the colData slot. Quality control mapping information is also stored in the colData with column names beginning with "qc__".
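
The "qc__" prefix makes it straightforward to separate the quality control columns from the clinical columns. The sketch below illustrates the idea on a small made-up data frame; all column names other than the "qc__" prefix are hypothetical. With a real result, the table would presumably come from as.data.frame(SummarizedExperiment::colData(tcga_se)).

```r
# Toy sample table standing in for the colData of the returned object.
# Column names are hypothetical; only the "qc__" prefix comes from the
# documentation above.
col_data <- data.frame(
  case_id = c("case1", "case2"),
  gender = c("female", "male"),
  qc__total_reads = c(1000, 2000),
  qc__unmapped = c(10, 40),
  stringsAsFactors = FALSE
)

# Flag the columns whose names begin with the QC prefix.
is_qc <- startsWith(colnames(col_data), "qc__")

# Split into QC metrics and clinical annotations, keeping data frames
# even if only one column matches.
qc_info <- col_data[, is_qc, drop = FALSE]
clinical_info <- col_data[, !is_qc, drop = FALSE]
```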

Value

A SummarizedExperiment object populated with the expression values, with the gene ids in the rowData and the clinical data associated with each sample in the colData.

Functions

References

See https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline/ for details of data processing that occurs at the GDC.

Examples

available_rnaseq_workflows()

## Not run: 
tcga_se = gdc_rnaseq('TCGA-ACC', 'HTSeq - Counts')
tcga_se

## End(Not run)

GenomicDataCommons documentation built on Nov. 8, 2020, 11:08 p.m.