gdc_clinical: Get clinical information from GDC

View source: R/clinical.R

Description

The NCI GDC has a complex data model that allows various studies to supply numerous clinical and demographic data elements. However, across all projects that enter the GDC, there are similarities. This function returns four data.frames associated with case_ids from the GDC.

Usage

gdc_clinical(case_ids, include_list_cols = FALSE)

Arguments

case_ids

a character() vector of case_ids, typically from a "cases" query.

include_list_cols

logical(1), whether to include list columns in the "main" data.frame. These list columns hold values for aliquots, samples, etc. They may be useful in some situations, but they are rarely useful as clinical annotations.

Details

Note that these data.frames can, in general, have different numbers of rows (or even no rows at all). To combine them into a single data.frame, left joining each of the others to the "main" data.frame will yield a useful result. The function does not do this itself because of the potential for 1:many relationships (for example, several diagnoses for one case), which would duplicate rows of "main". It is up to the user to determine the best approach for any given dataset.
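
The left-join approach described above might look like the following sketch. It uses small toy data.frames in place of a real gdc_clinical() result so that it runs without network access; the column names (case_id as the join key, updated_datetime, primary_diagnosis, gender) are assumptions made for illustration and should be checked against the actual output.

```r
# Toy stand-ins for the list returned by gdc_clinical().
# All column names and values here are hypothetical.
clinical_data <- list(
  main = data.frame(
    case_id = c("a", "b", "c"),
    updated_datetime = c("2019-01-01", "2019-02-01", "2019-03-01"),
    stringsAsFactors = FALSE
  ),
  diagnoses = data.frame(
    case_id = c("a", "a", "b"),
    primary_diagnosis = c("dx1", "dx2", "dx3"),
    stringsAsFactors = FALSE
  ),
  demographic = data.frame(
    case_id = c("a", "b"),
    gender = c("female", "male"),
    stringsAsFactors = FALSE
  )
)

# Left join: all.x = TRUE keeps every row of "main",
# filling NA where a case has no matching record.
combined <- merge(clinical_data$main, clinical_data$demographic,
                  by = "case_id", all.x = TRUE)
combined <- merge(combined, clinical_data$diagnoses,
                  by = "case_id", all.x = TRUE)

# The 1:many relationship shows up as repeated "main" rows:
# case "a" has two diagnoses, so it appears twice.
nrow(combined)
table(combined$case_id)
```

Whether the duplicated rows are acceptable, or whether the diagnoses should first be reduced to one row per case, depends on the analysis.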

Value

A list of four data.frames:

  1. main, representing basic case identification and metadata (update date, etc.)

  2. diagnoses

  3. exposures

  4. demographic

Examples

library(GenomicDataCommons)
library(magrittr)  # provides the %>% pipe

# fetch the ids of 10 cases, then their clinical data
case_ids = cases() %>% results(size=10) %>% ids()
clinical_data = gdc_clinical(case_ids)

# overview of clinical results
class(clinical_data)
names(clinical_data)
sapply(clinical_data, class)
sapply(clinical_data, nrow)

# available data
head(clinical_data$main)
head(clinical_data$demographic)
head(clinical_data$diagnoses)
head(clinical_data$exposures)

GenomicDataCommons documentation built on Nov. 8, 2020, 11:08 p.m.