gdc_clinical: Get clinical information from GDC

gdc_clinical R Documentation

Get clinical information from GDC

Description

The NCI GDC has a complex data model that allows various studies to supply numerous clinical and demographic data elements. However, all projects that enter the GDC share some common elements. This function returns a list of four data.frames associated with the supplied case_ids from the GDC.

Usage

gdc_clinical(case_ids, include_list_cols = FALSE)

Arguments

case_ids

a character() vector of case_ids, typically from a "cases" query.

include_list_cols

logical(1), whether to include list columns in the "main" data.frame. These list columns hold values for aliquots, samples, etc. While they may be useful in some situations, they are generally of limited use as clinical annotations.

Details

Note that these data.frames can, in general, have different numbers of rows (or even no rows at all). To combine them into a single data.frame, left joining the other data.frames to the "main" data.frame will yield a useful result. The function does not do this itself because of the potential for 1:many relationships, in which a single case matches several rows of another data.frame. It is up to the user to determine the best approach for any given dataset.
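
As a minimal sketch of the left-join approach, the following uses base R merge() on two small toy data.frames that stand in for the "main" and "diagnoses" elements. The shared case_id key column and the other column names and values are assumptions made for illustration; check names() on the real data.frames before joining.

```r
# Toy stand-ins for clinical_data$main and clinical_data$diagnoses.
# The case_id key and all other column names/values here are hypothetical.
main <- data.frame(
  case_id = c("case-a", "case-b", "case-c"),
  submitter_id = c("S1", "S2", "S3"),
  stringsAsFactors = FALSE
)
diagnoses <- data.frame(
  case_id = c("case-a", "case-a", "case-b"),
  primary_diagnosis = c("dx1", "dx2", "dx3"),
  stringsAsFactors = FALSE
)

# Left join: keep every row of main and attach any matching diagnosis rows.
combined <- merge(main, diagnoses, by = "case_id", all.x = TRUE)

# case-a matches two diagnosis rows (1:many), so it appears twice;
# case-c has no diagnosis row and is kept with NA in primary_diagnosis.
print(combined)
print(nrow(combined))
```

Because of the 1:many match, the combined data.frame has more rows than the toy "main" data.frame, so any per-case summary computed afterwards may need to deduplicate or aggregate first.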

Value

A list of four data.frames:

  1. main, representing basic case identification and metadata (update date, etc.)

  2. diagnoses

  3. exposures

  4. demographic

Examples

library(GenomicDataCommons)

# fetch ten case ids from a "cases" query
case_ids = cases() |> results(size=10) |> ids()
clinical_data = gdc_clinical(case_ids)

# overview of clinical results
class(clinical_data)
names(clinical_data)
sapply(clinical_data, class)
sapply(clinical_data, nrow)

# available data
head(clinical_data$main)
head(clinical_data$demographic)
head(clinical_data$diagnoses)
head(clinical_data$exposures)


Bioconductor/GenomicDataCommons documentation built on Oct. 31, 2024, 7 a.m.