tcga_remove_duplicated_samples: remove duplicated samples in TCGA

View source: R/TCGA.R

tcga_remove_duplicated_samples    R Documentation

remove duplicated samples in TCGA

Description

Removes duplicated TCGA samples (multiple aliquots of the same sample), following the replicate-sample precedence rules used by GDAC Firehose.

Usage

tcga_remove_duplicated_samples(barcode)

Arguments

barcode

A character vector of TCGA barcodes.
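
The precedence rules described under Details depend on the analyte and plate fields of each barcode. As a minimal sketch, assuming full aliquot-level barcodes with seven dash-separated fields (for example TCGA-02-0001-01C-01D-0182-01), the fields can be pulled apart as follows. The helper name parse_tcga_barcode is hypothetical and is not part of yjtools.

```r
# Hypothetical helper for illustration only; not a yjtools function.
# Assumes aliquot-level barcodes of the form
# project-TSS-participant-sample/vial-portion/analyte-plate-center.
parse_tcga_barcode <- function(barcode) {
  parts <- strsplit(barcode, "-", fixed = TRUE)
  stopifnot(all(lengths(parts) == 7L))
  m <- do.call(rbind, parts) # one row per barcode, one column per field
  data.frame(
    barcode = barcode,
    participant = paste(m[, 1], m[, 2], m[, 3], sep = "-"),
    sample_type = substr(m[, 4], 1, 2), # e.g. "01" = primary solid tumor
    vial = substr(m[, 4], 3, 3),
    portion = substr(m[, 5], 1, 2),
    analyte = substr(m[, 5], 3, 3), # e.g. "D" = DNA, "R" = RNA
    plate = m[, 6],
    center = m[, 7],
    stringsAsFactors = FALSE
  )
}

parse_tcga_barcode("TCGA-02-0001-01C-01D-0182-01")
```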

Details

In many instances there is more than one aliquot for a given combination of individual, platform, and data type. However, only one aliquot may be ingested into Firehose. Therefore, a set of precedence rules is applied to select the most scientifically advantageous one among them.

The following precedence rules are applied when the aliquots have differing analytes.

For RNA aliquots, T analytes are dropped in preference to H and R analytes, since T is the inferior extraction protocol. If H and R are encountered, H is the chosen analyte. This is somewhat arbitrary and subject to change, since it is not clear at present whether H or R is the better protocol. If there are multiple aliquots associated with the chosen RNA analyte, the aliquot with the later plate number is chosen.

For DNA aliquots, D analytes (native DNA) are preferred over G, W, or X (whole-genome amplified) analytes, unless the G, W, or X analyte sample has a higher plate number.
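
The rules above can be sketched in base R as shown below. This is an illustration under stated assumptions, not the yjtools source: the function name pick_aliquots is hypothetical, barcodes are assumed to be 28-character aliquot barcodes from a single platform and data type, aliquots are grouped by the first 15 characters (participant plus sample type), plates are compared as strings, and any remaining ties are broken by taking the lexicographically larger barcode. Unlike tcga_remove_duplicated_samples, it returns a plain character vector.

```r
# Illustrative sketch only; not the implementation used by yjtools.
pick_aliquots <- function(barcode) {
  barcode <- unique(barcode)
  analyte <- substr(barcode, 20, 20) # analyte code, e.g. "D" or "R"
  plate <- substr(barcode, 22, 25)   # plate identifier
  is_rna <- analyte %in% c("H", "R", "T")

  # RNA: H is preferred over R, and both over T.
  # DNA: no analyte rank ahead of the plate comparison.
  rank <- ifelse(is_rna, match(analyte, c("H", "R", "T")), 0L)

  # DNA: when plates are equal, native D wins over G, W, or X.
  native_first <- ifelse(analyte == "D", 0L, 1L)

  # Assumption: one group per participant + sample type + nucleic acid.
  group <- paste(substr(barcode, 1, 15), ifelse(is_rna, "RNA", "DNA"))

  # Within a group: best analyte rank, then latest plate, then native DNA,
  # then the larger barcode. "radix" gives locale-independent ordering.
  ord <- order(group, rank, plate, native_first, barcode,
    decreasing = c(FALSE, FALSE, TRUE, FALSE, TRUE),
    method = "radix"
  )
  sorted <- barcode[ord]
  sorted[!duplicated(group[ord])] # keep the first aliquot of each group
}

barcodes <- c(
  "TCGA-AA-0001-01A-01R-0100-07", # RNA, R analyte
  "TCGA-AA-0001-01A-01T-0200-07", # RNA, T analyte, later plate
  "TCGA-AA-0001-01A-01H-0050-07", # RNA, H analyte
  "TCGA-BB-0002-01A-01D-0100-01", # native DNA
  "TCGA-BB-0002-01A-01W-0200-01", # amplified DNA, higher plate
  "TCGA-CC-0003-01A-01D-0300-01", # native DNA
  "TCGA-CC-0003-01A-01G-0300-01"  # amplified DNA, same plate
)
kept <- pick_aliquots(barcodes)
```

In this made-up input, the H aliquot is kept for the first participant despite its earlier plate, the W aliquot is kept for the second because its plate is higher, and the D aliquot is kept for the third because the plates are equal.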

Value

A tibble with duplicated barcodes removed.

Author(s)

Yun <yunyunpp96@outlook.com>

References

Replicate Samples - GDAC Firehose


Yunuuuu/yjtools documentation built on Jan. 29, 2024, 5:30 a.m.