varselectVenn: Using Venn Diagrams to Compare High-Importance Variables...

Description

This function generates a Venn diagram using RNA-seq data from The Cancer Genome Atlas (TCGA) database. Users specify which cancer types to include (varselectVenn currently supports 2- and 3-set Venn diagrams), as well as the target variable to predict. For each cancer type, the user-specified data is processed by a random forest classifier, and the variables (genes) are ranked by their influence on the model’s predictive power. Users specify how many of the high-importance genes to retain, and a Venn diagram is generated that shows which genes are of high importance in each of the cancer types and which are shared between them. The function also returns a list object that specifies which genes were retained for each cancer type, as well as which genes were at the intersection of all specified cancer types.

Usage

varselectVenn(types, num_var, target)

Arguments

types

A character vector of TCGA-supported acronyms that designate the types of cancer to compare. varselectVenn currently supports ACC, BLCA, KIRC, KIRP, LIHC, and THCA. Supply two or three acronyms, since only 2- and 3-set diagrams are currently supported.

num_var

A numeric value that specifies how many of the high-importance variables to retain. The random forest classifier ranks each of the 20501 gene features by its impact on the model. Setting num_var to 100, for example, retains the 100 most important variables for each cancer type for inclusion in the Venn diagram.

target

The variable to be predicted. varselectVenn currently supports the tumor’s “pathologicstage” (which attempts to distinguish stage I tumors from stage II, III, and IV tumors), the patient’s “vitalstatus” (a binary for whether the patient is alive or not), and the patient’s “gender”.
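The stage I versus stage II, III, and IV split described for the pathologic stage target can be illustrated with a minimal base R sketch. This is not the package's own code: the label strings (such as "Stage IIA"), the object names, and the two level names are all assumptions made for illustration.

```r
# Hypothetical stage labels; the exact strings used in the TCGA clinical
# data are an assumption here.
stage <- c("Stage I", "Stage IIA", "Stage III", "Stage IV", "Stage IB", NA)

# Drop the "Stage " prefix and any sub-stage letter, keeping the Roman numeral.
numeral <- sub("^Stage ([IV]+)[A-C]?$", "\\1", stage)

# Stage I versus every later stage; missing stages stay NA.
stage_binary <- factor(
  ifelse(numeral == "I", "stage_I", "stage_II_to_IV"),
  levels = c("stage_I", "stage_II_to_IV")
)

table(stage_binary, useNA = "ifany")
```

Collapsing the stages this way turns a multi-level label into a two-class problem, which is the kind of target a classifier such as a random forest can be trained on directly.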

Details

Cancer type acronyms: ACC Adrenocortical Carcinoma, BLCA Bladder Urothelial Carcinoma, KIRC Kidney Renal Clear Cell Carcinoma, KIRP Kidney Renal Papillary Cell Carcinoma, LIHC Liver Hepatocellular Carcinoma, THCA Thyroid Carcinoma.
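The retain-and-intersect step can be sketched in base R as follows. This is a rough illustration under stated assumptions, not the function's actual implementation: the importance scores are simulated with runif() rather than taken from a random forest, and top_genes, importance, retained, and shared are hypothetical names. The shape of the returned list is also an assumption.

```r
# Hypothetical importance scores: one named numeric vector per cancer type.
# A real run would take these from a random forest; they are simulated here
# so that the sketch needs nothing beyond base R.
set.seed(1)
genes <- paste0("gene", 1:50)
importance <- list(
  KIRP = setNames(runif(50), genes),
  KIRC = setNames(runif(50), genes),
  LIHC = setNames(runif(50), genes)
)

# Return the names of the num_var highest-scoring genes for one cancer type.
top_genes <- function(scores, num_var) {
  keep <- seq_len(min(num_var, length(scores)))
  names(sort(scores, decreasing = TRUE))[keep]
}

num_var <- 10
retained <- lapply(importance, top_genes, num_var = num_var)

# Genes retained for every cancer type (the centre region of the diagram).
shared <- Reduce(intersect, retained)

# One element per cancer type, plus the overall intersection.
result <- c(retained, list(intersection = shared))
str(result)
```

Because each cancer type keeps its own top num_var genes independently, the intersection can be empty when the rankings have little in common.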

Author(s)

Jacob Blamer, jwilliamblamer@gmail.com

Examples

varselectVenn(c("KIRP", "KIRC", "LIHC"), 100, "vitalstatus")
varselectVenn(c("BLCA", "THCA"), 75, "pathologicstage")

jblam251/tcgaRNAML documentation built on June 8, 2019, 2:32 p.m.