require("knitr")
opts_chunk$set(fig.align="center", fig.width=6, fig.height=6, dpi=96)

Overview

Patient-derived cell lines are often used in pre-clinical cancer research, but some cell lines are too different from tumors to be good models. Comparison of genomic and expression profiles can guide the choice of pre-clinical models, but typically not all features are equally relevant. We present TumorComparer, a computational method for comparing cellular profiles with higher weights on functional features of interest. In this pan-cancer application, we compare ∼600 cell lines and ∼8,000 tumor samples of 24 cancer types, using weights to emphasize known oncogenic alterations. We characterize the similarity of cell lines and tumors within and across cancers by using multiple data types and rank cell lines by their inferred quality as representative models. Beyond the assessment of cell lines, the weighted similarity approach is adaptable to patient stratification in clinical trials and personalized medicine.

Additional details are in the TumorComparer publication (DOI: 10.1016/j.crmeth.2021.100039).

Getting started

library(tumorcomparer)

A list of all accessible vignettes and methods is available with the following command.

help.search("tumorcomparer")

Load Data

This example uses the TCGA Pan-Cancer dataset for rectal adenocarcinoma (READ).

# NOTE: Example files are embedded in the package and are accessible with system.file()
tumor_mut_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_mut.txt", package="tumorcomparer")
tumor_cna_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_cna.txt", package="tumorcomparer")
tumor_exp_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_exp.txt", package="tumorcomparer")

cell_line_mut_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_mut.txt", package="tumorcomparer")
cell_line_cna_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_cna.txt", package="tumorcomparer")
cell_line_exp_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_exp.txt", package="tumorcomparer")

known_cancer_gene_weights_mut_file <- system.file("extdata", "READ_data_for_running_TC", "default_weights_for_known_cancer_genes_mut.txt", package="tumorcomparer")
known_cancer_gene_weights_cna_file <- system.file("extdata", "READ_data_for_running_TC", "default_weights_for_known_cancer_genes_cna.txt", package="tumorcomparer")
known_cancer_gene_weights_exp_file <- system.file("extdata", "READ_data_for_running_TC", "default_weights_for_known_cancer_genes_exp.txt", package="tumorcomparer")

cancer_specific_gene_weights_mut_file <- system.file("extdata", "READ_data_for_running_TC", "Genes_and_weights_mut.txt", package="tumorcomparer")
cancer_specific_gene_weights_cna_file <- system.file("extdata", "READ_data_for_running_TC", "Genes_and_weights_cna.txt", package="tumorcomparer")
cancer_specific_gene_weights_exp_file <- system.file("extdata", "READ_data_for_running_TC", "Genes_and_weights_exp.txt", package="tumorcomparer")
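Because system.file() returns an empty string when a file cannot be found, it can be worth confirming that every path resolved before running the comparison. The sketch below uses a hypothetical helper, find_missing_inputs(), which is not part of tumorcomparer, and demonstrates it on a temporary file so that it runs on its own.

```r
# Hypothetical helper (not part of tumorcomparer): return the names of inputs
# whose paths are empty or do not point to an existing file.
find_missing_inputs <- function(paths) {
  missing <- !nzchar(paths) | !file.exists(paths)
  names(paths)[missing]
}

# Self-contained demonstration: one real temporary file and one empty path,
# which is what system.file() returns when nothing is found.
demo_file <- tempfile(fileext = ".txt")
writeLines("placeholder", demo_file)
demo_paths <- c(tumor_mut_file = demo_file, tumor_cna_file = "")

find_missing_inputs(demo_paths)
```

Applied to the variables defined above, for example find_missing_inputs(c(tumor_mut_file = tumor_mut_file, tumor_cna_file = tumor_cna_file)), an empty character vector would indicate that all files were located.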

Run Comparison

Each data type is processed separately, with its own weights and input files. Currently, mutation, copy number, and expression data are supported. NOTE: Not all input data types need to be included.

comparison_result <- run_comparison(
  available_data_types = c("mut", "cna", "exp"),
  mut_data_type_weight = 1/3,
  cna_data_type_weight = 1/3,
  exp_data_type_weight = 1/3,
  cna_default_weight = 0.01,
  mut_default_weight = 0.01,
  exp_default_weight = 0.01,
  tumor_mut_file = tumor_mut_file,
  tumor_cna_file = tumor_cna_file,
  tumor_exp_file = tumor_exp_file,
  cell_line_mut_file = cell_line_mut_file,
  cell_line_cna_file = cell_line_cna_file,
  cell_line_exp_file = cell_line_exp_file,
  known_cancer_gene_weights_mut_file = known_cancer_gene_weights_mut_file,
  known_cancer_gene_weights_cna_file = known_cancer_gene_weights_cna_file,
  known_cancer_gene_weights_exp_file = known_cancer_gene_weights_exp_file,
  cancer_specific_gene_weights_mut_file = cancer_specific_gene_weights_mut_file,
  cancer_specific_gene_weights_cna_file = cancer_specific_gene_weights_cna_file,
  cancer_specific_gene_weights_exp_file = cancer_specific_gene_weights_exp_file)

# See returned outputs
names(comparison_result)
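The three *_data_type_weight arguments set the relative contribution of each data type. As a toy illustration of that idea, the sketch below assumes that each data type yields a cell-line-by-tumor similarity matrix and that the combined score is a weighted average of those matrices. This is an assumption made for illustration and not necessarily how run_comparison() combines data types internally; all values and sample names are invented.

```r
# Toy similarity matrices (rows: cell lines, columns: tumors); values are invented.
set.seed(42)
make_sim <- function() {
  matrix(runif(6), nrow = 2,
         dimnames = list(c("cell_line_A", "cell_line_B"),
                         c("tumor_1", "tumor_2", "tumor_3")))
}
sims <- list(mut = make_sim(), cna = make_sim(), exp = make_sim())

# Data-type weights, mirroring the equal 1/3 weights used above.
weights <- c(mut = 1/3, cna = 1/3, exp = 1/3)

# Normalise in case the weights do not sum to 1, then take the weighted sum.
weights <- weights / sum(weights)
weighted <- Map(function(s, w) s * w, sims, weights[names(sims)])
combined <- Reduce(`+`, weighted)

combined
```

Raising one weight relative to the others (for example, emphasizing "mut") shifts the combined matrix toward that data type's similarities.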

Plot Multidimensional Scaling (MDS) Results

The MDS plot shows a two-dimensional representation of distances between cell lines and tumor samples from the input.

NOTE: As with any dimension reduction technique, some information can be lost, and results should be interpreted carefully. To aid users, a stress value is calculated to indicate how well the two-dimensional representation fits the original data (https://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/mds/stress).

Example Stress Values

Stress values are pre-calculated by tumorcomparer and stored in the comparison result.

comparison_result$isomdsfit$stress
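To show where such a stress value can come from, the sketch below runs non-metric MDS on random toy data with MASS::isoMDS(). The field name isomdsfit suggests a fit of this kind, but that is an assumption here; the toy data and variable names are invented. isoMDS() reports stress as a percentage, and lower values indicate a closer fit of the two-dimensional layout to the original distances.

```r
library(MASS)  # provides isoMDS(); MASS ships with standard R installations

# Toy data: 20 points in 3 dimensions (random values, for illustration only).
set.seed(1)
toy_data <- matrix(rnorm(60), nrow = 20)
toy_dist <- dist(toy_data)

# Non-metric MDS into 2 dimensions; trace = FALSE suppresses progress output.
toy_fit <- isoMDS(toy_dist, k = 2, trace = FALSE)

toy_fit$stress        # final stress, in percent
head(toy_fit$points)  # 2-D coordinates, one row per input point
```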

MDS Plot

p <- plot_mds(comparison_result,
             trim_cell_line_names=FALSE,
             tumor_color="blue",
             cell_line_color="orange",
             tumor_shape=20,
             cell_line_shape=17)
p

Make Balloon Plot

plot_data <- make_balloon_plot_data_from_comparison_result(comparison_result)

p <- plot_balloon_plot(plot_data, "Mean Similarity to Tumors")
p
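The plot title above refers to the mean similarity of each cell line to the tumors. As a toy illustration of how such a summary could be used to rank cell lines, the sketch below assumes a cell-line-by-tumor similarity matrix; the matrix layout, names, and values are invented and do not reflect the structure of plot_data.

```r
# Toy cell-line-by-tumor similarity matrix; values are invented for illustration.
sim <- matrix(c(0.9, 0.8, 0.7,
                0.2, 0.4, 0.3,
                0.6, 0.5, 0.7),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("cell_line_A", "cell_line_B", "cell_line_C"),
                              c("tumor_1", "tumor_2", "tumor_3")))

# Mean similarity of each cell line to all tumors, highest first.
mean_similarity <- rowMeans(sim)
ranking <- sort(mean_similarity, decreasing = TRUE)

ranking
```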

Distribution Plots

The plot_joyplot function shows the distribution of feature-weighted similarities of cancer cell lines to tumor samples.

plot_joyplot(comparison_result)

Subsetting Pre-Compiled TumorComparer Study Data

The data used in the 2021 TumorComparer publication (DOI: 10.1016/j.crmeth.2021.100039) are publicly available. Users can subset these data, for example by TCGA sample IDs or by custom gene sets, to run customized analyses. Example code is in the file "test_tumorcomparer.R", in the test case "check_zenodo_usage".
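The general idea of subsetting by samples and genes can be sketched in base R. The example below assumes a matrix with genes in rows and samples in columns, which may not match the layout of the published data; the sample IDs and values are invented placeholders, and the referenced test case remains the authoritative example.

```r
# Toy gene-by-sample matrix; sample IDs and values are invented placeholders.
expr <- matrix(1:12, nrow = 4,
               dimnames = list(c("TP53", "KRAS", "APC", "BRAF"),
                               c("sample_1", "sample_2", "sample_3")))

genes_of_interest <- c("KRAS", "APC", "EGFR")   # EGFR is absent from the toy matrix
samples_of_interest <- c("sample_1", "sample_3")

# intersect() keeps only IDs that are present, avoiding subscript errors.
keep_genes <- intersect(genes_of_interest, rownames(expr))
keep_samples <- intersect(samples_of_interest, colnames(expr))

# drop = FALSE keeps the result a matrix even if only one row or column remains.
subset_expr <- expr[keep_genes, keep_samples, drop = FALSE]

subset_expr
```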

Session Information

sessionInfo()


cannin/tumorcomparer documentation built on Feb. 7, 2023, 3:13 p.m.