In cannin/tumorcomparer: Compare Patient Samples to Cell Line Models Using Molecular Data and Weighted Similarity

require("knitr")
opts_chunk$set(fig.align="center", fig.width=6, fig.height=6, dpi=96)

Overview

Patient-derived cell lines are often used in pre-clinical cancer research, but some cell lines are too different from tumors to be good models. Comparison of genomic and expression profiles can guide the choice of pre-clinical models, but typically not all features are equally relevant. We present TumorComparer, a computational method for comparing cellular profiles with higher weights on functional features of interest. In this pan-cancer application, we compare ∼600 cell lines and ∼8,000 tumor samples of 24 cancer types, using weights to emphasize known oncogenic alterations. We characterize the similarity of cell lines and tumors within and across cancers using multiple data types, and we rank cell lines by their inferred quality as representative models. Beyond the assessment of cell lines, the weighted similarity approach is adaptable to patient stratification in clinical trials and personalized medicine.

Additional details: 10.1016/j.crmeth.2021.100039

Getting started

library(tumorcomparer)

A list of all accessible vignettes and methods is available with the following command.

help.search("tumorcomparer")

Load Data

This example uses the TCGA Pan-Cancer dataset for rectal adenocarcinoma (READ).

# NOTE: Example files are embedded in the package and are accessible with system.file()
tumor_mut_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_mut.txt", package="tumorcomparer")
tumor_cna_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_cna.txt", package="tumorcomparer")
tumor_exp_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_exp.txt", package="tumorcomparer")

cell_line_mut_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_mut.txt", package="tumorcomparer")
cell_line_cna_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_cna.txt", package="tumorcomparer")
cell_line_exp_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_exp.txt", package="tumorcomparer")

known_cancer_gene_weights_mut_file <- system.file("extdata", "READ_data_for_running_TC", "default_weights_for_known_cancer_genes_mut.txt", package="tumorcomparer")
known_cancer_gene_weights_cna_file <- system.file("extdata", "READ_data_for_running_TC", "default_weights_for_known_cancer_genes_cna.txt", package="tumorcomparer")
known_cancer_gene_weights_exp_file <- system.file("extdata", "READ_data_for_running_TC", "default_weights_for_known_cancer_genes_exp.txt", package="tumorcomparer")

cancer_specific_gene_weights_mut_file <- system.file("extdata", "READ_data_for_running_TC", "Genes_and_weights_mut.txt", package="tumorcomparer")
cancer_specific_gene_weights_cna_file <- system.file("extdata", "READ_data_for_running_TC", "Genes_and_weights_cna.txt", package="tumorcomparer")
cancer_specific_gene_weights_exp_file <- system.file("extdata", "READ_data_for_running_TC", "Genes_and_weights_exp.txt", package="tumorcomparer")
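
Because system.file() returns an empty string when a file cannot be found, it can be worth checking the paths before running a comparison. The following is a minimal sketch in base R; the helper find_missing_inputs is hypothetical (it is not part of tumorcomparer), and the demonstration uses a temporary file rather than the package data.

```r
# Hypothetical helper (not part of tumorcomparer): report which input
# paths are unusable before starting a comparison.
find_missing_inputs <- function(paths) {
  # system.file() returns "" when a file is not found, so test for that
  # as well as for paths that do not exist on disk.
  ok <- nzchar(paths) & file.exists(paths)
  names(paths)[!ok]
}

# Toy demonstration with one real temporary file and one empty path
existing_file <- tempfile(fileext = ".txt")
writeLines("placeholder", existing_file)

input_files <- c(tumor_mut_file = existing_file, tumor_cna_file = "")
find_missing_inputs(input_files)  # returns "tumor_cna_file"
```

In practice, the named vector would hold the paths defined above, for example c(tumor_mut_file = tumor_mut_file, tumor_cna_file = tumor_cna_file).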

Run Comparison

Each data type is processed separately, with its own weights and inputs. Currently, mutation, copy number, and expression data types are supported. NOTE: Inclusion of all input data types is not required.
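
To illustrate how per-data-type weights such as mut_data_type_weight could enter a combined score, the toy sketch below averages three made-up similarity matrices with weights of 1/3 each. This is an assumption for illustration only: the matrices, the helper name combine_similarities, and the weighted-average rule are hypothetical and are not taken from the package's implementation.

```r
# Hypothetical helper: weighted average of per-data-type similarity matrices.
combine_similarities <- function(similarities, weights) {
  stopifnot(length(similarities) == length(weights),
            all(names(similarities) == names(weights)))
  weights <- weights / sum(weights)  # renormalize so the weights sum to 1
  combined <- similarities[[1]] * 0  # zero matrix with the same shape and names
  for (data_type in names(similarities)) {
    combined <- combined + weights[[data_type]] * similarities[[data_type]]
  }
  combined
}

# Made-up similarities: 2 cell lines (rows) x 2 tumors (columns)
sample_names <- list(c("cell_line_A", "cell_line_B"), c("tumor_1", "tumor_2"))
similarities <- list(
  mut = matrix(c(0.9, 0.2, 0.8, 0.1), nrow = 2, dimnames = sample_names),
  cna = matrix(c(0.6, 0.5, 0.6, 0.2), nrow = 2, dimnames = sample_names),
  exp = matrix(c(0.3, 0.2, 0.4, 0.3), nrow = 2, dimnames = sample_names)
)
weights <- c(mut = 1/3, cna = 1/3, exp = 1/3)

combined <- combine_similarities(similarities, weights)
rowMeans(combined)  # mean similarity of each cell line to the tumors
```

Renormalizing the weights inside the helper means the same sketch still works if one data type is left out of both lists.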

comparison_result <- run_comparison(
               available_data_types=c("mut", "cna", "exp"), 
               mut_data_type_weight = 1/3,
               cna_data_type_weight = 1/3,
               exp_data_type_weight = 1/3,
               cna_default_weight=0.01, 
               mut_default_weight=0.01,
               exp_default_weight=0.01,
               tumor_mut_file=tumor_mut_file, 
               tumor_cna_file=tumor_cna_file, 
               tumor_exp_file=tumor_exp_file, 
               cell_line_mut_file=cell_line_mut_file, 
               cell_line_cna_file=cell_line_cna_file, 
               cell_line_exp_file=cell_line_exp_file, 
               known_cancer_gene_weights_mut_file=known_cancer_gene_weights_mut_file, 
               known_cancer_gene_weights_cna_file=known_cancer_gene_weights_cna_file, 
               known_cancer_gene_weights_exp_file=known_cancer_gene_weights_exp_file, 
               cancer_specific_gene_weights_mut_file=cancer_specific_gene_weights_mut_file, 
               cancer_specific_gene_weights_cna_file=cancer_specific_gene_weights_cna_file, 
               cancer_specific_gene_weights_exp_file=cancer_specific_gene_weights_exp_file)

# See returned outputs
names(comparison_result)

Plot Multidimensional Scaling (MDS) Results

The MDS plot shows a two-dimensional representation of distances between cell lines and tumor samples from the input.

NOTE: As with any dimension reduction technique, some information can be lost, and results should be interpreted carefully. To aid users, a stress value is calculated to measure how well the two-dimensional representation fits the original data (https://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/mds/stress). Stress values can be interpreted as follows:

20% or above: Very Poor
10%-19.9%: Fair
5%-9.9%: Good
2.5%-4.9%: Excellent
0%-2.4%: Near Perfect Fit
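
The bands above can be applied programmatically. The following is a minimal sketch in base R, assuming the stress value is expressed in percent; the helper classify_stress is hypothetical and not part of tumorcomparer.

```r
# Hypothetical helper (not part of tumorcomparer): map stress values,
# given in percent, to the interpretation bands listed above.
classify_stress <- function(stress_percent) {
  stopifnot(is.numeric(stress_percent), all(stress_percent >= 0))
  bands <- cut(stress_percent,
               breaks = c(0, 2.5, 5, 10, 20, Inf),
               labels = c("Near Perfect Fit", "Excellent", "Good",
                          "Fair", "Very Poor"),
               right = FALSE,          # intervals are [lower, upper)
               include.lowest = TRUE)  # closes the last interval at Inf
  as.character(bands)
}

classify_stress(c(1.2, 3.0, 7.5, 15, 20))
```

With right = FALSE, a value that falls exactly on a boundary (for example 20) is assigned to the higher band, matching "20% or above".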

Example Stress Values

Stress values are pre-calculated by tumorcomparer and stored in the comparison result.

comparison_result$isomdsfit$stress
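
For readers who want to see where such a stress value comes from, here is a minimal sketch using isoMDS() from the MASS package on made-up data. It assumes MASS is installed. Judging only from the field name isomdsfit, the fit stored by tumorcomparer may have the same shape as the isoMDS() return value, but that link is an assumption and is not confirmed here.

```r
library(MASS)  # provides isoMDS()

set.seed(1)
# Made-up profiles: 10 samples x 5 features
toy_profiles <- matrix(rnorm(50), nrow = 10,
                       dimnames = list(paste0("sample_", 1:10), NULL))
toy_dist <- dist(toy_profiles)  # pairwise Euclidean distances

# Non-metric MDS into two dimensions; trace = FALSE silences progress output
toy_fit <- isoMDS(toy_dist, k = 2, trace = FALSE)

dim(toy_fit$points)  # 10 rows (samples) x 2 columns (coordinates)
toy_fit$stress       # stress of the two-dimensional fit, in percent
```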

MDS Plot

p <- plot_mds(comparison_result,
             trim_cell_line_names=FALSE,
             tumor_color="blue",
             cell_line_color="orange",
             tumor_shape=20,
             cell_line_shape=17)
p

Make Balloon Plot

plot_data <- make_balloon_plot_data_from_comparison_result(comparison_result)

p <- plot_balloon_plot(plot_data, "Mean Similarity to Tumors")
p

Distribution Plots

The plot_joyplot function shows the distribution of feature-weighted similarities of cancer cell lines to tumor samples.

plot_joyplot(comparison_result)

Subsetting Pre-Compiled TumorComparer Study Data

The data used in the 2021 TumorComparer publication (DOI: 10.1016/j.crmeth.2021.100039) are publicly available. Users can subset these data (e.g., by TCGA sample IDs or custom gene sets) to run customized analyses. Example code is in the file "test_tumorcomparer.R", in the test case "check_zenodo_usage".

Session Information

sessionInfo()

cannin/tumorcomparer documentation built on Feb. 7, 2023, 3:13 p.m.
