require("knitr")
opts_chunk$set(fig.align="center", fig.width=6, fig.height=6, dpi=96)
Patient-derived cell lines are often used in pre-clinical cancer research, but some cell lines are too different from tumors to be good models. Comparison of genomic and expression profiles can guide the choice of pre-clinical models, but typically not all features are equally relevant. We present TumorComparer, a computational method for comparing cellular profiles with higher weights on functional features of interest. In this pan-cancer application, we compare ∼600 cell lines and ∼8,000 tumor samples of 24 cancer types, using weights to emphasize known oncogenic alterations. We characterize the similarity of cell lines and tumors within and across cancers by using multiple data types and rank cell lines by their inferred quality as representative models. Beyond the assessment of cell lines, the weighted similarity approach is adaptable to patient stratification in clinical trials and personalized medicine.
Additional details: 10.1016/j.crmeth.2021.100039
library(tumorcomparer)
A list of all accessible vignettes and methods is available with the following command.
help.search("tumorcomparer")
This example uses the TCGA Pan-Cancer dataset for rectal adenocarcinoma (READ).
# NOTE: Example files are embedded in the package and are accessible with system.file()
tumor_mut_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_mut.txt", package="tumorcomparer")
tumor_cna_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_cna.txt", package="tumorcomparer")
tumor_exp_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_exp.txt", package="tumorcomparer")

cell_line_mut_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_mut.txt", package="tumorcomparer")
cell_line_cna_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_cna.txt", package="tumorcomparer")
cell_line_exp_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_exp.txt", package="tumorcomparer")

known_cancer_gene_weights_mut_file <- system.file("extdata", "READ_data_for_running_TC", "default_weights_for_known_cancer_genes_mut.txt", package="tumorcomparer")
known_cancer_gene_weights_cna_file <- system.file("extdata", "READ_data_for_running_TC", "default_weights_for_known_cancer_genes_cna.txt", package="tumorcomparer")
known_cancer_gene_weights_exp_file <- system.file("extdata", "READ_data_for_running_TC", "default_weights_for_known_cancer_genes_exp.txt", package="tumorcomparer")

cancer_specific_gene_weights_mut_file <- system.file("extdata", "READ_data_for_running_TC", "Genes_and_weights_mut.txt", package="tumorcomparer")
cancer_specific_gene_weights_cna_file <- system.file("extdata", "READ_data_for_running_TC", "Genes_and_weights_cna.txt", package="tumorcomparer")
cancer_specific_gene_weights_exp_file <- system.file("extdata", "READ_data_for_running_TC", "Genes_and_weights_exp.txt", package="tumorcomparer")
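Because system.file() returns an empty string when a file cannot be found, a missing input can go unnoticed until the comparison fails. The sketch below shows one way to check paths up front. The helper check_input_files() is hypothetical and not part of tumorcomparer; the usage section only runs if the package is installed.

```r
# Hypothetical helper (not part of tumorcomparer): stop early if any input
# path is empty or does not point to an existing file.
check_input_files <- function(paths) {
  ok <- nzchar(paths) & file.exists(paths)
  if (!all(ok)) {
    labels <- if (is.null(names(paths))) paths else names(paths)
    stop("Missing input file(s): ", paste(labels[!ok], collapse = ", "))
  }
  invisible(TRUE)
}

# Usage with two of the example files; extend the named vector as needed.
if (requireNamespace("tumorcomparer", quietly = TRUE)) {
  check_input_files(c(
    tumor_mut = system.file("extdata", "READ_data_for_running_TC",
                            "tumor_mut.txt", package = "tumorcomparer"),
    cell_line_mut = system.file("extdata", "READ_data_for_running_TC",
                                "cell_line_mut.txt", package = "tumorcomparer")
  ))
}
```

Naming the vector elements keeps the error message readable, since an empty path on its own would not say which input is missing.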
Each data type is processed separately, with its own weights and inputs. Currently, mutation, copy number, and expression data types are supported. NOTE: Inclusion of all input data types is not required.
comparison_result <- run_comparison(
  available_data_types=c("mut", "cna", "exp"),
  mut_data_type_weight=1/3,
  cna_data_type_weight=1/3,
  exp_data_type_weight=1/3,
  cna_default_weight=0.01,
  mut_default_weight=0.01,
  exp_default_weight=0.01,
  tumor_mut_file=tumor_mut_file,
  tumor_cna_file=tumor_cna_file,
  tumor_exp_file=tumor_exp_file,
  cell_line_mut_file=cell_line_mut_file,
  cell_line_cna_file=cell_line_cna_file,
  cell_line_exp_file=cell_line_exp_file,
  known_cancer_gene_weights_mut_file=known_cancer_gene_weights_mut_file,
  known_cancer_gene_weights_cna_file=known_cancer_gene_weights_cna_file,
  known_cancer_gene_weights_exp_file=known_cancer_gene_weights_exp_file,
  cancer_specific_gene_weights_mut_file=cancer_specific_gene_weights_mut_file,
  cancer_specific_gene_weights_cna_file=cancer_specific_gene_weights_cna_file,
  cancer_specific_gene_weights_exp_file=cancer_specific_gene_weights_exp_file)

# See returned outputs
names(comparison_result)
The MDS plot shows a two-dimensional representation of distances between cell lines and tumor samples from the input.
NOTE: As with any dimension reduction technique, some information can be lost, and results should be interpreted carefully. To aid users, a stress value is calculated to indicate how well the two-dimensional representation fits the original data; lower values indicate a better fit (https://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/mds/stress).
Stress values are pre-calculated within tumorcomparer:
comparison_result$isomdsfit$stress
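To make the stress value more concrete, the following sketch computes one on toy data. It assumes, based only on the field name isomdsfit, that the fit resembles the output of MASS::isoMDS(); the random matrix and its dimensions are illustrative and unrelated to the READ data.

```r
library(MASS)  # recommended package shipped with standard R installations

set.seed(1)
# Toy data: 20 hypothetical samples with 10 features each
x <- matrix(rnorm(200), nrow = 20)
d <- dist(x)  # Euclidean distances between samples

# Non-metric (Kruskal) MDS in two dimensions
fit <- isoMDS(d, k = 2, trace = FALSE)

dim(fit$points)  # one row of 2-D coordinates per sample
fit$stress       # stress of the final configuration; isoMDS reports it in percent
```

Random, unstructured data such as this is generally hard to embed in two dimensions, so its stress serves as a rough contrast for values obtained from real comparisons.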
p <- plot_mds(comparison_result,
              trim_cell_line_names=FALSE,
              tumor_color="blue",
              cell_line_color="orange",
              tumor_shape=20,
              cell_line_shape=17)
p
plot_data <- make_balloon_plot_data_from_comparison_result(comparison_result)
p <- plot_balloon_plot(plot_data, "Mean Similarity to Tumors")
p
The plot_joyplot function shows the distribution of feature-weighted similarities of cancer cell lines to tumor samples.
plot_joyplot(comparison_result)
The data used in the 2021 TumorComparer publication (DOI: 10.1016/j.crmeth.2021.100039) are publicly available. Examples of subsetting these data for customized analyses (e.g., by TCGA sample IDs or custom gene sets) are provided in the file "test_tumorcomparer.R", in the test case "check_zenodo_usage".
sessionInfo()