run_comparison_config_list: Run a comparison between two cohorts (e.g. cell lines and...

View source: R/run_comparison_config_list.R

run_comparison_config_list    R Documentation

Run a comparison between two cohorts (e.g. cell lines and tumors) based on specified data types by configuration list

Description

Run a comparison between two cohorts (e.g. cell lines and tumors) based on specified data types by configuration list

Usage

run_comparison_config_list(
  config_list,
  gene_list = NULL,
  remove_errored_dataset_comparisons = FALSE,
  run_mds = TRUE,
  verbose = FALSE
)

Arguments

config_list

a list which specifies the comparison datasets and their parameters for each data type (mutation, CNV, expression, etc.). The structure of config_list is list(mut = list(), cnv = list(), exp = list(), ...). The list for each data type must contain the following elements: list(dataset_name, data_type_weight, default_weight, tumor_file, cell_line_file). The two gene weight files are optional and default to NULL.

  • dataset_name: short name of comparison data type

  • data_type_weight: a numeric weight for the data type (NOTE: data type weights must sum to 1)

  • default_weight: default (background) weight for specified data type (EXP) (DEFAULT: 0.01)

  • tumor_file: path to a tab-delimited table file with HGNC gene symbols as row names and tumor samples as columns

  • cell_line_file: path to a tab-delimited table file with HGNC gene symbols as row names and cell lines as columns

  • known_cancer_gene_weights_file: path to a file with weights for genes known to be recurrently altered in cancer (e.g. recurrently mutated genes in TCGA pan-cancer analyses). A two-column tab-delimited file - the first column has the gene names and the second column specifies the weights (Default: NULL)

  • cancer_specific_gene_weights_file: path to a file with weights for cancer-specific set of recurrently altered genes. A tab-delimited file - the first column has the gene names, and the second column specifies the weights (Default: NULL)
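The requirements listed above (required elements present for every data type, data type weights summing to 1) can be checked before running a comparison. The following is a minimal sketch in base R, not part of tumorcomparer: the helper name check_config_list, the tolerance tol, and the file names in toy_config are hypothetical and used only for illustration.

```r
# Hypothetical helper (not part of tumorcomparer): sanity-check a config_list
check_config_list <- function(config_list, tol = 1e-8) {
  required <- c("dataset_name", "data_type_weight", "default_weight",
                "tumor_file", "cell_line_file")

  # Every data type entry must provide all required elements
  for (data_type in names(config_list)) {
    missing_fields <- setdiff(required, names(config_list[[data_type]]))
    if (length(missing_fields) > 0) {
      stop("Data type '", data_type, "' is missing: ",
           paste(missing_fields, collapse = ", "))
    }
  }

  # Data type weights must sum to 1 (up to floating-point tolerance)
  weights <- vapply(config_list, function(x) x$data_type_weight, numeric(1))
  if (abs(sum(weights) - 1) > tol) {
    stop("data_type_weight values must sum to 1, got ", sum(weights))
  }

  invisible(TRUE)
}

# Toy configuration; the file names are placeholders and are not read here
toy_config <- list(
  mut = list(dataset_name = "mut", data_type_weight = 0.5, default_weight = 0.01,
             tumor_file = "tumor_mut.txt", cell_line_file = "cell_line_mut.txt"),
  exp = list(dataset_name = "exp", data_type_weight = 0.5, default_weight = 0.01,
             tumor_file = "tumor_exp.txt", cell_line_file = "cell_line_exp.txt")
)

check_config_list(toy_config)
```

The tolerance matters because weights such as 1/3 do not add up to exactly 1 in floating-point arithmetic.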

gene_list

a vector of HGNC gene symbols to run comparison only for the specified genes (Default: NULL)

remove_errored_dataset_comparisons

a boolean; when TRUE, skips the data types which cannot be compared for technical reasons (not enough genes to compare, or data containing only 0 values) (Default: FALSE)

run_mds

a boolean, whether to run multidimensional scaling (MDS) on dataset (Default: TRUE)

verbose

a boolean, whether to show debugging information (Default: FALSE)

Value

a list with the following items:

  • dist_mat: a matrix of combined pairwise distances for all data types

  • dist_mat_by_data_type: a list of pairwise distances for each data type

  • isomdsfit: a two-column (2-dimensional) fit of the combined distances for all data types, obtained via multidimensional scaling (MDS) using the isoMDS function

  • isomdsfit_by_data_type: a two-column (2-dimensional) fit of the distances for each data type, obtained via multidimensional scaling (MDS) using the isoMDS function

  • composite_mat: the composite matrix (see Details)

  • cell_line_ids: a vector of cell line IDs/names that have all data types

  • tumor_ids: a vector of tumor IDs that have all data types

  • calculated_data_types: character vector of the dataset names which were analysed by the comparison function
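As an illustration of how a distance matrix relates to a two-dimensional MDS fit, the sketch below runs isoMDS from the MASS package on toy data and then finds the closest cell line for each tumor. It is not output from tumorcomparer: the sample IDs and toy_dist_mat are made up, and it assumes that a matrix like dist_mat is square and symmetric, with both cell lines and tumors as row and column names.

```r
library(MASS)  # provides isoMDS

set.seed(1)

# Toy stand-in for dist_mat (assumption: square, symmetric, named by sample ID)
cell_line_ids <- c("CL1", "CL2", "CL3")
tumor_ids <- c("T1", "T2", "T3")
sample_ids <- c(cell_line_ids, tumor_ids)
coords <- matrix(rnorm(length(sample_ids) * 4), nrow = length(sample_ids),
                 dimnames = list(sample_ids, NULL))
toy_dist_mat <- as.matrix(dist(coords))

# Non-metric MDS: reduce the pairwise distances to two dimensions
fit <- isoMDS(toy_dist_mat, k = 2, trace = FALSE)
fit$points  # one row per sample, two columns

# For each tumor, pick the cell line with the smallest distance
tumor_to_cell_line <- toy_dist_mat[tumor_ids, cell_line_ids, drop = FALSE]
closest <- colnames(tumor_to_cell_line)[apply(tumor_to_cell_line, 1, which.min)]
names(closest) <- rownames(tumor_to_cell_line)
closest
```

Subsetting by tumor_ids and cell_line_ids mirrors the two ID vectors in the returned list; whether the returned isomdsfit holds the full isoMDS result or only its points is not stated here, so inspect it with str() before plotting.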

Examples

tumor_mut_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_mut.txt", 
  package="tumorcomparer")
tumor_cna_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_cna.txt", 
  package="tumorcomparer")
tumor_exp_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_exp.txt", 
  package="tumorcomparer")

cell_line_mut_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_mut.txt", 
  package="tumorcomparer")
cell_line_cna_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_cna.txt", 
  package="tumorcomparer")
cell_line_exp_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_exp.txt", 
  package="tumorcomparer")

known_cancer_gene_weights_mut_file <- system.file("extdata", "READ_data_for_running_TC", 
  "default_weights_for_known_cancer_genes_mut.txt", package="tumorcomparer")
known_cancer_gene_weights_cna_file <- system.file("extdata", "READ_data_for_running_TC", 
  "default_weights_for_known_cancer_genes_cna.txt", package="tumorcomparer")
known_cancer_gene_weights_exp_file <- system.file("extdata", "READ_data_for_running_TC", 
  "default_weights_for_known_cancer_genes_exp.txt", package="tumorcomparer")

cancer_specific_gene_weights_mut_file <- system.file("extdata", "READ_data_for_running_TC", 
  "Genes_and_weights_mut.txt", package="tumorcomparer")
cancer_specific_gene_weights_cna_file <- system.file("extdata", "READ_data_for_running_TC", 
  "Genes_and_weights_cna.txt", package="tumorcomparer")
cancer_specific_gene_weights_exp_file <- system.file("extdata", "READ_data_for_running_TC", 
  "Genes_and_weights_exp.txt", package="tumorcomparer")
  
config_list <- list(
  mut=list(dataset_name = "mut", data_type_weight=1/3, default_weight = 0.01, 
    tumor_file = tumor_mut_file, cell_line_file = cell_line_mut_file,
    known_cancer_gene_weights_file = known_cancer_gene_weights_mut_file, 
    cancer_specific_gene_weights_file = cancer_specific_gene_weights_mut_file),
  cna=list(dataset_name = "cna", data_type_weight=1/3, default_weight = 0.01, 
    tumor_file = tumor_cna_file, cell_line_file = cell_line_cna_file,
    known_cancer_gene_weights_file = known_cancer_gene_weights_cna_file, 
    cancer_specific_gene_weights_file = cancer_specific_gene_weights_cna_file),
  exp=list(dataset_name = "exp", data_type_weight=1/3, default_weight = 0.01, 
    tumor_file = tumor_exp_file, cell_line_file = cell_line_exp_file,
    known_cancer_gene_weights_file = known_cancer_gene_weights_exp_file, 
    cancer_specific_gene_weights_file = cancer_specific_gene_weights_exp_file)
)

comparison_result <- run_comparison_config_list(config_list = config_list)


cannin/tumorcomparer documentation built on Feb. 7, 2023, 3:13 p.m.