generate_composite_mat_and_gene_weights: Run a comparison between two cohorts (e.g. cell lines...

View source: R/generate_composite_mat_and_gene_weights.R

generate_composite_mat_and_gene_weights    R Documentation

Run a comparison between two cohorts (e.g. cell lines and tumors)

Description

Run a comparison between two cohorts (e.g. cell lines and tumors)

Usage

generate_composite_mat_and_gene_weights(
  default_weight,
  tumor_file,
  cell_line_file,
  known_cancer_gene_weights_file = NULL,
  cancer_specific_gene_weights_file = NULL,
  gene_list = NULL,
  run_mds = TRUE,
  verbose = FALSE
)

Arguments

default_weight

see run_comparison

tumor_file

see run_comparison

cell_line_file

see run_comparison

known_cancer_gene_weights_file

see run_comparison

cancer_specific_gene_weights_file

see run_comparison

gene_list

a vector of HGNC gene symbols; if provided, the comparison is run only for the specified genes (Default: NULL)

run_mds

a boolean, whether to run multidimensional scaling (MDS) on dataset (Default: TRUE)

verbose

a boolean, whether to show debugging information (Default: FALSE)

Details

The composite matrix is a single matrix whose columns are samples (both tumors and cell lines) and whose rows are an rbind() of mutations (coded 1 or 0 for each sample), copy number alterations from GISTIC (coded -2, -1, 0, 1, or 2), or gene expression values. Available similarity/distance measures include:

  • "weighted_correlation": weighted correlation, based on weighted means and standard deviations

  • "generalized_jaccard": a weighted distance based on the Jaccard coefficient
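The layout described above can be illustrated with a minimal sketch on made-up data. Everything here is an assumption for illustration: the toy values, the row-name suffixes, the weights, and the helper functions weighted_cor() and weighted_jaccard() are hypothetical and are not the package's internal code. The two helpers use the textbook definitions of a weighted Pearson correlation and a weighted Jaccard similarity for non-negative values.

```r
# Toy data (made up): 3 genes, 2 tumors and 2 cell lines
samples <- c("tumor_1", "tumor_2", "cell_line_1", "cell_line_2")
genes <- c("TP53", "KRAS", "APC")

# Mutations: 1 = mutated, 0 = not mutated
mut <- matrix(c(1, 0, 1, 1,
                0, 1, 0, 1,
                1, 1, 0, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0(genes, "_mut"), samples))

# Copy number alterations: GISTIC-style values in -2..2
cna <- matrix(c(-2, -1, 0, -2,
                 1,  2, 0,  1,
                 0, -1, 1,  0),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0(genes, "_cna"), samples))

# Composite matrix: rows are stacked features, columns are samples
composite_mat <- rbind(mut, cna)

# Hypothetical per-row weights (one weight per row of composite_mat)
weights <- c(rep(0.5, 3), rep(1, 3))

# Weighted Pearson correlation from weighted means and standard deviations
weighted_cor <- function(x, y, w) {
  w <- w / sum(w)
  mx <- sum(w * x)
  my <- sum(w * y)
  sx <- sqrt(sum(w * (x - mx)^2))
  sy <- sqrt(sum(w * (y - my)^2))
  sum(w * (x - mx) * (y - my)) / (sx * sy)
}

# Weighted Jaccard similarity; only meaningful for non-negative values,
# so it is applied to the binary mutation rows here
weighted_jaccard <- function(x, y, w) {
  sum(w * pmin(x, y)) / sum(w * pmax(x, y))
}

wc <- weighted_cor(composite_mat[, "tumor_1"],
                   composite_mat[, "cell_line_1"], weights)
wj <- weighted_jaccard(mut[, "tumor_1"], mut[, "cell_line_1"], weights[1:3])
```

A similarity such as wj can be turned into a distance with 1 - wj; whether the package applies exactly this transformation is not stated on this page.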

Value

a list with the items below. NOTE: The values of dist_mat and isomdsfit depend on whether the input data is continuous or discrete.

  • "dist_mat": a matrix of pairwise distances

  • "isomdsfit": a two-column fit of the distances, reduced to two dimensions via multidimensional scaling (MDS) using the isoMDS function

  • "cor_unweighted": a matrix of unweighted pairwise correlations

  • "composite_mat": the composite matrix (see Details)

  • "cell_lines_ids": a vector of cell line IDs/names

  • "tumors_ids": a vector of tumor IDs
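To show how a distance matrix relates to a two-column MDS fit, here is a minimal sketch on random toy data. It assumes that distances are derived as 1 minus the correlation and that the fit comes from MASS::isoMDS with k = 2; the variable names are hypothetical, and the package's own computation may differ in both the distance and the returned object.

```r
library(MASS)  # provides isoMDS

set.seed(1)

# Toy feature-by-sample matrix (made up): 6 features, 5 samples
toy_mat <- matrix(rnorm(6 * 5), nrow = 6,
                  dimnames = list(paste0("feature_", 1:6),
                                  paste0("sample_", 1:5)))

# Unweighted pairwise correlations between samples (columns)
cor_unweighted <- cor(toy_mat)

# Assumed conversion from similarity to distance
dist_mat <- 1 - cor_unweighted

# Non-metric MDS into two dimensions; trace = FALSE silences progress output
fit <- isoMDS(as.dist(dist_mat), k = 2, trace = FALSE)

# fit$points has one row per sample and two columns (the 2-D coordinates)
coords <- fit$points
```

Note that isoMDS stops with an error when two distinct samples have a distance of zero, so identical profiles would need to be handled before the call.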

Author(s)

Rileen Sinha (rileen@gmail.com), Augustin Luna (aluna@jimmy.harvard.edu)

Examples

mut_default_weight <- 0.01
tumor_mut_file <- system.file("extdata", "READ_data_for_running_TC", "tumor_mut.txt", 
  package="tumorcomparer")
cell_line_mut_file <- system.file("extdata", "READ_data_for_running_TC", "cell_line_mut.txt", 
  package="tumorcomparer")
known_cancer_gene_weights_mut_file <- system.file("extdata", "READ_data_for_running_TC", 
  "default_weights_for_known_cancer_genes_mut.txt", package="tumorcomparer")
cancer_specific_gene_weights_mut_file <- system.file("extdata", "READ_data_for_running_TC", 
  "Genes_and_weights_mut.txt", package="tumorcomparer")

mut <- generate_composite_mat_and_gene_weights(
  default_weight=mut_default_weight,
  tumor_file=tumor_mut_file,
  cell_line_file=cell_line_mut_file,
  known_cancer_gene_weights_file=known_cancer_gene_weights_mut_file,
  cancer_specific_gene_weights_file=cancer_specific_gene_weights_mut_file)


cannin/tumorcomparer documentation built on Feb. 7, 2023, 3:13 p.m.