prepareDataObjects: Prepare the data object required for downstream analysis

View source: R/data_process_funcs.R

prepareDataObjects    R Documentation

Prepare the data object required for downstream analysis

Description

The function processes the pan-cancer data and, for each cancer type, returns an object containing the viabilities, the driver mutation matrix, the copy number alterations, the mutation annotations and the primary site.

Usage

prepareDataObjects(
  data,
  x,
  fdr = 0.05,
  min_Nmut = 2,
  all_cancers_mut_df,
  CN_df,
  gistic = FALSE,
  top_drivers = NULL,
  CN_Thr = 2,
  minNrcelllines = 5,
  celllines,
  meta_data,
  essential_genes = NULL
)

Arguments

data

input data frame of cell line viabilities for different gene knockdowns

x

primary site (cancer type)

fdr

FDR cut-off for choosing the top drivers from the mutSig2 list of drivers. Default = 0.05

min_Nmut

lower bound on the number of cell lines with mutations. Default = 2

all_cancers_mut_df

MAF file from CCLE

CN_df

copy number dataframe from CCLE

gistic

Logical indicating whether the copy number data is based on GISTIC. Default = FALSE

top_drivers

vector of driver genes of interest. Default = NULL

CN_Thr

threshold for using CN data. Values: 0 = homozygous and heterozygous deletions; 1 = homozygous deletions only; 2 = no copy number used (default)

minNrcelllines

lower bound on the number of cell lines. Default = 5

celllines

vector of cell lines of interest

meta_data

information on the different subtypes for each primary site

essential_genes

vector of essential genes. Default = NULL
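
The fdr and min_Nmut arguments both act as thresholds, so a small toy example may help to picture their effect. The sketch below is not taken from the package source: the driver table, its column names (gene, q), the matrix layout (genes in rows, cell lines in columns) and the strict "<" comparison are all assumptions made for illustration.

```r
# Hypothetical driver table; the column names `gene` and `q` are assumptions.
drivers <- data.frame(
  gene = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  q    = c(0.001, 0.04, 0.20, 0.03),
  stringsAsFactors = FALSE
)

# Keep drivers whose q-value is below the FDR cut-off
# (whether the package uses < or <= is not stated in this page).
fdr <- 0.05
top_drivers <- drivers$gene[drivers$q < fdr]

# Toy binary mutation matrix: rows = driver genes, columns = cell lines.
mut <- matrix(
  c(1, 0, 1, 0,
    0, 0, 1, 0,
    1, 1, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GENE_A", "GENE_B", "GENE_D"), paste0("CL", 1:4))
)

# Keep only drivers mutated in at least min_Nmut cell lines.
min_Nmut <- 2
n_mutated <- rowSums(mut[top_drivers, , drop = FALSE])
mut_filtered <- mut[names(n_mutated)[n_mutated >= min_Nmut], , drop = FALSE]

print(top_drivers)
print(rownames(mut_filtered))
```

In this toy data GENE_C is dropped by the FDR cut-off and GENE_B is dropped because it is mutated in only one cell line.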

Value

An object for each cancer type, with the following elements:

viabilities

dataframe of viabilities for each cancer type

mutations

matrix of mutations in drivers for each cancer type

CNalterations

matrix of non-negative copy number alterations of drivers for each cancer type

mutation_annot

annotations of the mutations

primary_site

cancer type
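
To make the shape of the returned elements more concrete, the sketch below builds a mock object by hand. It only reuses the element names listed above; representing the object as a plain list, the orientation of the matrices (cell lines in columns) and all values are assumptions, not output of prepareDataObjects.

```r
# Mock object mimicking the documented element names (all values are made up).
cell_lines <- c("CL1", "CL2", "CL3")

obj <- list(
  # Viabilities: rows = knocked-down genes, columns = cell lines (assumed layout).
  viabilities = data.frame(
    CL1 = c(-0.2, 0.1),
    CL2 = c(-1.3, 0.0),
    CL3 = c(0.4, -0.5),
    row.names = c("KD_GENE_1", "KD_GENE_2")
  ),
  # Binary mutation status of driver genes per cell line.
  mutations = matrix(
    c(1, 0, 1,
      0, 1, 0),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("GENE_A", "GENE_B"), cell_lines)
  ),
  # Copy number alterations of the same drivers (none in this toy case).
  CNalterations = matrix(
    0, nrow = 2, ncol = 3,
    dimnames = list(c("GENE_A", "GENE_B"), cell_lines)
  ),
  # Annotation table; the column names here are hypothetical.
  mutation_annot = data.frame(
    gene      = c("GENE_A", "GENE_A", "GENE_B"),
    cell_line = c("CL1", "CL3", "CL2"),
    stringsAsFactors = FALSE
  ),
  primary_site = "hypothetical_site"
)

# Elements are accessed by name; a quick consistency check on cell lines.
print(names(obj))
same_cell_lines <- identical(colnames(obj$viabilities), colnames(obj$mutations))
print(same_cell_lines)
```

Checking that the viabilities and mutations elements refer to the same cell lines, as in the last lines, is a cheap sanity check before passing such an object to downstream analysis.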


cbg-ethz/slidr documentation built on Feb. 8, 2023, 11:15 p.m.