cutoff.driver: Calculation of suggested cut-off for the Bayesian risk model


View source: R/cutoff.R

Description

The cutoff.driver function runs the Bayesian driver inference model n times, each time with randomly generated gene names (the probability of a gene being mutated is taken from the background model).
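The randomisation step described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the gene names, the probabilities, and the helper `random.genes` are hypothetical.

```r
# Hypothetical toy inputs: five sequenced genes and illustrative background
# probabilities of each gene carrying a somatic mutation.
genes <- c("GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E")
bcgr.prob <- c(0.10, 0.05, 0.40, 0.25, 0.20)

# Hypothetical helper: draw one random gene name per observed mutation,
# choosing each gene in proportion to its background probability.
random.genes <- function(n.mutations, genes, prob) {
  sample(genes, size = n.mutations, replace = TRUE, prob = prob)
}

set.seed(1)
n <- 100
# Each column holds the randomised gene labels for one of the n runs.
permuted <- replicate(n, random.genes(8, genes, bcgr.prob))
dim(permuted)
```

In the real function each randomised data set would then be scored by the driver model; that step is omitted here.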

Usage

cutoff.driver(sample.mutations, bcgr.prob, n = 100, fdr = 0.1,
  simulation.quantile = 0.5, genes = NULL, prior.driver = NULL,
  gene.mut.driver = NULL, driver.genes = NULL, plot.save = FALSE,
  permutationResults.save = FALSE, Variant_Classification = NULL,
  Hugo_Symbol = NULL, Tumor_Sample_Barcode = NULL, CCF = NULL,
  Damage_score = NULL, mode = "MAX", epsilon = 0.05)

Arguments

sample.mutations

data frame with SNVs and InDels in a MAF-like format. sample.mutations should have the following columns (with exactly these names):

  • Variant_Classification column specified by the MAF format, used to distinguish between silent and nonsilent SNVs.

  • Hugo_Symbol column specified by the MAF format, which reports the gene for each SNV.

  • Tumor_Sample_Barcode column specified by the MAF format, reporting the patient in which each SNV was found.

  • CCF numeric column produced by the CCF function.

  • Damage_score numeric column with values between 0 and 1, where 1 means a very damaging SNV/InDel and 0 a non-damaging SNV/InDel.
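As a quick check of the expected layout, the sketch below builds a toy data frame with the five required columns and verifies that none is missing. All values are made up for illustration; in practice the CCF column would come from the CCF function rather than being typed in by hand.

```r
# Toy input with the five required column names (all values hypothetical).
sample.mutations <- data.frame(
  Variant_Classification = c("Missense_Mutation", "Silent", "Nonsense_Mutation"),
  Hugo_Symbol            = c("TP53", "KRAS", "TP53"),
  Tumor_Sample_Barcode   = c("P1", "P1", "P2"),
  CCF                    = c(0.95, 0.40, 1.00),  # normally produced by CCF()
  Damage_score           = c(0.9, 0.0, 1.0),     # must lie in [0, 1]
  stringsAsFactors = FALSE
)

# Verify the column names and the numeric constraints before running the model.
required <- c("Variant_Classification", "Hugo_Symbol",
              "Tumor_Sample_Barcode", "CCF", "Damage_score")
missing.cols <- setdiff(required, colnames(sample.mutations))
score.in.range <- all(sample.mutations$Damage_score >= 0 &
                      sample.mutations$Damage_score <= 1)
```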

bcgr.prob

a numeric vector of the same length as genes (and in the same order) giving the probability that each gene carries a somatic mutation in a healthy population. This vector can be obtained with the functions bcgr, bcgr.lawrence and bcgr.combine.
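Because the vector must follow the order of genes, it can help to reorder it explicitly. The sketch below assumes the probabilities are available as a named vector (`prob.named`, a hypothetical object with made-up values) and aligns it by name.

```r
# Genes in the order they will be passed to the function.
genes <- c("TP53", "KRAS", "EGFR")

# Hypothetical named background probabilities, stored in a different order.
prob.named <- c(KRAS = 0.02, EGFR = 0.03, TP53 = 0.01)

# Reorder by name so that bcgr.prob[i] belongs to genes[i].
bcgr.prob <- unname(prob.named[genes])

# Guard against length mismatches and genes with no probability.
stopifnot(length(bcgr.prob) == length(genes), !anyNA(bcgr.prob))
```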

n

an integer giving the number of permutation tests to run, each with gene mutations drawn at random according to the background probability. Default is 100.

fdr

expected false discovery rate. The value must be between 0 and 1; the closer it is to 0, the fewer false discoveries are allowed. Default is 0.1 (10% of the ranked genes above the suggested cut-off are expected to be false positives).

simulation.quantile

a numeric value between 0 and 1 specifying which quantile of the n permutations is taken at each rank position. Default is 0.5 (the median).
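Taking a quantile per rank position across permutations can be sketched as below. The matrix `sim` is a hypothetical stand-in for the permutation results (one row per permutation, one column per rank position); how the package stores them internally is not documented here.

```r
set.seed(42)
n <- 100       # number of permutations
n.ranks <- 5   # number of rank positions (illustrative)

# Hypothetical simulated scores: rows are permutations, columns are ranks.
sim <- matrix(runif(n * n.ranks), nrow = n, ncol = n.ranks)

# For each rank position, take the requested quantile over the n permutations.
simulation.quantile <- 0.5
per.rank <- unname(apply(sim, 2, quantile, probs = simulation.quantile))
```

With simulation.quantile = 0.5 this matches the column-wise median; a higher value such as 0.9 would give a more conservative null reference.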

genes

vector of genes that were sequenced. It should contain the unique values of the Hugo_Symbol column, and may include additional genes that had no SNV/InDel in the given cohort. Default is NULL.

prior.driver

a numeric value representing the prior probability that a random gene is a driver. Default is length(driver.genes)/20000, on the assumption that there are ~20000 protein-coding genes.
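The default described above is a simple ratio. A minimal worked example, using a made-up four-gene driver list rather than any real driver set:

```r
# Hypothetical driver list (the real default set is much larger).
driver.genes <- c("TP53", "KRAS", "EGFR", "BRAF")

# Default prior as described: number of drivers over ~20000 protein-coding genes.
prior.driver <- length(driver.genes) / 20000
prior.driver
```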

gene.mut.driver

a numeric value or named vector representing the likelihood that a gene is mutated given that it is known to be a driver. A driver gene need not be mutated in every tumour, as cancers are heterogeneous. Default is NULL, in which case driver.genes are considered drivers.

driver.genes

a character vector of genes that are considered drivers for this cancer. If NULL, the driver.genes.concensus set is used.

plot.save

a boolean indicating whether the plot should be saved.

permutationResults.save

a boolean indicating whether the results of the n permutations should be saved.

Variant_Classification

(optional) integer/numeric value giving the index of the column in sample.mutations that contains the SNV classification (silent or not). Default is NULL (in which case sample.mutations should already have a column with this name).

Hugo_Symbol

(optional) integer/numeric value giving the index of the column in sample.mutations that contains the gene names for the reported SNVs/InDels. Default is NULL (in which case sample.mutations should already have a column with this name).

Tumor_Sample_Barcode

(optional) integer/numeric value giving the index of the column in sample.mutations that contains the sample IDs for the SNVs/InDels. Default is NULL (in which case sample.mutations should already have a column with this name).

CCF

(optional) integer/numeric value giving the index of the column in sample.mutations that contains the cancer cell fraction for the SNVs/InDels. Default is NULL (in which case sample.mutations should already have a column with this name).

Damage_score

(optional) integer/numeric value giving the index of the column in sample.mutations that contains the damage score for the SNVs/InDels. Default is NULL (in which case sample.mutations should already have a column with this name).
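When a data frame uses other column names, an alternative to passing column indices is to rename the columns beforehand. The sketch below shows that renaming on a toy data frame; the original column names (`gene`, `sample`, `effect`) are hypothetical, and this is not a description of what the function does internally.

```r
# Toy table whose column names differ from the MAF names (hypothetical).
maf <- data.frame(
  gene   = c("TP53", "KRAS"),
  sample = c("P1", "P2"),
  effect = c("Missense_Mutation", "Silent"),
  stringsAsFactors = FALSE
)

# Column positions, playing the same role as the optional index arguments.
Hugo_Symbol <- 1
Tumor_Sample_Barcode <- 2
Variant_Classification <- 3

# Rename the indexed columns to the names the function expects.
colnames(maf)[c(Hugo_Symbol, Tumor_Sample_Barcode, Variant_Classification)] <-
  c("Hugo_Symbol", "Tumor_Sample_Barcode", "Variant_Classification")
```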

mode

a character value indicating how to handle multiple mutations in a single gene-sample pair. Options are SUM, MAX and ADVANCE. Default is MAX.

epsilon

a numeric value. If mode is ADVANCE, epsilon is the threshold on the CCF difference used to decide whether mutations belong to the same clone or to different clones. Default is 0.05.
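The difference between SUM and MAX can be illustrated on a toy table in which one gene-sample pair has two mutations. The `score` column is a hypothetical per-mutation value, not a quantity defined by the package, and the epsilon comparison at the end is only an assumed reading of how ADVANCE might compare CCF values.

```r
# Toy data: TP53 is mutated twice in patient P1 (all values hypothetical).
muts <- data.frame(
  Hugo_Symbol          = c("TP53", "TP53", "KRAS"),
  Tumor_Sample_Barcode = c("P1", "P1", "P1"),
  score                = c(0.6, 0.3, 0.8),
  CCF                  = c(0.90, 0.88, 0.40),
  stringsAsFactors = FALSE
)

# MAX-style: keep the largest value per gene-sample pair.
by.max <- aggregate(score ~ Hugo_Symbol + Tumor_Sample_Barcode,
                    data = muts, FUN = max)

# SUM-style: add the values per gene-sample pair.
by.sum <- aggregate(score ~ Hugo_Symbol + Tumor_Sample_Barcode,
                    data = muts, FUN = sum)

# Assumed ADVANCE-style check: the two TP53 mutations are treated as the
# same clone when their CCF values differ by less than epsilon.
epsilon <- 0.05
same.clone <- abs(muts$CCF[1] - muts$CCF[2]) < epsilon
```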

Value

an integer value giving the suggested cut-off position in the ranking.

See Also

CCF, bcgr, bcgr.lawrence, bcgr.combine and bayes.driver

Examples

# first calculate CCF
sample.genes.mutect <- CCF(sample.genes.mutect)
# then somatic background probability
bcgr.prob <- bcgr.combine(sample.genes.mutect)
# suggested cut-off for the Bayesian risk model
suggested.cut.off <- cutoff.driver(sample.genes.mutect,  bcgr.prob) 
print(suggested.cut.off)  

hanasusak/cDriver documentation built on Jan. 20, 2018, 2:14 p.m.