bcgr.combine: Combining two somatic background mutation probability values...


View source: R/background.R

Description

The bcgr.combine function first calculates two somatic background mutation probabilities for each gene (using the bcgr.lawrence and bcgr functions) and then returns the average of the two values for each gene.
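The averaging step can be pictured with a minimal sketch. The two input vectors below are made-up stand-ins for the per-gene outputs of bcgr.lawrence and bcgr; their names, their values, and the assumption that both are named by gene are illustrative and not taken from the package source.

```r
# Hypothetical per-gene probabilities from two background models
# (illustrative values, not real cDriver output).
p_lawrence <- c(TP53 = 0.02, KRAS = 0.01, TTN = 0.30)
p_bcgr     <- c(TTN = 0.10, TP53 = 0.04, KRAS = 0.03)

# Align the second vector to the gene order of the first, then average.
genes_all  <- names(p_lawrence)
p_combined <- (p_lawrence[genes_all] + p_bcgr[genes_all]) / 2
print(p_combined)
```

Indexing both vectors by gene name before averaging guards against the two models returning genes in a different order.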

Usage

bcgr.combine(sample.mutations, genes = NULL, lengthGenes = NULL,
  Variant_Classification = NULL, Hugo_Symbol = NULL,
  Tumor_Sample_Barcode = NULL, CCF = NULL)

Arguments

sample.mutations

data frame in MAF-like format. The column names (header) of sample.mutations must include:

  • Variant_Classification : column specified in the MAF format, which distinguishes between silent and nonsilent SNVs

  • Hugo_Symbol : column specified in the MAF format, which reports the gene name for each SNV.

  • Tumor_Sample_Barcode : column specified in the MAF format, which reports in which patient the SNV was found.

  • CCF : numeric column produced by the CCF function, or calculated previously for each SNV.
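As a rough illustration of the expected input, the sketch below builds a toy MAF-like data frame with the four required columns and checks that they are present. All gene names, sample ids, and CCF values are invented; the check itself is an assumption about what a caller might want to verify, not code from the package.

```r
# Toy MAF-like table; all values are invented for illustration.
toy_mutations <- data.frame(
  Hugo_Symbol            = c("TP53", "TP53", "KRAS", "TTN"),
  Variant_Classification = c("Missense_Mutation", "Silent",
                             "Missense_Mutation", "Silent"),
  Tumor_Sample_Barcode   = c("P1", "P2", "P1", "P3"),
  CCF                    = c(0.95, 0.40, 1.00, 0.25),
  stringsAsFactors       = FALSE
)

# Check that the four required column names are present.
required     <- c("Variant_Classification", "Hugo_Symbol",
                  "Tumor_Sample_Barcode", "CCF")
missing_cols <- setdiff(required, colnames(toy_mutations))
stopifnot(length(missing_cols) == 0)
```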

genes

vector of genes which were sequenced: the unique Hugo_Symbol names, optionally with additional genes which did not have any SNV in the cohort. Default is NULL, in which case the list of unique genes is taken from sample.mutations.

lengthGenes

numeric vector of the (sequenced) lengths of all genes. The vector should have the same order as the genes argument. Default is NULL, in which case the gene lengths are taken from the data set length.genes (from package cDriver), as defined in its column length. If a gene is not found in this data frame, the median length of the genes in the default list is used instead.
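The median fallback described above can be sketched as follows. The lookup table here is a hypothetical stand-in with invented lengths, and taking the median over the whole table is an assumption about how the fallback is computed, not a statement about the package internals.

```r
# Hypothetical length table (stand-in for a gene-length data set).
len_table <- data.frame(
  Hugo_Symbol = c("TP53", "KRAS", "TTN"),
  length      = c(1200, 600, 100000),
  stringsAsFactors = FALSE
)

# Sequenced genes; "NEWGENE" is deliberately absent from len_table.
genes <- c("TP53", "NEWGENE", "KRAS")

# Look up each gene's length; match() yields NA when a gene is not found.
gene_len <- len_table$length[match(genes, len_table$Hugo_Symbol)]

# Replace missing lengths with the median length of the table.
gene_len[is.na(gene_len)] <- median(len_table$length)
names(gene_len) <- genes
print(gene_len)
```

Using match() keeps the result in the same order as genes, which is the ordering the lengthGenes argument is expected to follow.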

Variant_Classification

(optional) integer/numeric value indicating which column in sample.mutations contains the classification for the SNVs (Silent or not). Default is NULL value (in this case sample.mutations should already have this column). Column with this name should not already exist in sample.mutations.

Hugo_Symbol

(optional) integer/numeric value indicating which column in sample.mutations contains the gene names for the SNVs. Default is NULL value (in this case sample.mutations should already have this column). Column with this name should not already exist in sample.mutations.

Tumor_Sample_Barcode

(optional) integer/numeric value indicating which column in sample.mutations contains the sample ids for the SNVs. Default is NULL value (in this case sample.mutations should already have this column). Column with this name should not already exist in sample.mutations.

CCF

(optional) integer/numeric value indicating which column in sample.mutations contains the cancer cell fraction information for the SNVs. Default is NULL value (in this case sample.mutations should already have this column). Column with this name should not already exist in sample.mutations.
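The four column-index arguments suggest that a column position is mapped onto the expected MAF name. A minimal sketch of that idea is shown below; rename_by_index is a hypothetical helper written for this illustration, the data are invented, and the package may handle the mapping differently.

```r
# Toy table whose columns do not use MAF names (invented data).
raw <- data.frame(
  gene   = c("TP53", "KRAS"),
  effect = c("Missense_Mutation", "Silent"),
  sample = c("P1", "P2"),
  ccf    = c(0.9, 0.5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: give column number `idx` the expected name,
# refusing if a column with that name already exists.
rename_by_index <- function(df, idx, new_name) {
  if (new_name %in% colnames(df)) {
    stop("column '", new_name, "' already exists")
  }
  colnames(df)[idx] <- new_name
  df
}

maf_like <- rename_by_index(raw, 1, "Hugo_Symbol")
maf_like <- rename_by_index(maf_like, 2, "Variant_Classification")
maf_like <- rename_by_index(maf_like, 3, "Tumor_Sample_Barcode")
maf_like <- rename_by_index(maf_like, 4, "CCF")
print(colnames(maf_like))
```

The existence check mirrors the requirement that a column with the target name should not already be present when an index is supplied.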

Value

a numeric vector of the probabilities that a gene has a nonsilent mutation (not caused by cancer).

References

http://www.ncbi.nlm.nih.gov/pubmed/23770567

Examples

# First we need to calculate CCF
sample.genes.mutect <- CCF(sample.genes.mutect)
# Calculate somatic nonsilent background mutation probability
background <- bcgr.combine(sample.genes.mutect, length.genes$Hugo_Symbol, length.genes$Coverd_len)
head(background)

hanasusak/cDriver documentation built on May 17, 2019, 2:27 p.m.