bcgr: Background mutation rate calculated based on the observed...
In hanasusak/cDriver: Find best driver candidate genes


The bcgr function calculates the background probability that a gene is mutated, based on the frequency of silent mutations.

bcgr(sample.mutations, genes = NULL, Variant_Classification = NULL, Hugo_Symbol = NULL, Tumor_Sample_Barcode = NULL, CCF = NULL)

`sample.mutations`	data frame in MAF-like format with nonsilent and silent mutations. Column names (header) in `sample.mutations` must be: Variant_Classification: column specified in the MAF format, which distinguishes between silent and nonsilent SNVs; Hugo_Symbol: column specified in the MAF format, which reports the gene name for each SNV; Tumor_Sample_Barcode: column specified in the MAF format, which reports in which patient the SNV was found; CCF: numeric column produced by the `CCF` function, or calculated previously for each SNV.
`genes`	vector of genes which were sequenced: unique Hugo_Symbol names, possibly including additional genes that had no SNV in the cohort. Default is NULL, in which case the list of unique genes is taken from `sample.mutations`.
`Variant_Classification`	(optional) integer/numeric value indicating which column in `sample.mutations` contains the classification of the SNVs (silent or not). Default is NULL (in this case `sample.mutations` should already have this column). If a column index is given, a column with this name should not already exist in `sample.mutations`.
`Hugo_Symbol`	(optional) integer/numeric value indicating which column in `sample.mutations` contains the gene names for the SNVs. Default is NULL (in this case `sample.mutations` should already have this column). If a column index is given, a column with this name should not already exist in `sample.mutations`.
`Tumor_Sample_Barcode`	(optional) integer/numeric value indicating which column in `sample.mutations` contains the sample IDs for the SNVs. Default is NULL (in this case `sample.mutations` should already have this column). If a column index is given, a column with this name should not already exist in `sample.mutations`.
`CCF`	(optional) integer/numeric value indicating which column in `sample.mutations` contains the cancer cell fraction information for the SNVs. Default is NULL (in this case `sample.mutations` should already have this column). If a column index is given, a column with this name should not already exist in `sample.mutations`.
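
An alternative to the index arguments is to rename the columns before the call so that the header already matches the required names. The following is a minimal base-R sketch of that preparation step; the data frame `muts`, its original column names, and the helper vector `col.index` are made up for illustration, and the snippet does not show what bcgr does internally.

```r
# Hypothetical input whose header does not follow the MAF naming.
muts <- data.frame(
  gene   = c("GENE_A", "GENE_B"),
  effect = c("Silent", "Missense_Mutation"),
  sample = c("S1", "S2"),
  ccf    = c(0.8, 1.0),
  stringsAsFactors = FALSE
)

# Map each required column name to the position of the column that holds it.
col.index <- c(Hugo_Symbol = 1, Variant_Classification = 2,
               Tumor_Sample_Barcode = 3, CCF = 4)

# Rename the columns in place; the data itself is untouched.
names(muts)[col.index] <- names(col.index)
print(names(muts))
```

After this step the data frame carries the four required column names, so the index arguments can be left at their NULL defaults.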

Assuming neutral selection, the function estimates the expected number of nonsilent mutations from the observed number of silent mutations. Na (the number of all possible nonsilent substitutions) and Ns (the number of all possible silent substitutions) were taken from the Lawrence paper. They are provided in this package in the file lawrence.RData. Once the expected number of nonsilent mutations for each gene is known, the probability of getting a nonsilent mutation in each gene is calculated. This is based on

A numeric vector of the probabilities that a gene has a nonsilent mutation (not caused by cancer).
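
The first step described in the details, going from observed silent mutations to expected nonsilent mutations, can be illustrated with a small sketch. It assumes the simplest reading of the neutrality argument: nonsilent and silent mutations arise in proportion to Na and Ns, so the expected nonsilent count is the silent count scaled by Na / Ns. The gene names, counts, and the `toy` data frame are invented, and the package's own conversion from expected counts to probabilities is not reproduced here.

```r
# Hypothetical toy inputs; gene names and all numbers are made up.
toy <- data.frame(
  Hugo_Symbol = c("GENE_A", "GENE_B", "GENE_C"),
  n_silent    = c(4, 0, 10),          # observed silent mutations in the cohort
  Na          = c(3000, 1500, 9000),  # possible nonsilent substitutions
  Ns          = c(1000, 500, 3000),   # possible silent substitutions
  stringsAsFactors = FALSE
)

# Assumption: under neutrality the nonsilent-to-silent ratio equals Na / Ns,
# so expected nonsilent = observed silent * Na / Ns.
toy$expected_nonsilent <- toy$n_silent * toy$Na / toy$Ns

print(toy[, c("Hugo_Symbol", "n_silent", "expected_nonsilent")])
```

Note that a gene with no observed silent mutations gets an expected count of zero under this naive scaling, which is one reason a real implementation may smooth or pool the estimates.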

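library(cDriver)
# We first need the CCF column
sample.genes.mutect <- CCF(sample.genes.mutect)
# Background probability per gene, passing the sequenced genes as the gene list
somatic.background <- bcgr(sample.genes.mutect, genes = length.genes$Hugo_Symbol)
head(somatic.background)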