Permutation-based identification of Significantly Mutated Genes, i.e. MutSigCL or MutSigFN.
mutsigclfn(bkgrSQLiteDB, obs_data, outfile='out.txt', genes=c(), type=c('CL','FN'), hotspot.alg=c('hclust','ratio'), min.cl=0.2, nperm=1000, mc.cores=4, bkgr_data=dbConnect(dbDriver('SQLite'), bkgrSQLiteDB))
bkgrSQLiteDB
An SQLite DB storing background mutation-related info.
obs_data
A data frame of the observed input data.
outfile
Output file for mutsigclfn.
genes
Genes to be permuted.
type
Algorithm, i.e. CL - MutSigCL, FN - MutSigFN.
hotspot.alg
Algorithm used to define the hotspot statistic.
min.cl
Genes with a hotspot fraction greater than min.cl are selected for MutSigCL analysis. Set min.cl to 0 to disable this filter.
nperm
Number of permutations.
mc.cores
Number of cores used in mclapply.
bkgr_data
An RSQLite object; do not change this unless you're quite sure!
Make sure that bkgrSQLiteDB and obs_data are consistent.
When hotspot.alg is set to 'hclust', mutsigclfn uses the following definition of the hotspot statistic (fraction):
"A hotspot is defined as a 3-base-pair region of the gene containing many mutations: at least 2, and at least 2% of the total mutations (nature12912)."
This involves using hclust in package stats or fastcluster to perform hierarchical clustering, then calling cutree to calculate the hotspot statistic. It is quite time-consuming when the number of permutations is large.
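The hclust-based hotspot fraction can be illustrated with a minimal sketch. This is not the package's actual code: the function name hotspot_fraction_hclust, its parameters, and the choice of complete linkage are assumptions made for illustration. With complete linkage, cutting the tree at height 2 yields clusters whose positions span at most 3 base pairs, which matches the quoted hotspot definition.

```r
# Hypothetical helper, not part of the package: fraction of a gene's
# mutations that fall into hotspots, given integer mutation positions.
hotspot_fraction_hclust <- function(pos, max_span = 2, min_count = 2,
                                    min_frac = 0.02) {
  n <- length(pos)
  if (n < 2) return(0)  # hclust needs at least two observations

  # Complete linkage: a cluster's merge height is its largest pairwise
  # distance, so cutting at h = 2 keeps clusters within a 3-bp window.
  tree <- stats::hclust(stats::dist(pos), method = "complete")
  cl <- stats::cutree(tree, h = max_span)

  sizes <- tabulate(cl)  # number of mutations per cluster
  # A hotspot needs at least 2 mutations and at least 2% of all mutations.
  hot <- sizes >= min_count & sizes / n >= min_frac
  sum(sizes[hot]) / n
}

pos <- c(12, 12, 13, 40, 95, 96, 150)
hotspot_fraction_hclust(pos)
```

In this toy input, positions 12, 12, 13 and 95, 96 form two hotspots, so five of the seven mutations are counted. Because a tree is built for every gene in every permutation, the cost grows quickly with nperm.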
When hotspot.alg is set to 'ratio' (much faster than 'hclust'), the hotspot statistic is defined as:
The ratio of the number of mutations to the total number of mutation positions; at least 2 mutations are required.
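As a rough sketch of the 'ratio' statistic, the snippet below divides the number of mutations by the number of distinct mutated positions. The function name hotspot_ratio is hypothetical, and returning NA when fewer than 2 mutations are present is an assumption; the package may handle that case differently.

```r
# Hypothetical helper, not part of the package: mutations per distinct
# mutated position for one gene.
hotspot_ratio <- function(pos, min_mut = 2) {
  n <- length(pos)
  # Assumption: the statistic is undefined below the required minimum.
  if (n < min_mut) return(NA_real_)
  n / length(unique(pos))
}

hotspot_ratio(c(12, 12, 13, 40))  # 4 mutations at 3 distinct positions
```

A ratio of 1 means every mutation sits at its own position, and larger values indicate more recurrence. No clustering step is needed, which is why this option is much cheaper per permutation.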
Change Log:
Function name pbiSMG was changed to mutsigclfn in v 1.45.