dndscv | R Documentation |
Analyses of selection using the dNdScv and dNdSloc models. Default parameters typically increase the performance of the method on cancer genomic studies. Default arguments use the GRCh37/hg19 version of the human genome. To run dNdScv on other assemblies or species see the buildref function and the dndscv_data GitHub repository.
dndscv(
mutations,
gene_list = NULL,
refdb = "hg19",
sm = "192r_3w",
kc = "cgc81",
cv = "hg19",
max_muts_per_gene_per_sample = 3,
max_coding_muts_per_sample = 3000,
use_indel_sites = T,
min_indels = 5,
maxcovs = 20,
constrain_wnon_wspl = T,
outp = 3,
numcode = 1,
outmats = F,
mingenecovs = 500,
onesided = F,
dc = NULL
)
mutations |
Table of mutations (5 columns: sampleID, chr, pos, ref, alt). Only list independent events as mutations. |
gene_list |
List of genes to restrict the analysis to (use for targeted sequencing studies) |
refdb |
Reference database (path to .rda file or a pre-loaded array object in the right format) |
sm |
Substitution model (precomputed models are available in the data directory) |
kc |
List of a-priori known cancer genes (to be excluded from the indel background model) |
cv |
Covariates (a matrix with one row per gene and one column per covariate) [default: reference covariates] [cv=NULL runs dndscv without covariates] |
max_muts_per_gene_per_sample |
Maximum number of mutations kept per gene per sample. If n<Inf, only the first n mutations by chromosomal position are kept, which is an arbitrary choice (default = 3; set this to Inf to avoid filtering out any mutation) |
max_coding_muts_per_sample |
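Maximum number of coding mutations allowed per sample; samples exceeding this limit are excluded from the analysis (default = 3000). Hypermutator samples often reduce power to detect selection |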
use_indel_sites |
Use unique indel sites instead of the total number of indels (default = TRUE, which tends to be more robust for typical cancer or somatic mutation datasets) |
min_indels |
Minimum number of indels required to run the indel recurrence module |
maxcovs |
Maximum number of covariates that will be considered (additional columns in the matrix of covariates will be excluded) |
constrain_wnon_wspl |
This constrains wnon==wspl in the dNdScv model (this typically leads to higher power to detect selection) |
outp |
Output: 1 = Global dN/dS values; 2 = Global dN/dS and dNdSloc; 3 = Global dN/dS, dNdSloc and dNdScv |
numcode |
NCBI genetic code number (default = 1; standard genetic code). To see the list of genetic codes supported use: ? seqinr::translate. Note that the same genetic code must be used in the dndscv and buildref functions. |
outmats |
Output the internal N and L matrices (default = F) |
mingenecovs |
Minimum number of genes required to run the negative binomial regression model with covariates (default = 500) |
onesided |
Option to run one-sided positive and negative selection tests per gene (default = FALSE). Note that one-sided tests are only performed for the wnon==wspl model, so using onesided=TRUE will overwrite constrain_wnon_wspl to TRUE. |
dc |
Duplex coverage per gene. Named numeric vector whose values are the mean duplex coverage per site for each gene and whose names are the gene names. Use this argument only when running dNdScv on duplex sequencing data, so that gene coverage is used in the offset of the regression model (default = NULL)
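The input formats described above can be illustrated with a small base R sketch. It builds a toy mutation table with the five documented columns, shows what a per-sample mutation cap does to such a table, and builds a named numeric vector of the kind expected by dc. All values, the gene names in dc, and the toy_cap variable are made up for illustration; the cap shown here is a plain row count per sample and is an assumption about the general idea, not the package's internal code.

```r
# Toy mutation table with the five documented columns.
# Values are illustrative only and not checked against any reference genome.
mutations <- data.frame(
  sampleID = c("S1", "S1", "S2", "S3", "S3", "S3", "S3"),
  chr      = c("17", "12", "17", "1", "2", "3", "4"),
  pos      = c(7577121L, 25398284L, 7578406L, 1000L, 2000L, 3000L, 4000L),
  ref      = c("G", "C", "C", "A", "C", "G", "T"),
  alt      = c("A", "T", "T", "G", "T", "A", "C"),
  stringsAsFactors = FALSE
)

# Sketch of a per-sample cap (assumption: a simple count of rows per sample).
toy_cap <- 3  # hypothetical value, far below the documented default of 3000
n_per_sample <- table(mutations$sampleID)
excluded <- names(n_per_sample)[n_per_sample > toy_cap]
kept <- mutations[!(mutations$sampleID %in% excluded), ]

# Hypothetical duplex coverage vector: names are genes, values are mean coverage.
dc <- c(TP53 = 1500, KRAS = 1200)

# A possible call, left commented out because it needs the dndscv package
# and a realistically sized dataset:
# res <- dndscv::dndscv(mutations,
#                       max_muts_per_gene_per_sample = Inf,
#                       max_coding_muts_per_sample = Inf)
```

In this toy table only sample S3 has more rows than toy_cap, so it is the only sample dropped from kept.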
Martincorena I, et al. (2017) Universal patterns of selection in cancer and somatic tissues. Cell. 171(5):1029-1041.
'dndscv' returns a list of objects:
- globaldnds: Global dN/dS estimates across all genes.
- sel_cv: Gene-wise selection results using dNdScv.
- sel_loc: Gene-wise selection results using dNdSloc.
- annotmuts: Annotated coding mutations.
- genemuts: Observed and expected numbers of mutations per gene.
- geneindels: Observed and expected numbers of indels per gene.
- mle_submodel: MLEs of the substitution model.
- exclsamples: Samples excluded from the analysis.
- exclmuts: Coding mutations excluded from the analysis.
- nbreg: Negative binomial regression model for substitutions.
- nbregind: Negative binomial regression model for indels.
- poissmodel: Poisson regression model used to fit the substitution model and the global dNdS values.
- wrongmuts: Table of input mutations with a wrong annotation of the reference base (if any).
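As a rough sketch of how the returned list might be inspected, the example below uses a mock object in place of a real dndscv result. The column names gene_name and qglobal_cv are assumptions taken from the package vignette rather than from this page, the q-value cutoff of 0.1 is an arbitrary choice, and every number is invented.

```r
# Mock stand-in for a dndscv result, reduced to two of the listed elements.
# Column names are assumed; all values are invented for illustration.
res <- list(
  sel_cv = data.frame(
    gene_name  = c("GENE_A", "GENE_B", "GENE_C"),
    qglobal_cv = c(0.001, 0.40, 0.05),
    stringsAsFactors = FALSE
  ),
  exclsamples = character(0)
)

# Genes passing the chosen q-value cutoff.
signif_genes <- res$sel_cv[res$sel_cv$qglobal_cv < 0.1,
                           c("gene_name", "qglobal_cv")]
print(signif_genes)

# Number of samples excluded from the analysis.
n_excluded <- length(res$exclsamples)
```

With a real result, the same indexing would apply to the sel_cv element as long as those column names are present.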
Inigo Martincorena (Wellcome Sanger Institute)