get_population_signatures: SignIT-Pop inference of populations and signatures


View source: R/get_population_signatures.R

Jointly infers mutational subpopulations and their associated mutation signature exposures.

get_population_signatures(
  mutation_table,
  reference_signatures = NULL,
  subset_signatures = TRUE,
  n_populations = NULL,
  genome = NULL,
  method = "vb",
  n_chains = 10,
  n_cores = 1,
  n_iter = 300,
  n_adapt = 200,
  prevalences = NULL
)

`mutation_table`	Table of mutations, one per row. The minimum input requires the following columns:
  `total_depth`: Total number of reads covering the mutated locus.
  `alt_depth`: Total number of mutant reads covering the locus.
  `tumour_copy`: Tumour copy number at the mutated locus.
  `normal_copy`: Normal copy number at the mutated locus.
  `tumour_content`: Estimated tumour content as a fraction between 0 and 1. Must be the same value throughout the whole table.
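As an illustration of the column requirements above, here is a minimal sketch in base R that builds a toy table and checks it. All values are invented, and `check_mutation_table` is a hypothetical helper, not part of SignIT. The `alt_depth <= total_depth` check is an added sanity assumption rather than a documented rule.

```r
# Toy mutation table with the five required columns (values are invented).
mutation_table <- data.frame(
  total_depth    = c(60L, 45L, 80L),
  alt_depth      = c(18L, 9L, 40L),
  tumour_copy    = c(2L, 3L, 2L),
  normal_copy    = c(2L, 2L, 2L),
  tumour_content = 0.7  # recycled so every row carries the same value
)

# Hypothetical helper (not part of SignIT): checks the constraints listed above.
check_mutation_table <- function(tbl) {
  required <- c("total_depth", "alt_depth", "tumour_copy",
                "normal_copy", "tumour_content")
  missing_cols <- setdiff(required, names(tbl))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  if (length(unique(tbl$tumour_content)) != 1) {
    stop("tumour_content must be the same value in every row")
  }
  if (any(tbl$tumour_content < 0 | tbl$tumour_content > 1)) {
    stop("tumour_content must be a fraction between 0 and 1")
  }
  # Assumption: mutant reads are a subset of the reads covering the locus.
  if (any(tbl$alt_depth > tbl$total_depth)) {
    stop("alt_depth cannot exceed total_depth")
  }
  invisible(TRUE)
}

check_mutation_table(mutation_table)
```

Running a check like this before inference can surface a malformed table early, before any model fitting starts.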
`reference_signatures`	Reference mutation signatures. This can either be from `get_reference_signatures` or a custom data frame formatted equivalently.
`subset_signatures`	Boolean. If TRUE (default), then `subset_reference_signatures` is run to pre-select a smaller subset of signatures most likely to be active in the cancer. This helps to reduce processing time and model complexity, but may bias the result.
`n_populations`	The number of populations to screen for. Must be an integer. If no value is provided, then a model selection step is engaged to automatically estimate the number of populations. The automatic model selection uses `select_n_populations`, which performs a maximum a posteriori estimate using the SignIT population model (without mutation signature inference).
`genome`	A BSgenome object. This is used to determine trinucleotide contexts of mutations to define mutation types. By default, uses BSgenome.Hsapiens.UCSC.hg19. To define custom mutation types, simply include a column named `mutation_type` in `mutation_table`, in which case this parameter is ignored.
`method`	The posterior sampling method. A string, either 'vb' for automatic variational Bayes or 'mcmc' for Hamiltonian Monte Carlo.
`n_chains`	Number of chains to sample. Only relevant if `method == 'mcmc'`.
`n_cores`	Number of cores for parallel sampling. This description states that the default equals the number of chains, but the usage signature above shows a default of 1. Only relevant if `method == 'mcmc'`.
`n_iter`	Number of sampling iterations per chain. These are distinct from adaptation iterations, so the total number of iterations will be `n_iter + n_adapt`. Only relevant if `method == 'mcmc'`.
`n_adapt`	Number of adaptation iterations per chain. Only relevant if `method == 'mcmc'`.
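The iteration arithmetic for the 'mcmc' method can be sketched as follows, using the defaults shown in the usage signature. The list name `mcmc_args` is hypothetical and used only for illustration.

```r
# MCMC settings taken from the defaults in the usage signature.
mcmc_args <- list(method = "mcmc", n_chains = 10, n_cores = 1,
                  n_iter = 300, n_adapt = 200)

# Per the n_iter description, adaptation iterations are counted separately,
# so each chain runs n_iter + n_adapt iterations.
iterations_per_chain <- mcmc_args$n_iter + mcmc_args$n_adapt

# Total iterations run across all chains.
total_iterations <- mcmc_args$n_chains * iterations_per_chain

iterations_per_chain  # 500
total_iterations      # 5000

# With SignIT installed, these settings could presumably be passed as:
# do.call(get_population_signatures,
#         c(list(mutation_table = mutation_table), mcmc_args))
```

With `n_cores = 1` the chains would run one after another, so raising `n_cores` toward `n_chains` is the natural way to shorten wall-clock time.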

get_population_signatures is the central function for Bayesian inference of mutational populations and signatures. The model infers an L x N matrix of parameters, where L is the number of populations and N is the number of signatures. The posterior distribution of each parameter is estimated using either automatic differentiation variational inference or Hamiltonian Monte Carlo, via the vb and sampling methods of the rstan package respectively (rstan is an R interface to the Stan probabilistic programming language).
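To make the L x N shape concrete, here is a small sketch of a placeholder matrix with one row per population and one column per signature. The population and signature labels are hypothetical, and the matrix holds no real estimates. It shows only the dimensions, not how SignIT stores its output.

```r
n_populations <- 2                                   # L
signature_names <- c("Signature 1", "Signature 3",
                     "Signature 5")                  # N = 3, labels are made up

# Placeholder L x N matrix: one parameter per (population, signature) pair.
exposures <- matrix(
  NA_real_,
  nrow = n_populations,
  ncol = length(signature_names),
  dimnames = list(paste0("population_", seq_len(n_populations)),
                  signature_names)
)

dim(exposures)     # 2 3
length(exposures)  # 6 parameters in total
```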

A list object containing the posterior samples of population signatures, plus the relevant inputs and metadata.

eyzhao/SignIT documentation built on Dec. 6, 2019, 11:45 a.m.