RunHdpxParallel: Extract (discover) mutational signatures from a matrix of...
In steverozen/mSigHdp: Mutational signature discovery using HDP (hierarchical Dirichlet process)

RunHdpxParallel

R Documentation

Extract (discover) mutational signatures from a matrix of mutational spectra

Description

Please see the vignette for an example.

Usage

RunHdpxParallel(
  input.catalog,
  seedNumber = 123,
  K.guess,
  multi.types = FALSE,
  verbose = FALSE,
  burnin = 1000,
  burnin.multiplier = 10,
  post.n = 200,
  post.space = 100,
  post.cpiter = 3,
  post.verbosity = 0,
  CPU.cores = 20,
  num.child.process = 20,
  high.confidence.prop = 0.9,
  hc.cutoff = NULL,
  merge.raw.cluster.args = hdpx::default_merge_raw_cluster_args(),
  overwrite = TRUE,
  out.dir = paste0("./RunHdpxParallel_out_", as.numeric(Sys.time())),
  gamma.alpha = 1,
  gamma.beta = 20,
  checkpoint = TRUE,
  downsample_threshold = NULL
)

Arguments

`input.catalog`	Input spectra catalog as a matrix or in `ICAMS` format.
`seedNumber`	A random seed that ensures reproducible results.
`K.guess`	Suggested initial value of the number of raw clusters. Usually, the number of raw clusters is roughly twice the number of extracted signatures. Passed to hdpx::dp_activate as argument initcc.
`multi.types`	A logical scalar or a character vector. If `FALSE`, the HDP analysis will regard all input spectra as one tumor type, and the HDP structure will have one parent node for all tumors. If `TRUE`, sample IDs in `input.catalog` must have the form sample_type::sample_id. If a character vector, then its length must be `ncol(input.catalog)`, and each value is the sample type of the corresponding column in `input.catalog`, e.g. `c(rep("Type-A", 23), rep("Type-B", 10))` for 23 Type-A samples and 10 Type-B samples. If not `FALSE`, HDP will have one parent node for each sample type and one grandparent node.
`verbose`	If `TRUE`, report progress information with `message`.
`burnin`	The number of burn-in iterations in one batch. The total number of burnin iterations is `burnin * burnin.multiplier`.
`burnin.multiplier`	Run `burnin.multiplier` rounds of `burnin` iterations. If `checkpoint` is `TRUE`, save the burnin chain (see argument `checkpoint`). The diagnostic plot `diagnostics.likelihood.pdf` can help determine if the chain is stationary. The burnin can be continued from a checkpoint file with `ExtendBurnin`.
`post.n`	The number of posterior samples to collect.
`post.space`	The number of iterations between collected samples.
`post.cpiter`	The number of iterations of concentration parameter samplings to perform after each iteration.
`post.verbosity`	Verbosity of debugging statements. No need to change except for development purposes.
`CPU.cores`	Number of CPUs to use; this should be no more than `num.child.process`.
`num.child.process`	Number of posterior sampling chains; can be set to 1 for testing. We recommend 20 for real data analysis.
`high.confidence.prop`	Raw clusters of mutations found in >= `high.confidence.prop` proportion of posterior samples are signatures with high confidence.
`hc.cutoff`	Deprecated, use `merge.raw.cluster.args`.
`merge.raw.cluster.args`	See `default_merge_raw_cluster_args` in package `hdpx`.
`overwrite`	If `TRUE` overwrite `out.dir` if it exists, otherwise raise an error.
`out.dir`	If not `NULL` then a character string specifying a directory that will be created for the output, including csv files and plots (pdfs) of extracted signatures and their exposures. If `NULL` no directory will be created and no files will be generated.
`gamma.alpha`	Shape parameter of the gamma distribution prior for the Dirichlet process concentration parameters α_0 and all α_j in Figure B.1 of https://www.repository.cam.ac.uk/bitstream/handle/1810/275454/Roberts-2018-PhD.pdf
`gamma.beta`	Inverse scale parameter (rate parameter) of the gamma distribution prior for the Dirichlet process concentration parameters: β_0 and all β_j in Figure B.1 of https://www.repository.cam.ac.uk/bitstream/handle/1810/275454/Roberts-2018-PhD.pdf. We recommend `gamma.alpha = 1` and `gamma.beta = 20` for single-base-substitution signature extraction, and `gamma.alpha = 1` and `gamma.beta = 50` for doublet-base-substitution and indel signature extraction.
`checkpoint`	If `TRUE`, then checkpoint each final Gibbs sample chain to the current working directory, in a file called mSigHdp.sample.checkpoint.x.Rdata, where x depends on `seedNumber`. Also periodically checkpoint the burnin state to the current working directory, in files called mSigHdp.burnin.checkpoint.x.Rdata, where x depends on `seedNumber`.
`downsample_threshold`	See `downsample_spectra` and `show_downsample_curves`.
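
The two ways of supplying sample types through `multi.types`, and the burn-in arithmetic described for `burnin` and `burnin.multiplier`, can be sketched in base R. This is only an illustration and does not call the package: the toy catalog, the sample names, and all variable names are hypothetical, and 96 rows are assumed simply because single-base-substitution spectra commonly have 96 mutation types.

```r
# Toy catalog: 96 mutation types (rows) x 33 samples (columns).
# The counts are random and purely illustrative.
set.seed(123)
n.type.a <- 23
n.type.b <- 10
toy.catalog <- matrix(
  rpois(96 * (n.type.a + n.type.b), lambda = 5),
  nrow = 96
)

# Option 1: a character vector with one sample type per column,
# suitable as the value of multi.types.
sample.types <- c(rep("Type-A", n.type.a), rep("Type-B", n.type.b))
stopifnot(length(sample.types) == ncol(toy.catalog))

# Option 2: encode the type in the column names as sample_type::sample_id,
# the form required when multi.types = TRUE.
sample.ids <- paste0("S", seq_len(ncol(toy.catalog)))
colnames(toy.catalog) <- paste(sample.types, sample.ids, sep = "::")

# Total burn-in iterations with the default argument values.
burnin <- 1000
burnin.multiplier <- 10
total.burnin <- burnin * burnin.multiplier
```

With the defaults shown in Usage, `total.burnin` is 10000 iterations.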

Details

Please see our paper at https://www.biorxiv.org/content/10.1101/2022.01.31.478587v1 for suggestions on argument values to use.
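
As a small aid to reading the `gamma.alpha` and `gamma.beta` recommendations in Arguments, the sketch below compares the two recommended priors using the standard mean (shape / rate) and variance (shape / rate^2) of a gamma distribution. The helper `gamma_summary` is hypothetical and not part of `mSigHdp` or `hdpx`.

```r
# Mean and variance of a gamma distribution with the given shape and rate.
gamma_summary <- function(shape, rate) {
  c(mean = shape / rate, variance = shape / rate^2)
}

# Recommended for single-base-substitution extraction.
sbs.prior <- gamma_summary(shape = 1, rate = 20)
# Recommended for doublet-base-substitution and indel extraction.
dbs.id.prior <- gamma_summary(shape = 1, rate = 50)

print(rbind(SBS = sbs.prior, DBS.or.indel = dbs.id.prior))
```

The larger rate places more prior mass on small concentration parameters, which in a Dirichlet process generally favors fewer clusters.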

Value

Invisibly, a list with the following elements:

signature: The extracted signature profiles as a matrix; rows are mutation types, columns are signatures with high confidence.
signature.post.samp.number: A data frame with two columns. The first column corresponds to each signature in signature and the second column contains the number of posterior samples that found the raw clusters contributing to the signature.
signature.cdc: A numeric data frame. Columns correspond to signatures as in signature. Rows correspond to either biological samples or to parent and grandparent Dirichlet processes.
exposureProbs: The inferred exposures as a matrix of mutation probabilities; rows are signatures, columns are samples (e.g. tumors). This is similar to signature.cdc, but every column was normalized to sum to 1.
low.confidence.signature: The profiles of signatures extracted with low confidence as a matrix; rows are mutation types, columns are signatures with < high.confidence.prop of posterior samples.
low.confidence.post.samp.number: Analogous to signature.post.samp.number, except that this one is for signatures in low.confidence.signature.
low.confidence.cdc: Analogous to signature.cdc, except that columns in this matrix correspond to columns in low.confidence.signature.
extracted.retval: A list object returned from extract_components in package hdpx.
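
The column normalization described for exposureProbs and the `high.confidence.prop` threshold can be illustrated with toy numbers in base R. All values and names below are hypothetical; this is a sketch of the arithmetic only, not the package's internal code.

```r
# Hypothetical counts of mutations attributed to each signature (rows)
# in each sample (columns). matrix() fills column by column.
toy.counts <- matrix(
  c(30, 10, 60,
     5, 45, 50),
  nrow = 3,
  dimnames = list(c("Sig1", "Sig2", "Sig3"), c("Tumor1", "Tumor2"))
)

# Divide each column by its total so that every sample sums to 1.
toy.probs <- sweep(toy.counts, 2, colSums(toy.counts), "/")

# A signature counts as high confidence when it is found in at least
# high.confidence.prop of the posterior samples.
high.confidence.prop <- 0.9
n.post.samples <- 4000                                # hypothetical total
found.in <- c(Sig1 = 4000, Sig2 = 3700, Sig3 = 1200)  # hypothetical counts
is.high.conf <- found.in / n.post.samples >= high.confidence.prop
```

Here `Sig1` and `Sig2` pass the threshold, while `Sig3` would be reported among the low-confidence signatures.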

steverozen/mSigHdp documentation built on Feb. 6, 2023, 1:36 a.m.
