SetupAndPosterior: Generate an HDP Gibbs sampling chain from a spectra catalog

SetupAndPosterior    R Documentation

Description

Generate an HDP Gibbs sampling chain from a spectra catalog

Usage

SetupAndPosterior(
  input.catalog,
  seedNumber = 1,
  K.guess,
  multi.types = FALSE,
  verbose = TRUE,
  burnin = 5000,
  post.n = 50,
  post.space = 50,
  post.cpiter = 3,
  post.verbosity = 0,
  gamma.alpha = 1,
  gamma.beta = 20,
  burnin.multiplier = 2,
  checkpoint = TRUE
)

Arguments

input.catalog

Input spectra catalog as a matrix or in ICAMS format.

seedNumber

A random seed that ensures reproducible results.

K.guess

Suggested initial value of the number of raw clusters. Usually, the number of raw clusters is roughly twice the number of extracted signatures. Passed to hdpx::dp_activate as argument initcc.

multi.types

A logical scalar or a character vector.

If FALSE, the HDP analysis will regard all input spectra as one tumor type, and the HDP structure will have one parent node for all tumors.

If TRUE, sample IDs in input.catalog must have the form sample_type::sample_id.

If a character vector, then its length must be ncol(input.catalog), and each value is the sample type of the corresponding column in input.catalog, e.g. c(rep("Type-A", 23), rep("Type-B", 10)) for 23 Type-A samples and 10 Type-B samples.

If not FALSE, HDP will have one parent node for each sample type and one grandparent node.
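
The two non-FALSE forms of multi.types described above can be sketched in base R as follows. This is only an illustration: toy.catalog, sample.types, and the sample IDs S1, S2, ... are hypothetical names, and the counts are random placeholders rather than real spectra.

```r
# Toy catalog: 96 mutation classes (rows) by 33 samples (columns).
# The counts are random placeholders, not real data.
set.seed(1)
toy.catalog <- matrix(rpois(96 * 33, lambda = 5), nrow = 96, ncol = 33)

# Form 1: a character vector with one sample type per column.
sample.types <- c(rep("Type-A", 23), rep("Type-B", 10))
stopifnot(length(sample.types) == ncol(toy.catalog))

# Form 2: encode the type in the column names as sample_type::sample_id,
# which is the naming that multi.types = TRUE expects.
colnames(toy.catalog) <-
  paste0(sample.types, "::", "S", seq_len(ncol(toy.catalog)))

# Recover the types from the column names to check the encoding.
types.from.names <- sub("::.*$", "", colnames(toy.catalog))
print(table(types.from.names))
```

Either sample.types (as multi.types) or the renamed catalog (with multi.types = TRUE) should then describe the same grouping of 23 Type-A and 10 Type-B samples.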

verbose

If TRUE, print progress messages.

burnin

The number of burn-in iterations in one batch. The total number of burn-in iterations is burnin * burnin.multiplier.

post.n

The number of posterior samples to collect.

post.space

The number of iterations between collected samples.
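
As a rough guide to run length, the burnin, post.n, and post.space arguments above can be combined into an iteration budget. The burn-in total follows directly from the formula given for burnin. The sampling-phase figure assumes that post.space iterations are run for each of the post.n collected samples, which is an assumption about the sampler and not something stated here.

```r
# Default values from the Usage section.
burnin            <- 5000
burnin.multiplier <- 2
post.n            <- 50
post.space        <- 50

# Total burn-in iterations, as documented for the burnin argument.
total.burnin <- burnin * burnin.multiplier

# Approximate sampling-phase iterations (assumption: post.space
# iterations are run for every collected posterior sample).
approx.sampling <- post.n * post.space

cat("Burn-in iterations:          ", total.burnin, "\n")
cat("Approx. sampling iterations: ", approx.sampling, "\n")
```

With the defaults, burn-in dominates the run: 10000 burn-in iterations against roughly 2500 sampling iterations.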

post.cpiter

The number of iterations of concentration parameter sampling to perform after each iteration.

post.verbosity

Verbosity of debugging statements. No need to change except for development purposes.

gamma.alpha

Shape parameter of the gamma distribution prior for the Dirichlet process concentration parameters α_0 and all α_j in Figure B.1 of

  • https://www.repository.cam.ac.uk/bitstream/handle/1810/275454/Roberts-2018-PhD.pdf

gamma.beta

Inverse scale parameter (rate parameter) of the gamma distribution prior for the Dirichlet process concentration parameters: β_0 and all β_j in Figure B.1 of

  • https://www.repository.cam.ac.uk/bitstream/handle/1810/275454/Roberts-2018-PhD.pdf

We recommend gamma.alpha = 1 and gamma.beta = 20 for single-base-substitution signature extraction, and gamma.alpha = 1 and gamma.beta = 50 for doublet-base-substitution and indel signature extraction.
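
To see what the recommended settings imply, the sketch below uses the standard fact that a gamma distribution with shape a and rate b has mean a / b. The data frame priors and its column names are hypothetical and only serve the illustration; qgamma is base R.

```r
# Recommended prior settings from the text above.
priors <- data.frame(
  use         = c("SBS", "DBS / indel"),
  gamma.alpha = c(1, 1),    # shape
  gamma.beta  = c(20, 50)   # rate (inverse scale)
)

# Prior mean of a gamma(shape, rate) distribution is shape / rate.
priors$prior.mean <- priors$gamma.alpha / priors$gamma.beta

# Upper 95% quantile of the prior on the concentration parameter.
priors$prior.q95 <- qgamma(0.95,
                           shape = priors$gamma.alpha,
                           rate  = priors$gamma.beta)

print(priors)
```

The larger rate (gamma.beta = 50) gives a prior mean of 0.02 instead of 0.05, so it places more prior mass on small concentration parameters.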

burnin.multiplier

Run burnin.multiplier rounds of burn-in iterations. If checkpoint is TRUE, save the burn-in chain (see argument checkpoint). The diagnostic plot diagnostics.likelihood.pdf can help determine whether the chain is stationary. Burn-in can be continued from a checkpoint file with ExtendBurnin.

checkpoint

If TRUE, then

  • Checkpoint each final Gibbs sample chain to the current working directory, in a file called mSigHdp.sample.checkpoint.x.Rdata, where x depends on seedNumber.

  • Periodically checkpoint the burnin state to the current working directory, in files called mSigHdp.burnin.checkpoint.x.Rdata, where x depends on the seedNumber.
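
One way to locate the checkpoint files described above is to match their names with a regular expression. This base R sketch runs against a temporary directory of empty placeholder files. The "x" parts used here ("1" and "2") are invented for the demonstration, since the exact dependence of x on seedNumber is not specified.

```r
# Pattern matching both kinds of checkpoint file named above.
checkpoint.pattern <-
  "^mSigHdp\\.(sample|burnin)\\.checkpoint\\..+\\.Rdata$"

# Temporary directory with empty placeholder files (not real checkpoints).
demo.dir <- tempfile("checkpoint-demo")
dir.create(demo.dir)
invisible(file.create(file.path(demo.dir, c(
  "mSigHdp.sample.checkpoint.1.Rdata",
  "mSigHdp.burnin.checkpoint.2.Rdata",
  "unrelated.Rdata"
))))

# List only the files whose names look like mSigHdp checkpoints.
found <- list.files(demo.dir, pattern = checkpoint.pattern)
print(found)
```

For a real run, replace demo.dir with the working directory used for the run, for example getwd(). A real checkpoint file can then be inspected with load(), ideally into a fresh environment so that it does not overwrite existing objects.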

Value

Invisibly, an hdpSampleChain-class object as returned from hdp_posterior.


steverozen/mSigHdp documentation built on Feb. 6, 2023, 1:36 a.m.