View source: R/RunHdpxParallel.R
RunHdpxParallel | R Documentation |
Please see the vignette for an example.
RunHdpxParallel(
  input.catalog,
  seedNumber = 123,
  K.guess,
  multi.types = FALSE,
  verbose = FALSE,
  burnin = 1000,
  burnin.multiplier = 10,
  post.n = 200,
  post.space = 100,
  post.cpiter = 3,
  post.verbosity = 0,
  CPU.cores = 20,
  num.child.process = 20,
  high.confidence.prop = 0.9,
  hc.cutoff = NULL,
  merge.raw.cluster.args = hdpx::default_merge_raw_cluster_args(),
  overwrite = TRUE,
  out.dir = paste0("./RunHdpxParallel_out_", as.numeric(Sys.time())),
  gamma.alpha = 1,
  gamma.beta = 20,
  checkpoint = TRUE,
  downsample_threshold = NULL
)
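As a rough illustration of the signature above, the following sketch builds a toy catalog and a small argument list for a quick smoke test. The toy data, the row and column names, the reduced iteration counts, and the `run.it` switch are all assumptions made for this example; it also assumes the function is exported by a package named mSigHdp, which this page does not state. The call itself is guarded so the block runs even when that package is not installed.

```r
# Toy catalog: 96 mutation types (rows) x 4 samples (columns).
# Row and column names are placeholders, not real mutation-type labels.
set.seed(1)
toy.catalog <- matrix(
  rpois(96 * 4, lambda = 20),
  nrow = 96,
  dimnames = list(paste0("type", 1:96), paste0("sample", 1:4))
)

# Arguments for a quick smoke test; everything not listed keeps its default.
args <- list(
  input.catalog     = toy.catalog,
  seedNumber        = 123,
  K.guess           = 5,    # illustrative guess, not a recommendation
  burnin            = 100,  # far below the default, for speed only
  burnin.multiplier = 1,
  post.n            = 5,
  post.space        = 5,
  CPU.cores         = 1,
  num.child.process = 1,    # the documentation allows 1 for testing
  out.dir           = file.path(tempdir(), "RunHdpxParallel_toy"),
  overwrite         = TRUE
)

run.it <- FALSE  # hypothetical switch: set to TRUE to actually run the sampler
if (run.it && requireNamespace("mSigHdp", quietly = TRUE)) {
  # Assumption: RunHdpxParallel is exported by the mSigHdp package.
  retval <- do.call(mSigHdp::RunHdpxParallel, args)
  str(retval, max.level = 1)
}
```

These settings are only meant to check that a pipeline is wired up; for real analyses the defaults and the recommendations in the argument descriptions apply.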
input.catalog |
Input spectra catalog as a matrix or
in |
seedNumber |
A random seed that ensures reproducible results. |
K.guess |
Suggested initial value of the number of raw clusters. Usually, the number of raw clusters is roughly twice the number of extracted signatures. Passed to hdpx::dp_activate as argument initcc. |
multi.types |
A logical scalar or a character vector. If If If a character vector, then its length must be If not |
verbose |
If |
burnin |
The number of burn-in iterations in
one batch. The total number of burnin iterations is
|
burnin.multiplier |
Run |
post.n |
The number of posterior samples to collect. |
post.space |
The number of iterations between collected samples. |
post.cpiter |
The number of iterations of concentration parameter samplings to perform after each iteration. |
post.verbosity |
Verbosity of debugging statements. No need to change except for development purposes. |
CPU.cores |
Number of CPUs to use; this should be no
more than |
num.child.process |
Number of posterior sampling chains; can be set to 1 for testing. We recommend 20 for real data analysis. |
high.confidence.prop |
Raw clusters of mutations
found in >= |
hc.cutoff |
Deprecated, use |
merge.raw.cluster.args |
See |
overwrite |
If |
out.dir |
If not |
gamma.alpha |
Shape parameter of the gamma distribution prior for the Dirichlet process concentration parameters α_0 and all α_j in Figure B.1 of
|
gamma.beta |
Inverse scale parameter (rate parameter) of the gamma distribution prior for the Dirichlet process concentration parameters: β_0 and all β_j in Figure B.1 of
We recommend gamma.alpha = 1 and gamma.beta = 20 for single-base-substitution signature extraction, and gamma.alpha = 1 and gamma.beta = 50 for doublet-base-substitution and indel signature extraction. |
checkpoint |
If
|
downsample_threshold |
See |
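To get a feel for the sampling budget implied by the post.n, post.space and num.child.process defaults, here is a back-of-the-envelope sketch. It rests on two assumptions that the argument descriptions suggest but do not spell out: that each collected sample costs post.space iterations, and that every chain collects post.n samples. Burn-in is deliberately left out of the arithmetic.

```r
post.n            <- 200  # posterior samples to collect (default)
post.space        <- 100  # iterations between collected samples (default)
num.child.process <- 20   # number of posterior sampling chains (default)

# Assumption: each collected sample costs post.space iterations.
post.iters.per.chain <- post.n * post.space

# Assumption: every chain contributes post.n samples.
total.post.samples <- num.child.process * post.n

post.iters.per.chain  # 20000
total.post.samples    # 4000
```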
Please see our paper at https://www.biorxiv.org/content/10.1101/2022.01.31.478587v1 for suggestions on argument values to use.
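The recommended gamma.alpha and gamma.beta pairs can be compared by looking at the gamma distribution they define. The sketch below uses only base R and the standard shape/rate parameterization (mean = shape / rate); the helper name `gamma_prior_summary` is hypothetical. It describes the prior distribution itself, not how hdpx uses it internally.

```r
# Summarize a gamma distribution given its shape and rate parameters.
gamma_prior_summary <- function(gamma.alpha, gamma.beta) {
  c(
    mean = gamma.alpha / gamma.beta,
    sd   = sqrt(gamma.alpha) / gamma.beta,
    q05  = qgamma(0.05, shape = gamma.alpha, rate = gamma.beta),
    q95  = qgamma(0.95, shape = gamma.alpha, rate = gamma.beta)
  )
}

# Recommended for single-base-substitution extraction: mean 0.05.
sbs <- gamma_prior_summary(gamma.alpha = 1, gamma.beta = 20)

# Recommended for doublet-base-substitution and indel extraction: mean 0.02.
dbs.id <- gamma_prior_summary(gamma.alpha = 1, gamma.beta = 50)

print(rbind(sbs, dbs.id))
```

With the shape fixed at 1, a larger gamma.beta moves the prior mass toward smaller concentration parameters.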
Invisibly, a list with the following elements:
The extracted signature profiles as a matrix; rows are mutation types, columns are signatures with high confidence.
A data frame with two columns. The first column corresponds to each signature in signature, and the second column contains the number of posterior samples that found the raw clusters contributing to the signature.
A numeric data frame. Columns correspond to signatures as in signature. Rows correspond to either biological samples or to parent and grandparent Dirichlet processes.
The inferred exposures as a matrix of mutation probabilities; rows are signatures, columns are samples (e.g. tumors). This is similar to signature.cdc, but every column was normalized to sum to 1.
The profiles of signatures extracted with low confidence as a matrix; rows are mutation types, columns are signatures with < high.confidence.prop of posterior samples.
Analogous to signature.post.samp.number, except that this one is for signatures in low.confidence.signature.
Analogous to signature.cdc, except that columns in this matrix correspond to columns in low.confidence.signature.
A list object returned from extract_components in package hdpx.
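Two ideas in the descriptions above lend themselves to a small worked example: splitting signatures by high.confidence.prop, and normalizing an exposure matrix so each column sums to 1. The sketch below uses mock data only. The column names `n.samples`, `prop` and `high.confidence`, the signature and tumor labels, and the choice of the total number of posterior samples as denominator are assumptions for illustration; the function performs its own version of these steps internally.

```r
high.confidence.prop <- 0.9
total.post.samples   <- 4000  # assumed denominator: 20 chains x 200 samples

# Mock stand-in for a "signature, number of posterior samples" data frame.
post.samp <- data.frame(
  signature = c("sigA", "sigB", "sigC"),
  n.samples = c(3980, 3700, 1200)
)

# Proportion of posterior samples supporting each signature.
post.samp$prop <- post.samp$n.samples / total.post.samples
post.samp$high.confidence <- post.samp$prop >= high.confidence.prop

# Mock exposure counts: rows are signatures, columns are samples.
# matrix() fills column by column, so tumor1 = (30, 10, 60).
exposure.counts <- matrix(
  c(30, 10, 60,
     5, 45, 50),
  nrow = 3,
  dimnames = list(post.samp$signature, c("tumor1", "tumor2"))
)

# Divide each column by its total so every column sums to 1.
exposure.probs <- sweep(exposure.counts, 2, colSums(exposure.counts), "/")

print(post.samp)
print(exposure.probs)
```

In this mock data sigA and sigB clear the 0.9 threshold, while sigC (0.3 of posterior samples) would fall among the low-confidence signatures.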