View source: R/sig_fit_bootstrap.R
sig_fit_bootstrap | R Documentation |
This function can be used to estimate the confidence of signature exposures or to search for a suboptimal decomposition solution.
sig_fit_bootstrap(
  catalog,
  sig,
  n = 100L,
  sig_index = NULL,
  sig_db = "legacy",
  db_type = c("", "human-exome", "human-genome"),
  show_index = TRUE,
  method = c("QP", "NNLS", "SA"),
  auto_reduce = FALSE,
  SA_not_bootstrap = FALSE,
  type = c("absolute", "relative"),
  rel_threshold = 0,
  mode = c("SBS", "DBS", "ID", "copynumber"),
  find_suboptimal = FALSE,
  suboptimal_ref_error = NULL,
  suboptimal_factor = 1.05,
  ...
)
catalog |
a named numeric vector or a numeric matrix with dimension Nx1. N is the number of components; 1 is the single sample. |
sig |
a |
n |
the number of bootstrap replicates. |
sig_index |
a vector of signature indices. "ALL" for all signatures. |
sig_db |
default 'legacy'; it can be 'legacy' (for COSMIC v2 'SBS'),
'SBS', 'DBS', 'ID' and 'TSB' (for COSMIC v3.1 signatures)
for small-scale mutations.
For more specific details, it can also be 'SBS_hg19', 'SBS_hg38',
'SBS_mm9', 'SBS_mm10', 'DBS_hg19', 'DBS_hg38', 'DBS_mm9', 'DBS_mm10' to use
COSMIC v3 reference signatures from Alexandrov, Ludmil B., et al. (2020) (reference #1).
In addition, it can be one of "SBS_Nik_lab_Organ", "RS_Nik_lab_Organ",
"SBS_Nik_lab", "RS_Nik_lab" to refer to reference signatures from
Degasperi, Andrea, et al. (2020) (reference #2);
"RS_BRCA560", "RS_USARC" to refer to reference signatures from the BRCA560 and USARC cohorts;
"CNS_USARC" (40 categories), "CNS_TCGA" (48 categories) to refer to copy number signatures from the USARC cohort and TCGA;
"CNS_TCGA176" (176 categories) and "CNS_PCAWG176" (176 categories) to refer to copy number signatures from TCGA and PCAWG, respectively.
UPDATE: the latest version of the reference signatures can be automatically
downloaded and loaded from https://cancer.sanger.ac.uk/signatures/downloads/
when an option with |
db_type |
only used when |
show_index |
if |
method |
method to solve the minimization problem. 'NNLS' for non-negative least squares; 'QP' for quadratic programming; 'SA' for simulated annealing. |
auto_reduce |
if |
SA_not_bootstrap |
if |
type |
'absolute' for signature exposure and 'relative' for signature relative exposure. |
rel_threshold |
numeric vector, a signature with relative exposure
lower than (equal is included, i.e. |
mode |
signature type for plotting, now supports 'copynumber', 'SBS', 'DBS', 'ID' and 'RS' (genome rearrangement signature). |
find_suboptimal |
logical, if |
suboptimal_ref_error |
baseline error used for finding a suboptimal solution.
If it is |
suboptimal_factor |
suboptimal factor to get suboptimal error, default is |
... |
control parameters passed to argument |
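To make the `type` and `rel_threshold` arguments more concrete, the following is a minimal base R sketch of how absolute exposures relate to relative exposures and how a relative threshold might be applied. The exposure values are invented, and setting filtered signatures to zero is an assumption made for illustration; the package may handle them differently.

```r
# Hypothetical absolute exposures for three signatures in one sample
expo_abs <- c(sig1 = 80, sig2 = 16, sig3 = 4)

# Relative exposure: each signature's share of the total exposure
expo_rel <- expo_abs / sum(expo_abs)

# Assumed rule: a signature whose relative exposure is lower than
# or equal to the threshold is set to zero
rel_threshold <- 0.05
expo_rel_filtered <- expo_rel
expo_rel_filtered[expo_rel <= rel_threshold] <- 0

print(expo_rel)
print(expo_rel_filtered)
```

With the default `rel_threshold = 0`, only signatures that already have zero relative exposure would be affected under this rule.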
a list
Huang X, Wojtowicz D, Przytycka TM. Detecting presence of mutational signatures in cancer with confidence. Bioinformatics. 2018;34(2):330–337. doi:10.1093/bioinformatics/btx604
report_bootstrap_p_value, sig_fit, sig_fit_bootstrap_batch
# This function is designed for processing
# one sample, so it is not very useful in practice;
# please check `sig_fit_bootstrap_batch`
# For general purpose -------------------
W <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)
colnames(W) <- c("sig1", "sig2")
W <- apply(W, 2, function(x) x / sum(x))
H <- matrix(c(2, 5, 3, 6, 1, 9, 1, 2), ncol = 4)
colnames(H) <- paste0("samp", 1:4)
V <- W %*% H
V
if (requireNamespace("quadprog", quietly = TRUE)) {
  H_bootstrap <- sig_fit_bootstrap(V[, 1], W, n = 10, type = "absolute")
  ## Typically, you have to run many times to get close to the answer
  boxplot(t(H_bootstrap$expo))
  H[, 1]

  ## Return P values
  ## In practice, run times >= 100
  ## is recommended
  report_bootstrap_p_value(H_bootstrap)

  ## For multiple samples
  ## Input a list
  report_bootstrap_p_value(list(samp1 = H_bootstrap, samp2 = H_bootstrap))

  # ## Find suboptimal decomposition
  # H_suboptimal <- sig_fit_bootstrap(V[, 1], W,
  #   n = 10,
  #   type = "absolute",
  #   method = "SA",
  #   find_suboptimal = TRUE
  # )
}
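For readers who want to see the general idea behind bootstrap-based exposure confidence without installing anything, here is a rough base R sketch. It is not the package's implementation: the multinomial resampling scheme, the use of `optim()` as a stand-in non-negative least squares solver, and the helper names `fit_nnls` and `resample_catalog` are all assumptions made for illustration.

```r
set.seed(42)

# Toy signature matrix (components x signatures); each column sums to 1
W <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)
W <- apply(W, 2, function(x) x / sum(x))
colnames(W) <- c("sig1", "sig2")

# Toy catalog for one sample: mutation counts per component
catalog <- c(30, 45, 60)

# Hypothetical helper: non-negative least squares via box-constrained optim()
fit_nnls <- function(v, W) {
  obj <- function(h) sum((v - W %*% h)^2)
  start <- rep(sum(v) / ncol(W), ncol(W))
  res <- optim(start, obj, method = "L-BFGS-B", lower = 0)
  setNames(res$par, colnames(W))
}

# Hypothetical helper: draw one bootstrap catalog with the same total count
resample_catalog <- function(v) {
  as.vector(rmultinom(1, size = sum(v), prob = v / sum(v)))
}

# Refit exposures on each resampled catalog (signatures x replicates)
n <- 100
expo <- replicate(n, fit_nnls(resample_catalog(catalog), W))

# Percentile intervals across replicates, one column per signature
ci <- apply(expo, 1, quantile, probs = c(0.025, 0.5, 0.975))
print(ci)
```

The spread of each signature's exposure across replicates gives a sense of how stable the fit is for that sample; a signature whose interval includes zero is a weaker candidate for being truly present.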