sig_fit | R Documentation |
The function performs a signature decomposition of a given mutational catalogue V with known signatures W by solving the minimization problem min(||W*H - V||), where W and V are known and H is the exposure matrix to be estimated.
sig_fit(
catalogue_matrix,
sig,
sig_index = NULL,
sig_db = c("legacy", "SBS", "DBS", "ID", "TSB", "SBS_Nik_lab", "RS_Nik_lab",
"RS_BRCA560", "RS_USARC", "CNS_USARC", "CNS_TCGA", "CNS_TCGA176", "CNS_PCAWG176",
"SBS_hg19", "SBS_hg38", "SBS_mm9", "SBS_mm10", "DBS_hg19", "DBS_hg38", "DBS_mm9",
"DBS_mm10", "SBS_Nik_lab_Organ", "RS_Nik_lab_Organ", "latest_SBS_GRCh37",
"latest_DBS_GRCh37", "latest_ID_GRCh37", "latest_SBS_GRCh38", "latest_DBS_GRCh38",
"latest_SBS_mm9", "latest_DBS_mm9", "latest_SBS_mm10", "latest_DBS_mm10",
"latest_SBS_rn6", "latest_DBS_rn6", "latest_CN_GRCh37",
"latest_RNA-SBS_GRCh37", "latest_SV_GRCh38"),
db_type = c("", "human-exome", "human-genome"),
show_index = TRUE,
method = c("QP", "NNLS", "SA"),
auto_reduce = FALSE,
type = c("absolute", "relative"),
return_class = c("matrix", "data.table"),
return_error = FALSE,
rel_threshold = 0,
mode = c("SBS", "DBS", "ID", "copynumber"),
true_catalog = NULL,
...
)
catalogue_matrix |
a numeric matrix |
sig |
a |
sig_index |
a vector for signature index. "ALL" for all signatures. |
sig_db |
default is 'legacy'. It can be 'legacy' (for COSMIC v2 'SBS'),
or 'SBS', 'DBS', 'ID' and 'TSB' (for COSMIC v3.1 signatures)
for small-scale mutations.
For more specific choices, it can also be 'SBS_hg19', 'SBS_hg38',
'SBS_mm9', 'SBS_mm10', 'DBS_hg19', 'DBS_hg38', 'DBS_mm9', 'DBS_mm10' to use
COSMIC v3 reference signatures from Alexandrov, Ludmil B., et al. (2020) (reference #1).
In addition, it can be one of "SBS_Nik_lab_Organ", "RS_Nik_lab_Organ",
"SBS_Nik_lab", "RS_Nik_lab" to refer to reference signatures from
Degasperi, Andrea, et al. (2020) (reference #2);
"RS_BRCA560", "RS_USARC" to refer to reference signatures from the BRCA560 and USARC cohorts;
"CNS_USARC" (40 categories), "CNS_TCGA" (48 categories) to refer to copy number signatures from the USARC cohort and TCGA;
"CNS_TCGA176" (176 categories) and "CNS_PCAWG176" (176 categories) to refer to copy number signatures from TCGA and PCAWG, respectively.
UPDATE: the latest version of the reference signatures can be automatically
downloaded and loaded from https://cancer.sanger.ac.uk/signatures/downloads/
when an option with |
db_type |
only used when |
show_index |
if |
method |
method to solve the minimization problem. 'NNLS' for non-negative least squares; 'QP' for quadratic programming; 'SA' for simulated annealing. |
auto_reduce |
if |
type |
'absolute' for signature exposure and 'relative' for signature relative exposure. |
return_class |
string, 'matrix' or 'data.table'. |
return_error |
if |
rel_threshold |
numeric vector, a signature with relative exposure
lower than (equal is included, i.e. |
mode |
signature type for plotting, now supports 'copynumber', 'SBS', 'DBS', 'ID' and 'RS' (genome rearrangement signature). |
true_catalog |
used internally by sig_fit_bootstrap; users should not set it. |
... |
control parameters passing to argument |
The method 'NNLS' solves the minimization problem with non-negative least-squares constraints. The methods 'QP' and 'SA' are modified from the SignatureEstimation package. See references for details. Of note, when fitting exposures for copy number signatures, only components of feature CN are used.
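To make the minimization concrete, the sketch below fits non-negative exposures for a toy catalogue using only base R. It is an illustration of the problem min(||W*H - V||) with H >= 0, not the code used by sig_fit: the helper name fit_one_sample is hypothetical, and the solver (optim with the box-constrained "L-BFGS-B" method) is swapped in for simplicity instead of the NNLS, QP or SA routines described above.

```r
# Toy inputs with the same shapes as the general-purpose example:
# W is (components x signatures), H is (signatures x samples).
W <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)
W <- apply(W, 2, function(x) x / sum(x))
H <- matrix(c(2, 5, 3, 6, 1, 9, 1, 2), ncol = 4)
V <- W %*% H

# Hypothetical helper (not part of any package): estimate one sample's
# exposures h >= 0 by minimizing the squared error ||W h - v||^2.
fit_one_sample <- function(v, W) {
  obj <- function(h) sum((W %*% h - v)^2)
  grad <- function(h) as.vector(2 * crossprod(W, W %*% h - v))
  optim(
    par = rep(1, ncol(W)), fn = obj, gr = grad,
    method = "L-BFGS-B", lower = 0 # lower = 0 enforces non-negativity
  )$par
}

# Fit every sample (column of V) independently.
H_hat <- apply(V, 2, fit_one_sample, W = W)
round(H_hat, 2)
```

Because V was generated exactly as W %*% H, the estimated H_hat should be close to the H used to build it. Real catalogues contain noise, so the fit is only approximate there.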
The exposure result, in either matrix or data.table format.
If return_error is set to TRUE, a list is returned.
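As a rough illustration of how absolute and relative exposures relate, the base R sketch below rescales each sample's exposures so that they sum to 1. This is an assumption about what 'relative' means here (per-sample proportions); the toy matrix H is the one used in the examples and is not output from sig_fit.

```r
# Absolute exposures (signatures x samples), using the toy H from the examples.
H <- matrix(
  c(2, 5, 3, 6, 1, 9, 1, 2),
  ncol = 4,
  dimnames = list(c("sig1", "sig2"), paste0("samp", 1:4))
)

# Assumed definition: divide each column (sample) by its total exposure.
H_rel <- sweep(H, 2, colSums(H), "/")
H_rel
colSums(H_rel)
```

Under this assumption, every column of H_rel sums to 1, so samples with very different mutation counts can be compared on the same scale.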
Daniel Huebschmann, Zuguang Gu and Matthias Schlesner (2019). YAPSA: Yet Another Package for Signature Analysis. R package version 1.12.0.
Huang X, Wojtowicz D, Przytycka TM. Detecting presence of mutational signatures in cancer with confidence. Bioinformatics. 2018;34(2):330–337. doi:10.1093/bioinformatics/btx604
Kim, Jaegil, et al. "Somatic ERCC2 mutations are associated with a distinct genomic signature in urothelial tumors." Nature genetics 48.6 (2016): 600.
sig_extract, sig_auto_extract, sig_fit_bootstrap, sig_fit_bootstrap_batch
# For mutational signatures ----------------
# SBS is used for illustration, similar
# operations can be applied to DBS, INDEL, CN, RS, etc.
# Load simulated data
data("simulated_catalogs")
data = simulated_catalogs$set1
data[1:5, 1:5]
# Fitting with all COSMIC v2 reference signatures
sig_fit(data, sig_index = "ALL")
# Check ?sig_fit for sig_db options
# e.g., use the COSMIC SBS v3
sig_fit(data, sig_index = "ALL", sig_db = "SBS")
# Fitting with specified signatures
# opt 1. use selected reference signatures
sig_fit(data, sig_index = c(1, 5, 9, 2, 13), sig_db = "SBS")
# opt 2. use user specified signatures
ref = get_sig_db()$db
ref[1:5, 1:5]
ref = ref[, 1:10]
# The `sig` used here can be result object from `sig_extract`
# or any reference matrix with similar structure (96-motif)
v1 = sig_fit(data, sig = ref)
v1
# If possible, auto-reduce the reference signatures
# for better fitting data from a sample
v2 = sig_fit(data, sig = ref, auto_reduce = TRUE)
v2
all.equal(v1, v2)
# For some samples, signatures are reported as dropped,
# but their original activity values were already 0,
# so the data remain the same (0 -> 0)
all.equal(v1[, 2], v2[, 2])
# For COSMIC_10, 6.67638 -> 0
v1[, 4]; v2[, 4]
all.equal(v1[, 4], v2[, 4])
# For general purpose -----------------------
W <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)
colnames(W) <- c("sig1", "sig2")
W <- apply(W, 2, function(x) x / sum(x))
H <- matrix(c(2, 5, 3, 6, 1, 9, 1, 2), ncol = 4)
colnames(H) <- paste0("samp", 1:4)
V <- W %*% H
V
if (requireNamespace("quadprog", quietly = TRUE)) {
H_infer <- sig_fit(V, W, method = "QP")
H_infer
H
H_dt <- sig_fit(V, W, method = "QP", auto_reduce = TRUE, return_class = "data.table")
H_dt
## Show results
show_sig_fit(H_infer)
show_sig_fit(H_dt)
## Get clusters/groups
H_dt_rel <- sig_fit(V, W, return_class = "data.table", type = "relative")
z <- get_groups(H_dt_rel, method = "k-means")
show_groups(z)
}
# if (requireNamespace("GenSA", quietly = TRUE)) {
# H_infer <- sig_fit(V, W, method = "SA")
# H_infer
# H
#
# H_dt <- sig_fit(V, W, method = "SA", return_class = "data.table")
# H_dt
#
# ## Modify arguments to method
# sig_fit(V, W, method = "SA", maxit = 10, temperature = 100)
#
# ## Show results
# show_sig_fit(H_infer)
# show_sig_fit(H_dt)
# }