SynSigGen: Create Catalogs of Synthetic Mutational Spectra

#' @title SynSigGen
#'
#' @description Create catalogs of synthetic mutational spectra for
#' assessing the performance of mutational-signature analysis programs.
#'
#' @section Overview:
#'
#' The main focus is generating synthetic catalogs of mutational
#' spectra (mutations in tumors) based on known mutational signature
#' profiles and software-inferred exposures (the software's estimates of
#' the number of mutations induced by each mutational signature in each
#' tumor) in the PCAWG7 data. We broadly call this kind of synthetic data
#' "reality-based" synthetic data.
#' The package also has a set of functions that generate
#' random mutational signature profiles and then create synthetic
#' mutational spectra based on these random signature profiles. We
#' call this kind of synthetic data "random" synthetic data, while
#' pointing out that much depends on the distributions from which
#' the random signature profiles and attributions are generated.
#'
#' @section Workflow for generating "reality-based" synthetic mutational spectra:
#'
#' A typical workflow for generating "reality-based" synthetic mutational
#' spectra is as follows. \enumerate{
#'
#' \item Input (based on SignatureAnalyzer or SigProfiler analysis of PCAWG tumors): \itemize{
#'   \item \code{E}, a matrix of software-inferred exposures of mutational signatures (signatures x samples);
#'   \item \code{S}, a matrix of mutational signature profiles (mutation types x signatures).
#' }
#'
#' \item Obtain distribution parameters from software-inferred exposures \preformatted{
#'   P <- GetSynSigParamsFromExposures(E, ...)
#' }
#'
#' \item Generate exposures for synthetic mutational spectra based on \code{P} \preformatted{
#'   synthetic.exposures <- GenerateSyntheticExposures(P, ...)
#' }
#'
#' \item Generate synthetic mutational spectra by multiplying \code{S} and \code{synthetic.exposures}
#' and rounding the product to the nearest integer: \preformatted{
#'   synthetic.spectra <- CreateAndWriteCatalog(S, synthetic.exposures, ...)
#' }
#'
#' }
#'
#' @section Workflow for generating "random" synthetic mutational spectra:
#'
#' The top-level function for generating "random" synthetic mutational spectra is
#' \code{\link{CreateRandomSyn}}. It adopts the following steps to generate
#' catalogs of "random" synthetic mutational spectra. \enumerate{
#'
#' \item Create random mutational signature profiles: \preformatted{
#'   S <- CreateRandomMutSigProfiles(...)
#' }
#'
#' \item Generate distribution parameters for exposures of random signatures: \preformatted{
#'   P <- CreateMeanAndStdevForSigs(sig.names = colnames(S), ...)
#' }
#'
#' \item Create exposures for mutational signatures based on \code{P} and other
#' parameters: \preformatted{
#'   synthetic.exposures <- CreateRandomExposures(sigs = S, per.sig.mean.and.sd = P)
#' }
#'
#' \item Generate synthetic mutational spectra by multiplying \code{S} and \code{synthetic.exposures}
#' and rounding the product to the nearest integer: \preformatted{
#'   synthetic.spectra <- NewCreateAndWriteCatalog(S, synthetic.exposures, ...)
#' }
#'
#' }
#'
#' @section Function for generating "SBS1-SBS5-correlated" synthetic mutational spectra:
#'
#' \code{CreateSBS1SBS5CorrelatedSyntheticData} is the top-level function for
#' generating 20 data sets, each with only two active signatures (SBS1 and SBS5)
#' whose exposures are positively correlated.
#'
#' This function generates the synthetic mutational spectra used in the paper
#' "Performance of Mutational Signature Software on Correlated Signatures".
#'
#'
#'
#' @section Functions for generating synthetic tumor spectra used in the paper \emph{The repertoire of mutational signatures in human cancer}:
#'
#' \emph{The repertoire of mutational signatures in human cancer} (https://doi.org/10.1038/s41586-020-1943-3)
#' evaluates the performance of two computational approaches
#' (\code{SigProfiler} and \code{SignatureAnalyzer}) on 11 synthetic data sets
#' (Synapse ID: syn18497223). \enumerate{
#'
#' \item Function \code{\link{PancAdenoCA1000}} creates a data set of 1,000 synthetic
#' pancreatic adenocarcinoma spectra (syn18500212).
#'
#' \item Script \preformatted{
#'
#'
#' } creates 2,700 synthetic spectra (syn18500213). This data set consists of 9 cancer types
#' each with 300 synthetic tumors: \itemize{
#' \item bladder transitional cell carcinoma,
#' \item oesophageal adenocarcinoma,
#' \item breast adenocarcinoma,
#' \item lung squamous cell carcinoma,
#' \item renal cell carcinoma, 
#' \item ovarian adenocarcinoma,
#' \item osteosarcoma,
#' \item cervical adenocarcinoma and
#' \item stomach adenocarcinoma.}
#'
#' \item Function \code{\link{RCCOvary1000}} creates a data set consisting of
#' 500 synthetic kidney (RCC) spectra with high prevalence and mutation load from
#' SBS5 and SBS40, and 500 synthetic ovarian adenocarcinoma spectra with
#' high prevalence and mutation load from SBS3.
#'
#' \strong{Notes:} \itemize{
#' \item Mutation loads from other mutational signatures (besides SBS3, SBS5, and SBS40)
#' also exist in the data set created by function \code{\link{RCCOvary1000}};
#'
#' \item SBS3, SBS5, and SBS40 are flat signatures. This data set tests how accurately
#' computational approaches can separate these three mutational signatures, because a
#' mixture of SBS5 and SBS40 can resemble SBS3.
#' }
#'
#' \item Function \code{\link{Create.3.5.40.Abstract}} creates 1000 synthetic spectra all constructed
#' entirely from SBS3, SBS5, and SBS40, using mutational loads modelled on kidney-RCC
#' (SBS5 and SBS40) and ovarian adenocarcinoma (SBS3). Most synthetic spectra have contributions
#' from all three signatures.
#'
#' }
#'
#'
#' @docType package
#' @name SynSigGen
#'
NULL
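
The "reality-based" workflow documented above can be sketched in base R as follows. This is only an illustration under stated assumptions: the helpers `get_params` and `generate_exposures` are hypothetical stand-ins, not the package's `GetSynSigParamsFromExposures` or `GenerateSyntheticExposures`; the toy matrices `S` and `E` are made up; and the choice of a per-signature prevalence plus a log10-normal distribution for nonzero exposures is an assumption, not something the documentation states.

```r
# Illustrative base-R sketch only; these helpers are hypothetical and are NOT
# the SynSigGen implementations of the functions named in the workflow.
set.seed(1)

# Toy inputs (made up for illustration).
# S: mutation types x signatures; each column sums to 1.
S <- matrix(c(0.7, 0.2, 0.1,
              0.1, 0.3, 0.6),
            nrow = 3,
            dimnames = list(c("type1", "type2", "type3"), c("sigA", "sigB")))
# E: signatures x samples; stands in for software-inferred exposures.
E <- matrix(c(100,   0, 250, 400,
               50, 500,   0, 300),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("sigA", "sigB"), paste0("tumor", 1:4)))

# Step 2 (assumed model): per-signature prevalence (fraction of samples with a
# nonzero exposure) and mean / sd of log10 exposure among those samples.
get_params <- function(E) {
  t(apply(E, 1, function(x) {
    pos <- log10(x[x > 0])
    c(prevalence = mean(x > 0), mean.log10 = mean(pos), sd.log10 = sd(pos))
  }))
}

# Step 3: draw synthetic exposures (signatures x samples) from the parameters.
generate_exposures <- function(P, n.samples) {
  out <- t(apply(P, 1, function(p) {
    present <- rbinom(n.samples, size = 1, prob = p[["prevalence"]])
    counts  <- 10^rnorm(n.samples, mean = p[["mean.log10"]], sd = p[["sd.log10"]])
    present * counts
  }))
  colnames(out) <- paste0("synthetic", seq_len(n.samples))
  out
}

# Step 4: spectra = S %*% exposures, rounded to whole mutation counts.
P <- get_params(E)
synthetic.exposures <- generate_exposures(P, n.samples = 10)
synthetic.spectra <- round(S %*% synthetic.exposures)
```

Because each column of `S` sums to 1, the total mutation count of each synthetic spectrum matches the summed exposures of that sample up to rounding error.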

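For the "SBS1-SBS5-correlated" data sets described in the documentation above, one simple way to obtain two signatures with positively correlated exposures is sketched below. This is a hypothetical construction, not the algorithm used by `CreateSBS1SBS5CorrelatedSyntheticData`; the log10-normal model and all parameter values (means, standard deviations, offset, number of samples) are assumptions chosen only for illustration.

```r
# Hypothetical sketch; NOT the algorithm used by
# CreateSBS1SBS5CorrelatedSyntheticData. All parameter values are made up.
set.seed(2)
n.samples <- 500

# Draw log10 exposures for one signature, then derive the second as the first
# plus an offset and independent noise, which makes the two positively
# correlated. A larger noise sd gives a weaker correlation.
log10.sbs5 <- rnorm(n.samples, mean = 3, sd = 0.3)
log10.sbs1 <- log10.sbs5 - 0.5 + rnorm(n.samples, mean = 0, sd = 0.2)

# Exposure matrix in the signatures x samples layout used in the workflows.
exposures <- rbind(SBS1 = round(10^log10.sbs1),
                   SBS5 = round(10^log10.sbs5))
colnames(exposures) <- paste0("synthetic", seq_len(n.samples))

# Pearson correlation of the log10 exposures of the two signatures.
r <- cor(log10.sbs1, log10.sbs5)
```

Multiplying the two corresponding signature profiles by `exposures` and rounding, as in the last step of the workflows, would then yield spectra with only these two signatures active.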
steverozen/SynSigGen documentation built on April 1, 2022, 8:54 p.m.
