View source: R/shiny_related_functions.R

StrelkaSBSVCFFilesToZipFile    R Documentation

Create a zip file which contains catalogs and plot PDFs from Strelka SBS VCF files

Description

Create 3 SBS catalogs (96, 192, 1536) and 3 DBS catalogs (78, 136, 144) from the Strelka SBS VCFs in the directory specified by dir, save the catalogs as CSV files, plot them to PDF, and generate a zip archive of all the output files. The function finds adjacent SBS pairs and merges them into DBSs if their VAFs are very similar. The default threshold value for VAF is 0.02.
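
The merging rule described above can be illustrated with a minimal base R sketch. This is not the ICAMS implementation: the helper name FindCandidateDBSPairs, the column names CHROM, POS, REF, ALT and VAF, and the use of an absolute VAF difference with a "less than or equal" comparison are all assumptions made for illustration.

```r
# Hypothetical helper (not part of ICAMS): report adjacent SBS pairs whose
# VAFs differ by at most max.vaf.diff, as candidate DBSs.
FindCandidateDBSPairs <- function(sbs, max.vaf.diff = 0.02) {
  # Sort so that neighbouring positions end up in consecutive rows
  sbs <- sbs[order(sbs$CHROM, sbs$POS), ]
  n <- nrow(sbs)
  first  <- sbs[-n, ]  # every row except the last
  second <- sbs[-1, ]  # every row except the first
  # A candidate pair: same chromosome, positions differ by 1, similar VAF
  is.pair <- first$CHROM == second$CHROM &
    (second$POS - first$POS) == 1 &
    abs(first$VAF - second$VAF) <= max.vaf.diff
  data.frame(CHROM = first$CHROM[is.pair],
             POS   = first$POS[is.pair],
             REF   = paste0(first$REF[is.pair], second$REF[is.pair]),
             ALT   = paste0(first$ALT[is.pair], second$ALT[is.pair]),
             stringsAsFactors = FALSE)
}

# Made-up variants: only chr1:100 and chr1:101 are adjacent with similar VAFs
sbs <- data.frame(CHROM = c("1", "1", "1", "2"),
                  POS   = c(100L, 101L, 200L, 101L),
                  REF   = c("A", "C", "G", "T"),
                  ALT   = c("T", "G", "A", "C"),
                  VAF   = c(0.30, 0.31, 0.50, 0.30),
                  stringsAsFactors = FALSE)
pairs <- FindCandidateDBSPairs(sbs)
```

The sketch only compares consecutive rows, so a run of three or more adjacent SBSs would yield overlapping candidate pairs; how ICAMS treats such runs is not covered here.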

Usage

StrelkaSBSVCFFilesToZipFile(
  dir,
  zipfile,
  ref.genome,
  trans.ranges = NULL,
  region = "unknown",
  names.of.VCFs = NULL,
  base.filename = "",
  return.annotated.vcfs = FALSE,
  suppress.discarded.variants.warnings = TRUE
)

Arguments

dir

Pathname of the directory which contains only the Strelka SBS VCF files. Each Strelka SBS VCF must have a file extension ".vcf" (case insensitive) and share the same ref.genome and region.

zipfile

Pathname of the zip file to be created.

ref.genome

A ref.genome argument as described in ICAMS.

trans.ranges

Optional. If ref.genome specifies one of the following BSgenome objects:

  1. BSgenome.Hsapiens.1000genomes.hs37d5

  2. BSgenome.Hsapiens.UCSC.hg38

  3. BSgenome.Mmusculus.UCSC.mm10

then the function will infer trans.ranges automatically. Otherwise, the user must provide the necessary trans.ranges. Please refer to TranscriptRanges for more details. If trans.ranges is NULL and cannot be inferred, transcript range information is not added.

region

A character string designating a genomic region; see as.catalog and ICAMS.

names.of.VCFs

Optional. Character vector of names of the VCF files. The order of names in names.of.VCFs should match the order of the VCFs listed in dir. If NULL (default), the names are derived from the file paths: everything up to and including the last path separator (if any) is removed, and the remaining file names without their extensions (and the leading dot) are used as the names of the VCF files.
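
As an illustration of the default naming rule, the following base R sketch lists ".vcf" files case-insensitively and strips the directory and extension. It is only an approximation of the documented behaviour; the scratch directory, the placeholder file names, and the choice of list.files, basename and tools::file_path_sans_ext are assumptions, not necessarily what ICAMS calls internally.

```r
# Scratch directory with two empty placeholder files (illustration only)
demo.dir <- file.path(tempdir(), "vcf-name-demo")
dir.create(demo.dir, showWarnings = FALSE)
file.create(file.path(demo.dir, c("sampleA.vcf", "sampleB.VCF")))

# Match the ".vcf" extension regardless of case
vcf.paths <- list.files(demo.dir, pattern = "\\.vcf$",
                        ignore.case = TRUE, full.names = TRUE)

# Drop the directory part, then the extension and its leading dot
default.names <- tools::file_path_sans_ext(basename(vcf.paths))

# Clean up the scratch directory
unlink(demo.dir, recursive = TRUE)
```

Because the names follow the order in which the files are listed, a user-supplied names.of.VCFs should be checked against that same listing order.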

base.filename

Optional. The base name of the CSV and PDF files to be produced; multiple files will be generated, each ending in x.csv or x.pdf, where x indicates the type of catalog.

return.annotated.vcfs

Logical. Whether to return the annotated VCFs with additional columns showing mutation class for each variant. Default is FALSE.

suppress.discarded.variants.warnings

Logical. Whether to suppress warning messages showing information about the discarded variants. Default is TRUE.

Details

This function calls StrelkaSBSVCFFilesToCatalog, PlotCatalogToPdf, WriteCatalog and zip::zipr.

Value

A list containing the following objects:

  • catSBS96, catSBS192, catSBS1536: Matrix of 3 SBS catalogs (one each for 96, 192, and 1536).

  • catDBS78, catDBS136, catDBS144: Matrix of 3 DBS catalogs (one each for 78, 136, and 144).

  • discarded.variants: Non-NULL only if there are variants that were excluded from the analysis. See the extra column discarded.reason for more details.

  • annotated.vcfs: Non-NULL only if return.annotated.vcfs = TRUE. A list of elements:

    • SBS: SBS VCF annotated by AnnotateSBSVCF with three new columns SBS96.class, SBS192.class and SBS1536.class showing the mutation class for each SBS variant.

    • DBS: DBS VCF annotated by AnnotateDBSVCF with three new columns DBS78.class, DBS136.class and DBS144.class showing the mutation class for each DBS variant.

If trans.ranges is not provided by the user and cannot be inferred by ICAMS, the SBS 192 and DBS 144 catalogs will not be generated. Each catalog has attributes added. See as.catalog for more details.
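
Since some elements of the returned list may be absent, it can help to check for them before use. The sketch below works on a hand-built stand-in for the return value; mock.result and the helper HasElement are hypothetical, and whether ICAMS stores a missing catalog as NULL or omits it from the list is not stated here, so the check is written to handle both cases.

```r
# Hypothetical stand-in for the list returned by the function
mock.result <- list(
  catSBS96  = matrix(0, nrow = 96, ncol = 2),
  catSBS192 = NULL,                      # e.g. no trans.ranges available
  catDBS78  = matrix(0, nrow = 78, ncol = 2),
  discarded.variants = NULL
)

# TRUE only if the element exists and is not NULL; x[[name]] on a list
# returns NULL for both a NULL element and a missing name
HasElement <- function(x, name) !is.null(x[[name]])

has.sbs96  <- HasElement(mock.result, "catSBS96")
has.sbs192 <- HasElement(mock.result, "catSBS192")
has.dbs144 <- HasElement(mock.result, "catDBS144")  # name not in the list
```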

Note

SBS 192 and DBS 144 catalogs include only mutations in transcribed regions.

Comments

To add or change attributes of a catalog, use the function attr.
For example, attr(catalog, "abundance") <- custom.abundance.
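
A small self-contained illustration of setting an attribute with attr follows. The matrix here is a plain stand-in rather than a real ICAMS catalog, and the row names and values of custom.abundance are made up; a real abundance would need to match the format described in as.catalog.

```r
# Plain matrix standing in for a catalog (illustration only)
catalog <- matrix(1:6, nrow = 3,
                  dimnames = list(c("ACAA", "ACCA", "ACGA"),
                                  c("sample1", "sample2")))

# Made-up abundance values
custom.abundance <- c(ACA = 100, ACC = 50, ACG = 10)

# Attach (or overwrite) the attribute, then read it back
attr(catalog, "abundance") <- custom.abundance
stored.abundance <- attr(catalog, "abundance")
```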

Examples

# Directory holding the example Strelka SBS VCFs shipped with ICAMS
dir <- system.file("extdata/Strelka-SBS-vcf", package = "ICAMS")

# Run only if the reference genome package is installed
if (requireNamespace("BSgenome.Hsapiens.1000genomes.hs37d5", quietly = TRUE)) {
  catalogs <-
    StrelkaSBSVCFFilesToZipFile(dir,
                                zipfile = file.path(tempdir(), "test.zip"),
                                ref.genome = "hg19",
                                trans.ranges = trans.ranges.GRCh37,
                                region = "genome",
                                base.filename = "Strelka-SBS")
  # Remove the temporary zip file
  unlink(file.path(tempdir(), "test.zip"))
}

ICAMS documentation built on June 22, 2024, 6:47 p.m.