View source: R/shiny_related_functions.R
StrelkaIDVCFFilesToCatalog | R Documentation |
Create an ID (small insertion and deletion) catalog from the Strelka ID VCFs specified by files
StrelkaIDVCFFilesToCatalog(
  files,
  ref.genome,
  region = "unknown",
  names.of.VCFs = NULL,
  flag.mismatches = 0,
  return.annotated.vcfs = FALSE,
  suppress.discarded.variants.warnings = TRUE
)
files |
Character vector of file paths to the Strelka ID VCF files. |
ref.genome |
A |
region |
A character string designating a genomic region;
see |
names.of.VCFs |
Optional. Character vector of names of the VCF files.
The order of names in |
flag.mismatches |
Deprecated. If there are ID variants whose |
return.annotated.vcfs |
Logical. Whether to return the annotated VCFs with additional columns showing mutation class for each variant. Default is FALSE. |
suppress.discarded.variants.warnings |
Logical. Whether to suppress warning messages showing information about the discarded variants. Default is TRUE. |
This function calls VCFsToIDCatalogs.
A list of elements:
catalog: The ID (small insertion and deletion) catalog with attributes added. See as.catalog for more details.
discarded.variants: Non-NULL only if there are variants that were excluded from the analysis. See the added extra column discarded.reason for more details.
annotated.vcfs: Non-NULL only if return.annotated.vcfs = TRUE. A list of data frames which contain the original VCF's ID mutation rows with three additional columns seq.context.width, seq.context and ID.class added. The category assignment of each ID mutation in the VCF can be obtained from the ID.class column.
See https://github.com/steverozen/ICAMS/blob/master/data-raw/PCAWG7_indel_classification_2021_09_03.xlsx for additional information on ID (small insertion and deletion) mutation classification.
See the documentation for Canonicalize1Del, which first handles deletions in homopolymers, then handles deletions in simple repeats with longer repeat units (e.g. CACACACA, see FindMaxRepeatDel), and, if the deletion is not in a simple repeat, looks for microhomology (see FindDelMH).
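The repeat-counting idea behind the homopolymer and simple-repeat cases can be illustrated with a minimal sketch. This is not the ICAMS implementation: CountRepeatUnits is a hypothetical helper written only for illustration, and it assumes the deleted sequence and its position within a short sequence context are already known. A homopolymer is simply the case where the repeat unit is one base long.

```r
# Hypothetical helper, NOT part of ICAMS: count how many consecutive copies
# of `unit` occur in `context`, including the copy that starts at 1-based
# position `pos`.
CountRepeatUnits <- function(context, unit, pos) {
  n <- nchar(unit)
  # The unit must actually be present at the stated position.
  stopifnot(substr(context, pos, pos + n - 1) == unit)
  count <- 1

  # Extend to the right, one whole unit at a time.
  start <- pos + n
  while (start + n - 1 <= nchar(context) &&
         substr(context, start, start + n - 1) == unit) {
    count <- count + 1
    start <- start + n
  }

  # Extend to the left, one whole unit at a time.
  start <- pos - n
  while (start >= 1 &&
         substr(context, start, start + n - 1) == unit) {
    count <- count + 1
    start <- start - n
  }

  count
}

# Simple repeat: the unit "CA" occurs 4 times in a row (CACACACA).
CountRepeatUnits("GGCACACACATT", "CA", 5)

# Homopolymer: the unit "T" occurs 4 times in a row.
CountRepeatUnits("ATTTTG", "T", 3)
```

Both calls return 4. A deletion whose unit is not repeated would give a count of 1, which is the case where, per the description above, microhomology is considered instead.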
See the code for unexported function CanonicalizeID
and the functions it calls for handling of insertions.
In ID (small insertion and deletion) catalogs, deletion repeat sizes range from 0 to 5+, but for plotting and end-user documentation deletion repeat sizes range from 1 to 6+.
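The offset between the two conventions can be made concrete with a small sketch. DisplayDelRepeatSize is a hypothetical helper, not an ICAMS function, and it assumes the catalog-side repeat size is given as an integer from 0 to 5, with 5 standing for "5 or more".

```r
# Hypothetical helper, NOT part of ICAMS: convert deletion repeat sizes as
# stored in the catalog (0 to 5, where 5 means "5 or more") to the labels
# used for plotting and end-user documentation (1 to 6+).
DisplayDelRepeatSize <- function(internal) {
  stopifnot(all(internal %in% 0:5))
  shown <- internal + 1
  # The top category is open-ended, so it keeps its "+" suffix.
  ifelse(shown == 6, "6+", as.character(shown))
}

DisplayDelRepeatSize(0:5)
# "1" "2" "3" "4" "5" "6+"
```

The conversion applies only to deletion repeat sizes, which is the only case the statement above covers.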
file <- c(system.file("extdata/Strelka-ID-vcf",
                      "Strelka.ID.GRCh37.s1.vcf",
                      package = "ICAMS"))
if (requireNamespace("BSgenome.Hsapiens.1000genomes.hs37d5", quietly = TRUE)) {
  catID <- StrelkaIDVCFFilesToCatalog(file, ref.genome = "hg19",
                                      region = "genome")
}