View source: R/extractSigsIndel.R
extractSigsIndel | R Documentation |
Extract indel signatures
extractSigsIndel(..., method = "CHORD")
extractSigsIndelPcawg(
vcf.file = NULL,
df = NULL,
output = "contexts",
sample.name = NULL,
ref.genome = DEFAULT_GENOME,
signature.profiles = INDEL_SIGNATURE_PROFILES,
verbose = F,
...
)
extractSigsIndelChord(
vcf.file = NULL,
df = NULL,
sample.name = NULL,
ref.genome = DEFAULT_GENOME,
output = "contexts",
indel.len.cap = 5,
n.bases.mh.cap = 5,
get.other.indel.allele = F,
keep.indel.types = c("del", "ins"),
verbose = F,
...
)
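As a minimal sketch of the df input route, the following builds a data frame with the four columns expected by the df argument and checks that they are present. The coordinates and alleles are invented for illustration. The commented call at the end is hypothetical and not run: it assumes that extractSigsIndel() and the default reference genome are available in the session.

```r
# Toy indel table with the four columns the df argument expects.
# Coordinates and alleles are invented for illustration only.
indels <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  pos   = c(10500L, 20750L, 33100L),
  ref   = c("AT", "G", "CAAA"),
  alt   = c("A", "GTT", "C"),
  stringsAsFactors = FALSE
)

# Check that the required columns are present before passing the table on.
required_cols <- c("chrom", "pos", "ref", "alt")
missing_cols <- setdiff(required_cols, colnames(indels))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# Hypothetical call (not run): assumes extractSigsIndel() and the reference
# genome are available in the session.
# contexts <- extractSigsIndel(df = indels, method = "CHORD",
#                              output = "contexts", sample.name = "sample1")
```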
... |
Other arguments that can be passed to variantsFromVcf() |
method |
Can be 'CHORD' or 'PCAWG'. Indicates the indel context type to extract. |
vcf.file |
Path to the vcf file |
df |
A dataframe containing the columns: chrom, pos, ref, alt. Alternative input option to vcf.file |
output |
Output the absolute signature contributions ('signatures'), indel contexts ('contexts', the default shown in the usage above), or an annotated bed-like dataframe ('df') |
sample.name |
If a character string is provided, it is used as the column header of the output matrix. If none is provided, the basename of the vcf file is used. |
ref.genome |
A BSgenome reference genome. Default is BSgenome.Hsapiens.UCSC.hg19. If another reference genome is indicated, it will also need to be installed. |
verbose |
Print progress messages? |
indel.len.cap |
Specifies the max indel sequence length to consider when counting 'repeat' and 'none' contexts. Counts of longer indels will simply be binned to the counts of contexts at the max indel sequence length. |
n.bases.mh.cap |
Specifies the max number of bases in microhomology to consider when counting microhomology contexts. Counts of indels with longer microhomology will simply be binned to the counts of contexts at the max number of microhomology bases. |
get.other.indel.allele |
Only applies when mode=='indel'. For indels, some vcfs only report the sequence of one allele (REF for deletions and ALT for insertions). If TRUE, the unreported allele will be retrieved from the genome: the base immediately 5' of the indel sequence. This base will also be prepended to the indel sequence and the POS will be adjusted accordingly (POS=POS-1). |
keep.indel.types |
A character vector of indel types to keep. Defaults to 'del' and 'ins' to filter out MNVs (variants where both REF and ALT have length >= 2). MNV names are: 'mnv_neutral' (REF length == ALT length), 'mnv_del' (REF length > ALT length), or 'mnv_ins' (REF length < ALT length). |
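The get.other.indel.allele and keep.indel.types arguments can be illustrated with a small base R sketch. It is not the package's implementation: toy_genome stands in for a BSgenome object, add_anchor_base() and classify_variant() are hypothetical helpers, the unreported allele is assumed to be an empty string, and 'del' and 'ins' are assumed to mean that exactly one allele has length 1.

```r
# Toy reference sequence standing in for a BSgenome object (assumption).
toy_genome <- "GGCATTACG"

# Add the 5' anchor base to an indel reported with only one allele.
# `pos` is the 1-based position of the first base of the indel sequence.
add_anchor_base <- function(pos, ref, alt, genome) {
  anchor <- substr(genome, pos - 1, pos - 1)
  list(pos = pos - 1, ref = paste0(anchor, ref), alt = paste0(anchor, alt))
}

# Deletion of "AT" reported as REF = "AT", ALT = "" at position 4.
fixed <- add_anchor_base(pos = 4, ref = "AT", alt = "", genome = toy_genome)

# Label variants by REF/ALT length, using the type names from the
# keep.indel.types description. Anything else is labelled "other".
classify_variant <- function(ref, alt) {
  ref_len <- nchar(ref)
  alt_len <- nchar(alt)
  type <- rep("other", length(ref))
  type[ref_len > 1 & alt_len == 1] <- "del"
  type[ref_len == 1 & alt_len > 1] <- "ins"
  is_mnv <- ref_len >= 2 & alt_len >= 2
  type[is_mnv & ref_len == alt_len] <- "mnv_neutral"
  type[is_mnv & ref_len > alt_len] <- "mnv_del"
  type[is_mnv & ref_len < alt_len] <- "mnv_ins"
  type
}

variants <- data.frame(
  ref = c(fixed$ref, "A", "AT", "ATG", "AT"),
  alt = c(fixed$alt, "AGG", "GC", "AT", "ATG"),
  stringsAsFactors = FALSE
)
variants$type <- classify_variant(variants$ref, variants$alt)

# Keep only the default indel types, dropping the three MNV classes.
keep.indel.types <- c("del", "ins")
kept <- variants[variants$type %in% keep.indel.types, ]
```

In this toy case the one-allele deletion becomes POS 3, REF "CAT", ALT "C", and only the first two rows of variants survive the filter.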
description |
Will return a 1-column matrix containing the absolute indel signature contributions (i.e. the number of mutations contributing to each mutational signature). Two sets of indel contexts can be used: CHORD and PCAWG. For CHORD indel contexts, signatures used are insertions/deletions within repeat regions (ins.rep, del.rep), insertions/deletions with flanking microhomology (ins.mh, del.mh), and insertions/deletions which don't fall under the previous 2 categories (ins.none, del.none). Each category is further stratified by the length of the indel. PCAWG indel contexts are described at: https://cancer.sanger.ac.uk/cosmic/signatures/ID/index.tt |
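The length stratification and the two caps (indel.len.cap, n.bases.mh.cap) can be sketched as follows. This is an illustration under stated assumptions, not the package's code: count_shared_prefix() is a hypothetical helper that counts microhomology as the number of leading indel bases matching the 3' flank, which is one common definition and may differ from what the package does (for example, it ignores the 5' flank).

```r
# Count leading bases shared by the indel sequence and its 3' flank.
count_shared_prefix <- function(indel_seq, flank_seq) {
  a <- strsplit(indel_seq, "")[[1]]
  b <- strsplit(flank_seq, "")[[1]]
  n <- min(length(a), length(b))
  if (n == 0) return(0L)
  matches <- a[seq_len(n)] == b[seq_len(n)]
  # Length of the initial run of matching bases.
  first_mismatch <- which(!matches)
  if (length(first_mismatch) == 0) n else first_mismatch[1] - 1L
}

# Indel lengths above the cap are binned into the cap itself.
indel.len.cap <- 5
indel_len <- c(1, 3, 5, 8, 12)
capped_len <- pmin(indel_len, indel.len.cap)
len_counts <- table(capped_len)

# The same binning applied to the number of bases in microhomology.
n.bases.mh.cap <- 5
n_mh <- c(
  count_shared_prefix("ACGTT", "ACGAA"),
  count_shared_prefix("TTTTTTTT", "TTTTTTAC")
)
capped_mh <- pmin(n_mh, n.bases.mh.cap)
```

With a cap of 5, the indels of length 5, 8 and 12 all land in the length-5 bin, and a microhomology of 6 bases is counted as 5.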
A 1-column matrix containing the context counts or signature contributions
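To make the shape of the return value concrete, here is a sketch of a 1-column count matrix. The counts are invented, and the row names are only the six CHORD category names from the description; the real output is further stratified by indel length, so its row names will differ.

```r
# Invented counts for the six CHORD indel categories (illustration only).
context_counts <- c(
  del.rep = 12, del.mh = 4, del.none = 7,
  ins.rep = 9,  ins.mh = 1, ins.none = 3
)

# With no sample.name given, the basename of the vcf path names the column.
vcf.file <- "/path/to/sample1.vcf.gz"
sample.name <- basename(vcf.file)

# One row per context, one column for the sample.
count_matrix <- matrix(
  context_counts,
  ncol = 1,
  dimnames = list(names(context_counts), sample.name)
)

# Total number of indels counted for this sample.
total_indels <- sum(count_matrix)
```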