get_mutation_dictionary: Group and Filter Mutation Types


View source: R/maf_to_tables.R

Description

A function to create a mutation dictionary that groups and filters mutation types. It is often not practical to model each distinct mutation type separately, so one may group multiple classes together (e.g. all indel mutations, all nonsynonymous mutations). Additionally, some mutation types may be excluded from modelling (for example, one may wish not to use synonymous mutations in the model fitting process).

Usage

get_mutation_dictionary(
  for_biomarker = "TIB",
  include_synonymous = TRUE,
  maf = NULL,
  dictionary = NULL
)

Arguments

for_biomarker

(string) Selects a standard grouping of mutation types, corresponding to the coarsest grouping of nonsynonymous mutations required to evaluate the biomarker TMB or TIB. If "TMB", all nonsynonymous mutations are grouped together; if "TIB", indel mutations form one group and all other nonsynonymous mutations form another.

include_synonymous

(logical) Determine whether synonymous mutations should be included in the dictionary.

maf

(dataframe) An annotated mutation table containing the column 'Variant_Classification'. It is used only to check whether the specified dictionary covers all the variant types in your dataset.

dictionary

(character) Directly specify the dictionary, in the form of a vector of grouping values. The names of the vector should correspond to the set of variant classifications of interest in the mutation annotated file (MAF).
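
As an illustration of the dictionary and maf arguments, the following base R sketch builds a small custom dictionary and lists the variant types in a table that the dictionary does not cover. The names custom_dictionary, toy_maf and uncovered_types, the sample barcodes, and the toy rows are all assumptions made for this example; the check is written by hand with setdiff and is not necessarily how the package performs it.

```r
# Hypothetical custom dictionary: names are MAF variant classifications,
# values are grouping labels chosen for this illustration.
custom_dictionary <- c(
  Missense_Mutation = "NS",
  Nonsense_Mutation = "NS",
  Frame_Shift_Del   = "I",
  Frame_Shift_Ins   = "I"
)

# Toy annotated mutation table (made-up rows) with the required column.
toy_maf <- data.frame(
  Tumor_Sample_Barcode   = c("S1", "S1", "S2", "S2"),
  Variant_Classification = c("Missense_Mutation", "Frame_Shift_Del",
                             "Silent", "Intron"),
  stringsAsFactors = FALSE
)

# Variant types present in the data but missing from the dictionary names.
uncovered_types <- setdiff(unique(toy_maf$Variant_Classification),
                           names(custom_dictionary))
print(uncovered_types)
```

A vector such as custom_dictionary is the kind of value one would pass as the dictionary argument, with toy_maf passed as maf.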

Value

A named character vector whose values are the grouping labels for mutation types and whose names are the mutation types as they appear in a mutation annotated file (MAF). See examples.
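
To make the returned format concrete, here is a minimal base R sketch of how a named vector of this shape can group and filter variant types by name lookup. The dictionary below is written out by hand rather than returned by the function, and variant_types, groups, grouped and group_counts are hypothetical names used only for this example.

```r
# Hand-written dictionary in the documented format (names = variant types,
# values = grouping labels); an assumption for illustration only.
dictionary <- c(
  Missense_Mutation = "NS",
  Nonsense_Mutation = "NS",
  Frame_Shift_Del   = "I",
  Frame_Shift_Ins   = "I"
)

# Made-up variant classifications, one per mutation.
variant_types <- c("Missense_Mutation", "Silent",
                   "Frame_Shift_Ins", "Nonsense_Mutation")

# Look up each type by name; types absent from the dictionary give NA.
groups <- unname(dictionary[variant_types])

# Filter out mutations whose type is not in the dictionary.
kept <- !is.na(groups)
grouped <- data.frame(
  variant_type = variant_types[kept],
  group        = groups[kept],
  stringsAsFactors = FALSE
)

# Count mutations per grouping label.
group_counts <- table(grouped$group)
print(grouped)
print(group_counts)
```

Here "Silent" is dropped because it has no entry in the dictionary, which mirrors the filtering role described above.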

Examples

# To understand the dictionary format, note that the following code
dictionary <- get_mutation_dictionary(for_biomarker = "TMB")
# is equivalent to
dictionary <- c(rep("NS", 9), rep("S", 8))
names(dictionary) <- c('Missense_Mutation', 'Nonsense_Mutation',
                       'Splice_Site', 'Translation_Start_Site',
                       'Nonstop_Mutation', 'In_Frame_Ins',
                       'In_Frame_Del', 'Frame_Shift_Del',
                       'Frame_Shift_Ins', 'Silent',
                       'Splice_Region', '3\'Flank', '5\'Flank',
                       'Intron', 'RNA', '3\'UTR', '5\'UTR')
# where the grouping levels are chosen to be "NS" and "S" for
# nonsynonymous and synonymous mutations respectively.
# Similarly, the code
dictionary <- get_mutation_dictionary(for_biomarker = "TIB", include_synonymous = FALSE)
# is equivalent to
dictionary <- c(rep("NS", 7), rep("I", 2))
names(dictionary) <- c('Missense_Mutation', 'Nonsense_Mutation',
                       'Splice_Site', 'Translation_Start_Site',
                       'Nonstop_Mutation', 'In_Frame_Ins',
                       'In_Frame_Del', 'Frame_Shift_Del',
                       'Frame_Shift_Ins')
# where now "I" is used as a label for indel mutations (here the two
# frameshift types), and synonymous mutations are filtered out.

ICBioMark documentation built on Nov. 15, 2021, 5:09 p.m.