prepareMutSig: Prepares MAF file for MutSig analysis.

View source: R/prepareMutSig.R

Description

Corrects gene names for MutSig compatibility.

Usage

prepareMutSig(maf, fn = NULL)

Arguments

maf

an MAF object generated by read.maf

fn

Basename for the output file. If provided, the MAF is written to an output file with the given basename.

Details

MutSig/MutSigCV is among the most widely used programs for detecting driver genes. However, we have observed that the covariates files bundled with MutSig (gene.covariates.txt and exome_full192.coverage.txt) use non-standard gene names (non-Hugo_Symbols). Because the MAF uses Hugo_Symbols and the covariates files do not, MutSig ignores any gene whose names do not match. For example, KMT2D, a well-known driver gene in esophageal carcinoma, is represented as MLL2 in the MutSig covariates. As a result, KMT2D is excluded from the analysis and reported as an insignificant gene in the MutSig results. This function attempts to correct such gene symbols using a manually curated list of gene names compatible with the MutSig covariates list.
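The renaming described above can be pictured as a lookup against a synonym table. The following is a minimal base R sketch of that idea, not the function's actual implementation: the `synonyms` and `maf_tbl` tables are hypothetical toy data, with only the KMT2D/MLL2 and ARHGAP35/GRLF1 pairs taken from this page, and the real curated list is the one used internally by prepareMutSig.

```r
# Hypothetical, hand-written synonym table (assumption for illustration only).
synonyms <- data.frame(
  Hugo_Symbol    = c("KMT2D", "ARHGAP35"),
  MutSig_Synonym = c("MLL2", "GRLF1"),
  stringsAsFactors = FALSE
)

# Toy MAF-like table with made-up sample barcodes.
maf_tbl <- data.frame(
  Hugo_Symbol          = c("KMT2D", "TP53", "ARHGAP35"),
  Tumor_Sample_Barcode = c("S1", "S2", "S3"),
  stringsAsFactors = FALSE
)

# Keep the original symbols, mirroring the OG_Hugo_Symbol column
# shown in the example output of prepareMutSig.
maf_tbl$OG_Hugo_Symbol <- maf_tbl$Hugo_Symbol

# Find rows whose symbol has a MutSig synonym and swap it in;
# symbols without a synonym are left unchanged.
idx <- match(maf_tbl$Hugo_Symbol, synonyms$Hugo_Symbol)
hit <- !is.na(idx)
maf_tbl$Hugo_Symbol[hit] <- synonyms$MutSig_Synonym[idx[hit]]

print(maf_tbl)
```

Keeping the original symbol in a separate column makes it straightforward to map MutSig results back to current Hugo symbols afterwards.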

Value

Returns a MAF with gene symbols corrected. The original symbols are preserved in the OG_Hugo_Symbol column.

Examples

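library(maftools)
# Locate the TCGA LAML MAF file bundled with maftools
laml.maf <- system.file("extdata", "tcga_laml.maf.gz", package = "maftools")
laml <- read.maf(maf = laml.maf)
# Convert Hugo symbols to the names used in the MutSig covariates
prepareMutSig(maf = laml)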

Example output

-Reading
-Validating
-Silent variants: 475 
-Summarizing
-Processing clinical data
--Missing clinical data
-Finished in 0.431s elapsed (0.410s cpu) 
Converting gene names for 1 variants from 1 genes
   Hugo_Symbol MutSig_Synonym N
1:    ARHGAP35          GRLF1 1
Original symbols are preserved under column OG_Hugo_Symbol.
      Hugo_Symbol Entrez_Gene_Id           Center NCBI_Build Chromosome
   1:      ABCA10          10349 genome.wustl.edu         37         17
   2:       ABCA4             24 genome.wustl.edu         37          1
   3:      ABCB11           8647 genome.wustl.edu         37          2
   4:       ABCC3           8714 genome.wustl.edu         37         17
   5:       ABCF1             23 genome.wustl.edu         37          6
  ---                                                                  
2203:      ZNF648         127665 genome.wustl.edu         37          1
2204:      ZNF721         170960 genome.wustl.edu         37          4
2205:     ZSCAN21           7589 genome.wustl.edu         37          7
2206:     ZSCAN5A          79149 genome.wustl.edu         37         19
2207:       GRLF1           2909 genome.wustl.edu         37         19
      Start_Position End_Position Strand Variant_Classification Variant_Type
   1:       67170917     67170917      +            Splice_Site          SNP
   2:       94490594     94490594      +      Missense_Mutation          SNP
   3:      169780250    169780250      +      Missense_Mutation          SNP
   4:       48760974     48760974      +      Missense_Mutation          SNP
   5:       30554429     30554429      +      Missense_Mutation          SNP
  ---                                                                       
2203:      182027086    182027086      +                 Silent          SNP
2204:         437878       437878      +                 Silent          SNP
2205:       99654638     99654638      +                 Silent          SNP
2206:       56734093     56734093      +                 Silent          SNP
2207:       47423101     47423101      +      Missense_Mutation          SNP
      Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 Tumor_Sample_Barcode
   1:                T                 T                 C         TCGA-AB-2988
   2:                C                 C                 T         TCGA-AB-2869
   3:                G                 G                 A         TCGA-AB-3009
   4:                C                 C                 T         TCGA-AB-2887
   5:                G                 G                 A         TCGA-AB-2920
  ---                                                                          
2203:                A                 A                 C         TCGA-AB-2915
2204:                T                 T                 C         TCGA-AB-2977
2205:                C                 C                 T         TCGA-AB-2858
2206:                T                 T                 C         TCGA-AB-2930
2207:                C                 C                 T         TCGA-AB-2950
      Protein_Change i_TumorVAF_WU i_transcript_name OG_Hugo_Symbol
   1:        p.K960R      45.66000       NM_080282.3         ABCA10
   2:       p.R1517H      38.12000       NM_000350.2          ABCA4
   3:       p.A1283V      46.97218       NM_003742.2         ABCB11
   4:       p.P1271S      56.41000       NM_003786.1          ABCC3
   5:        p.G658S      40.95000    NM_001025091.1          ABCF1
  ---                                                              
2203:         p.T20T      56.28000    NM_001009992.1         ZNF648
2204:        p.Q114Q      45.17000       NM_133474.2         ZNF721
2205:          p.T3T      21.36000       NM_145914.2        ZSCAN21
2206:        p.L202L      50.00000       NM_024303.1        ZSCAN5A
2207:        p.A390V      53.89000       NM_004491.4       ARHGAP35

maftools documentation built on Feb. 6, 2021, 2 a.m.