knitr::opts_chunk$set( collapse = TRUE, comment = "#>", echo = TRUE, message = FALSE, warning = FALSE, fig.width=8, fig.height=5 )
Please post any questions or queries related to the MAFDash package on GitHub Issues. This helps us build an information repository that other users can consult.
# CRAN dependencies
install.packages(c("dplyr","ensurer","ggplot2","tidyr","DT","rmarkdown","knitr","flexdashboard","htmltools","data.table","ggbeeswarm","RColorBrewer","plotly","circlize","canvasXpress","crosstalk","bsplus","BiocManager"))
# Bioconductor dependencies (maftools and ComplexHeatmap are distributed through Bioconductor)
BiocManager::install(c("TCGAbiolinks","maftools","ComplexHeatmap"))
# Install MAFDash from GitHub
install.packages("devtools")
library(devtools)
devtools::install_github("ashishjain1988/MAFDashRPackage")
Mutation Annotation Format (MAF) is a tabular data format used for storing genetic mutation data. For example, The Cancer Genome Atlas (TCGA) project has made MAF files from each project publicly available.
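To make the format concrete, here is a minimal sketch of a MAF-like table in base R. The column names follow commonly used MAF fields; the sample names, coordinates, and alleles are invented for illustration and do not come from any TCGA project.

```r
# Toy MAF-like table: one row per mutation call (all values are illustrative)
toy_maf <- data.frame(
  Hugo_Symbol            = c("TP53", "KRAS", "TP53"),
  Chromosome             = c("17", "12", "17"),
  Start_Position         = c(100, 200, 300),   # made-up coordinates
  End_Position           = c(100, 200, 300),
  Reference_Allele       = c("C", "C", "G"),
  Tumor_Seq_Allele2      = c("T", "A", "A"),
  Variant_Classification = "Missense_Mutation",
  Variant_Type           = "SNP",
  Tumor_Sample_Barcode   = c("sample_A", "sample_A", "sample_B"),
  stringsAsFactors = FALSE
)

# Number of mutation calls per sample
mutations_per_sample <- table(toy_maf$Tumor_Sample_Barcode)
print(mutations_per_sample)
```

Real MAF files carry many more columns, but the one-row-per-mutation layout is the part the rest of this document relies on.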
The MAFDash package contains a set of R tools to easily create an HTML dashboard that summarizes and visualizes data from MAF files.
The resulting HTML file serves as a self-contained report that can be used to explore the result. Currently, MAFDash produces mostly static plots powered by maftools, ComplexHeatmap and circlize, as well as interactive visualizations using canvasXpress and plotly. The report is generated with a parameterized R Markdown script that uses flexdashboard to arrange all the information.
The main function of MAFDash, getMAFDashboard, creates an HTML dashboard to summarize and visualize data from MAF files. The resulting HTML file serves as a self-contained report that can be used to explore and share the results. The example below shows how to create an HTML MAF dashboard file. The first argument of getMAFDashboard can be anything that's accepted by maftools's read.maf function (such as a path to a file).
library(MAFDash)
# Example MAF file shipped with the package
maf <- system.file("extdata", "test.mutect2.maf.gz", package = "MAFDash")
getMAFDashboard(maf,
                outputFileName = "output",
                outputFileTitle = "MAF Dashboard - Test",
                outputFilePath = tempdir())
MAFDash also provides a wrapper function, getMAFdataTCGA, around TCGAbiolinks that returns the mutation data of different cancers in MAF format from the TCGA website. See this page for a list of TCGA codes.
library("MAFDash")
# Download MAF data from TCGA
CancerCode <- "ACC"
# This folder will be created if it doesn't exist
inputFolderPath <- paste0(tempdir(), "/tcga_data")
maf <- getMAFdataTCGA(cancerCode = CancerCode, outputFolder = inputFolderPath)
The oncoplot shows the number and types of mutations in a set of genes across the samples. The function generateOncoPlot can be used to generate the oncoplot.
library(MAFDash)
library(maftools)
maf <- system.file("extdata", "test.mutect2.maf.gz", package = "MAFDash")
generateOncoPlot(read.maf(maf))
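As a rough illustration of the information an oncoplot encodes, the sketch below builds a gene-by-sample matrix from a toy table in base R. This is not how generateOncoPlot is implemented; the data values and object names are invented for the example.

```r
# Toy mutation calls (illustrative values only)
toy <- data.frame(
  Hugo_Symbol            = c("TP53", "KRAS", "TP53", "EGFR"),
  Tumor_Sample_Barcode   = c("s1", "s1", "s2", "s3"),
  Variant_Classification = c("Missense_Mutation", "Missense_Mutation",
                             "Nonsense_Mutation", "In_Frame_Del"),
  stringsAsFactors = FALSE
)

genes   <- sort(unique(toy$Hugo_Symbol))
samples <- sort(unique(toy$Tumor_Sample_Barcode))

# Rows are genes, columns are samples; an empty string means "not mutated"
onco_matrix <- matrix("", nrow = length(genes), ncol = length(samples),
                      dimnames = list(genes, samples))
onco_matrix[cbind(toy$Hugo_Symbol, toy$Tumor_Sample_Barcode)] <-
  toy$Variant_Classification

# Fraction of samples in which each gene is mutated, most frequent first
gene_freq <- sort(rowMeans(onco_matrix != ""), decreasing = TRUE)
print(onco_matrix)
print(gene_freq)
```

Each cell corresponds to one tile of the oncoplot, and the row-wise fractions correspond to the per-gene mutation frequencies.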
The burden plot compares the total number of mutations between the samples using a dotplot. A barplot variant shows the distribution of the different types of mutations across the samples.
library(MAFDash)
library(maftools)
maf <- system.file("extdata", "test.mutect2.maf.gz", package = "MAFDash")
generateBurdenPlot(read.maf(maf), plotType = "Dotplot")
generateBurdenPlot(read.maf(maf), plotType = "Barplot")
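The quantities behind the two plot types can be sketched with base R tabulation. The toy data below is invented, and the sketch only shows the counting step, not the plotting done by generateBurdenPlot.

```r
# Toy mutation calls (illustrative values only)
toy <- data.frame(
  Tumor_Sample_Barcode   = c("s1", "s1", "s1", "s2", "s3", "s3"),
  Variant_Classification = c("Missense_Mutation", "Missense_Mutation",
                             "Nonsense_Mutation", "Missense_Mutation",
                             "Frame_Shift_Del", "Missense_Mutation"),
  stringsAsFactors = FALSE
)

# Total mutations per sample: the quantity a burden dotplot compares
burden <- table(toy$Tumor_Sample_Barcode)

# Mutations per sample split by type: the quantity a stacked barplot shows
burden_by_type <- table(toy$Tumor_Sample_Barcode, toy$Variant_Classification)

print(burden)
print(burden_by_type)
```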
The generateMutationTypePlot function generates a plot of silent and non-silent mutations from the MAF data.
library(MAFDash)
library(maftools)
maf <- system.file("extdata", "test.mutect2.maf.gz", package = "MAFDash")
generateMutationTypePlot(read.maf(maf))
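One way to think about the silent versus non-silent split is as a lookup on the Variant_Classification column. The sketch below assumes the set of classifications that maftools treats as non-synonymous by default; whether generateMutationTypePlot uses exactly this set is an assumption, not something confirmed here.

```r
# Assumed non-silent classes (maftools' default non-synonymous set)
non_silent_classes <- c("Frame_Shift_Del", "Frame_Shift_Ins", "Splice_Site",
                        "Translation_Start_Site", "Nonsense_Mutation",
                        "Nonstop_Mutation", "In_Frame_Del", "In_Frame_Ins",
                        "Missense_Mutation")

# Toy classifications (illustrative values only)
classes <- c("Missense_Mutation", "Silent", "Intron",
             "Nonsense_Mutation", "3'UTR")

# Anything outside the non-silent set is grouped as silent
mutation_group <- ifelse(classes %in% non_silent_classes,
                         "Non-silent", "Silent")
counts <- table(mutation_group)
print(counts)
```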
The getMAFDashboard() function accepts a named list for adding arbitrary objects to the dashboard. Each item in the list is displayed in a separate tab, and the name of the element is used as the title of the tab.
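Elements of the list can be, as the example below illustrates, plot objects (ggplot2, plotly, ComplexHeatmap) or absolute paths to image files (PNG, PDF).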
This functionality can be used with or without providing a MAF file. When MAF data is not provided, the "Variant Table" tab of the dashboard is automatically omitted.
library(ggplot2)
library(plotly)
library(ComplexHeatmap)
data(iris)

## Simple ggplot
myplot <- ggplot(iris) + geom_point(aes(x=Sepal.Length, y=Sepal.Width, color=Species))

## Save as PNG (provide absolute file path)
mycustomimage_png <- file.path(getwd(),"custom_ggplot.png")
ggsave(mycustomimage_png, plot=myplot, width=5, height=4)

## Save as PDF (provide absolute file path)
mycustomimage_pdf <- file.path(getwd(),"custom_ggplot.pdf")
ggsave(mycustomimage_pdf, plot=myplot, width=5, height=4)

## Convert ggplot to plotly
myplotly <- ggplotly(myplot)

## Make heatmap with ComplexHeatmap
hmdata <- t(iris[,1:4])
hmanno <- HeatmapAnnotation(df=data.frame(Species=iris[,5]))
myhm <- Heatmap(hmdata, bottom_annotation = hmanno)

## Customizable plotly from https://github.com/mtandon09/Dynamic_Plotly
source("https://raw.githubusercontent.com/mtandon09/Dynamic_Plotly/master/make_cutomizable_plotly.R")
custom_plotly <- make_customizable_plotly(iris)

## Put together objects/filepaths into a list
toyplotlist <- list("ggplot"= myplot,
                    "plotly"= myplotly,
                    "PNG"= mycustomimage_png,
                    "PDF"= mycustomimage_pdf,
                    "ComplexHeatmap"= myhm,
                    "Customizable"= custom_plotly
)

## Filename to output to
html_filename <- "toy_dash.html"

## Render dashboard
getMAFDashboard(plotList = toyplotlist,
                outputFileName = html_filename,
                outputFileTitle = "Iris")
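Because the tab titles come from the list names, it can help to inspect a list before rendering. The helper below, describe_plot_list, is hypothetical and not part of MAFDash; it is a small base R sketch that checks every element is named and reports what kind of object each tab would hold.

```r
# Hypothetical helper (not part of MAFDash): summarize a named plot list
describe_plot_list <- function(plot_list) {
  # Every element needs a non-empty name, since names become tab titles
  stopifnot(is.list(plot_list),
            !is.null(names(plot_list)),
            all(nzchar(names(plot_list))))
  kinds <- vapply(plot_list, function(x) {
    if (is.character(x) && length(x) == 1) "file path" else class(x)[1]
  }, character(1))
  data.frame(tab_title = names(plot_list),
             content   = unname(kinds),
             stringsAsFactors = FALSE)
}

# Illustrative list: one file path and one logical value
example_list <- list("Scatter"    = file.path(tempdir(), "scatter.png"),
                     "Use preset" = TRUE)
tabs <- describe_plot_list(example_list)
print(tabs)
```

A check like this fails early on unnamed elements, which would otherwise leave a tab without a title.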
Output
The output can be seen here.
MAFDash provides a wrapper function that tries to simplify retrieving data using TCGAbiolinks. Valid project codes can be viewed by running TCGAbiolinks::getGDCprojects() and checking the "tumor" column.
library(MAFDash)
library(TCGAbiolinks)
tcga_code <- "UVM"  ## Uveal Melanoma
caller <- "mutect2"
title_label <- paste0("TCGA-", tcga_code)
maf_file <- getMAFdataTCGA(tcga_code, variant_caller = caller)
tcga_clinical <- TCGAbiolinks::GDCquery_clinic(project = paste0("TCGA-", tcga_code), type = "clinical")
tcga_clinical$Tumor_Sample_Barcode <- tcga_clinical$submitter_id
The filterMAF function can be used to filter the MAF data in various ways. Importantly, by default, it removes commonly occurring mutations that are often considered to be false positives (FLAG genes).
filtered_mafdata <- filterMAF(maf_file)
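The idea behind the FLAG gene filter can be sketched in base R as a simple exclusion on the gene symbol. The two genes below are only an illustrative subset of frequently mutated genes; the full list and the other filters applied by filterMAF are not reproduced here.

```r
# Illustrative subset of frequently flagged genes (not the full list)
flagged_genes <- c("TTN", "MUC16")

# Toy mutation calls (illustrative values only)
toy <- data.frame(
  Hugo_Symbol          = c("TP53", "TTN", "KRAS", "MUC16"),
  Tumor_Sample_Barcode = c("s1", "s1", "s2", "s2"),
  stringsAsFactors = FALSE
)

# Keep only rows whose gene is not in the flagged set
filtered <- toy[!toy$Hugo_Symbol %in% flagged_genes, , drop = FALSE]
print(filtered)
```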
The easiest way to add clinical annotations to the oncoplot is to add clinical data to the clinical.data slot of a MAF object before passing it on for plotting.
MAFDash also provides a function that defines reasonable colors for some common clinical annotations provided with TCGA datasets.
filtered_maf <- read.maf(filtered_mafdata, clinicalData = tcga_clinical)
annotation_colors <- getTCGAClinicalColors(ageRange = range(tcga_clinical$age_at_diagnosis, na.rm = TRUE))
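The add_clinical_annotations argument controls which annotations are drawn from the clinical.data slot of the MAF object; in the example below it is given as a vector of annotation names. Columns with all missing values are ignored. At most 10 annotations are plotted (the first 10 non-empty columns).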
custom_onco <- generateOncoPlot(filtered_maf, add_clinical_annotations = names(annotation_colors), clin_data_colors = annotation_colors)
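The column selection rule described above (skip all-missing columns, keep at most the first 10) can be sketched in base R. The clinical table is invented, and excluding the barcode column is an assumption made for this example; MAFDash's internal logic may differ.

```r
# Toy clinical table (illustrative values only)
toy_clinical <- data.frame(
  Tumor_Sample_Barcode = c("s1", "s2", "s3"),
  gender               = c("male", "female", NA),
  vital_status         = NA,             # entirely missing, should be skipped
  age                  = c(61, NA, 47),
  stringsAsFactors = FALSE
)

# Assumption: the sample identifier itself is not an annotation
candidate_cols <- setdiff(names(toy_clinical), "Tumor_Sample_Barcode")

# Drop columns where every value is missing
has_data  <- vapply(toy_clinical[candidate_cols],
                    function(x) !all(is.na(x)), logical(1))
non_empty <- candidate_cols[has_data]

# Keep at most the first 10 non-empty columns
selected <- head(non_empty, 10)
print(selected)
```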
A lot of maftools's plots are base graphics, so they're drawn to a device and not returned. But we can simply save them to a file and provide the file path.
library(maftools)
tcgacompare_file <- file.path(getwd(), "tcga_compare.png")
png(tcgacompare_file, width = 8, height = 6, units = "in", res = 400)
tcgaCompare(filtered_maf, tcga_capture_size = NULL)
dev.off()
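The same open-device, draw, close-device pattern works for any base graphics plot. The sketch below uses pdf() and a temporary file with a built-in dataset so that it runs without maftools; the file name is arbitrary.

```r
# Write a base graphics plot to a file and keep the path for later use
plot_file <- file.path(tempdir(), "base_plot.pdf")

pdf(plot_file, width = 8, height = 6)   # open the file device
plot(cars, main = "Base graphics example")
invisible(dev.off())                    # close the device to flush the file

# The path, not a plot object, is what gets passed along
print(file.exists(plot_file))
```

Forgetting dev.off() leaves the file incomplete, so it is worth keeping the three steps together.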
The generateRibbonPlot function is built on top of maftools's somaticInteractions() function. It's just a different way of visualizing co-occurrence or mutual exclusivity between genes.
ribbonplot_file <- file.path(getwd(), "ribbon.pdf")
generateRibbonPlot(filtered_maf, save_name = ribbonplot_file)
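To illustrate what co-occurrence and mutual exclusivity mean for a single gene pair, the sketch below runs a Fisher's exact test on a toy 2x2 table in base R. As far as we understand, somaticInteractions() relies on pairwise tests of this kind, but the sketch is a simplified stand-in with invented data, not a reproduction of its output.

```r
# Ten toy samples and the mutation status of two hypothetical genes
samples <- paste0("s", 1:10)
gene_a  <- samples %in% c("s1", "s2", "s3", "s4", "s5")
gene_b  <- samples %in% c("s1", "s2", "s3", "s4", "s6")

# 2x2 table: mutated / not mutated in gene A versus gene B
contingency <- table(gene_a = gene_a, gene_b = gene_b)
result <- fisher.test(contingency)

# Odds ratio above 1 suggests co-occurrence, below 1 mutual exclusivity
tendency <- if (result$estimate > 1) "co-occurrence" else "mutual exclusivity"
print(contingency)
print(tendency)
```

With many gene pairs, the resulting p-values would also need multiple-testing correction before interpretation.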
customplotlist <- list("summary_plot" = TRUE,
                       "burden" = TRUE,
                       "TCGA Comparison" = tcgacompare_file,
                       "oncoplot" = TRUE,
                       "Annotated Oncoplot" = custom_onco
)

## Filename to output to; if output directory doesn't exist, it will be created
html_filename <- file.path("examples/TCGA-UVM.custom.mafdash.html")

## Render dashboard
getMAFDashboard(MAFfilePath = filtered_maf,
                plotList = customplotlist,
                outputFileName = html_filename,
                outputFileTitle = "Customized Dashboard")
The output can be seen here.