2021-04
metaFunc is an R package for comprehensive visualization of the functional annotations of microbiomes in combination with their taxonomic information.
metaFunc is mainly used to display and interpret the functional annotation of metagenomic data. It first summarizes the taxonomic profile of all functional genes in a microbiome to obtain the community structure. Then, for each function, the corresponding genes are grouped according to their taxonomic classification, and the number of genes in each sample is calculated. Finally, the community structure and the functions are combined and shown in a complex combination block chart, which gives researchers a full view of both. metaFunc provides two usage modes: a graphical interface and direct function calls. The graphical interface lets users manipulate the data and customize the charts.
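The grouping and counting steps described above can be illustrated with a minimal base R sketch. This is not metaFunc's internal code: the toy tables, the column names (`gene`, `func`, `phylum`, `S1`, `S2`) and the use of `merge()` and `aggregate()` are assumptions made only for illustration.

```r
# Toy inputs (hypothetical values, not taken from the package).
func <- data.frame(gene = c("g1", "g2", "g3", "g4"),
                   func = c("funcA", "funcA", "funcB", "funcA;funcB"),
                   stringsAsFactors = FALSE)
tax <- data.frame(gene = c("g1", "g2", "g3", "g4"),
                  phylum = c("Firmicutes", "Firmicutes", "Bacteroidetes", "Unknown"),
                  stringsAsFactors = FALSE)
gene <- data.frame(gene = c("g1", "g2", "g3", "g4"),
                   S1 = c(1, 0, 2, 1),
                   S2 = c(0, 0, 3, 1),
                   stringsAsFactors = FALSE)

# 1. Expand multi-function genes into one row per gene-function pair.
parts <- strsplit(func$func, ";", fixed = TRUE)
long <- data.frame(gene = rep(func$gene, lengths(parts)),
                   func = unlist(parts),
                   stringsAsFactors = FALSE)

# 2. Attach the taxonomic classification and the abundances to each pair.
merged <- merge(merge(long, tax, by = "gene"), gene, by = "gene")

# 3. For each function and taxon, count the genes present in each sample.
samples <- c("S1", "S2")
present <- merged[, samples] > 0  # logical matrix: TRUE where abundance > 0
counts <- aggregate(present, by = merged[c("func", "phylum")], FUN = sum)
print(counts)
```

Each row of `counts` corresponds to one function-taxon combination, which is roughly the information a taxon-function block would need to display.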
metaFunc is an R package that can be freely installed as follows:
Check or install required packages.
packages <- c("DT", "ggplot2", "ggrepel", "networkD3", "shiny")
lapply(packages, function(x) {
  if (!require(x, character.only = TRUE)) {
    install.packages(x, dependencies = TRUE)
  }
})
Install metaFunc from GitHub.
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
library(devtools)
install_github("xiaonui/metaFunc", build_vignettes = TRUE)
Load the package.
library(metaFunc)
Run metaFunc using the graphical interface.
blockShiny()
You can also call the visualization function blockPlot() directly.
data(simple_demo)
blockPlot(func_data = simple_demo$func,
          tax_data = simple_demo$tax,
          gene_data = simple_demo$gene)
Before starting, you need to prepare three files: functional annotation, taxonomic classification, and gene profile.
The file for functional annotation should contain exactly two columns. The first column is the gene name, and the second column is the functional annotation. Gene names must not be duplicated. If a gene corresponds to multiple functions, join them with a separator. The separator can be a semicolon, comma, or slash; the default is a semicolon. The structure of the data is shown below.
library(metaFunc)
data(simple_demo)
functional_annotation <- simple_demo$func[21:30, ]
rownames(functional_annotation) <- NULL
knitr::kable(rbind(functional_annotation, "..." = "..."))
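Before passing a functional annotation table to metaFunc, it can help to check it against the rules above. The following is a minimal base R sketch under stated assumptions: the toy table, its column names, and the function labels are hypothetical, and the checks are not part of the metaFunc API.

```r
# Hypothetical annotation table: gene name, then annotation(s).
func <- data.frame(gene = c("gene_1", "gene_2", "gene_3"),
                   annotation = c("funcA", "funcB;funcC", "funcA"),
                   stringsAsFactors = FALSE)

# Exactly two columns and no duplicated gene names.
stopifnot(ncol(func) == 2, anyDuplicated(func[[1]]) == 0)

# Split multi-function entries on the separator (semicolon assumed here).
sep <- ";"
parts <- strsplit(func[[2]], sep, fixed = TRUE)
func_long <- data.frame(gene = rep(func[[1]], lengths(parts)),
                        func = trimws(unlist(parts)),
                        stringsAsFactors = FALSE)

# Number of genes annotated with each function.
genes_per_function <- table(func_long$func)
print(genes_per_function)
```

Changing `sep` to `","` or `"/"` covers the other two separators mentioned above.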
The file for taxonomic classification should contain at least three columns. The first column holds the gene names, and the remaining columns hold the taxonomic classification. Gene names must not be duplicated. An unknown taxonomic classification is labeled "Unknown". If a gene is labeled "Unknown" at a certain taxonomic rank, all lower taxonomic ranks should be "Unknown" too. The structure of the data is shown below.
taxonomic_classification <- head(simple_demo$tax[, c(1, 2, 3, 7)])
rownames(taxonomic_classification) <- NULL
knitr::kable(rbind(cbind(taxonomic_classification, "..." = "..."), "..." = "..."))
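The rule that an "Unknown" rank implies "Unknown" at every lower rank can be checked programmatically. The sketch below is one possible approach in base R; the helper `check_unknown()` and the toy table are hypothetical, and the rank columns are assumed to be ordered from the highest rank to the lowest.

```r
# Hypothetical taxonomy table; gene_4 deliberately violates the rule.
tax <- data.frame(
  gene    = c("gene_1", "gene_2", "gene_3", "gene_4"),
  phylum  = c("Firmicutes", "Unknown", "Bacteroidetes", "Unknown"),
  genus   = c("Unknown", "Unknown", "Bacteroides", "Bacteroides"),
  species = c("Unknown", "Unknown", "Unknown", "Unknown"),
  stringsAsFactors = FALSE
)

# Return the names of genes with a known rank below an "Unknown" rank.
check_unknown <- function(tax) {
  is_unknown <- as.matrix(tax[, -1, drop = FALSE]) == "Unknown"
  # Running maximum along each row: TRUE from the first "Unknown" onward.
  seen_unknown <- t(apply(is_unknown, 1, cummax)) == 1
  bad <- rowSums(seen_unknown & !is_unknown) > 0
  tax[[1]][bad]
}

print(check_unknown(tax))
```

An empty result means the table is consistent with the rule.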
The file for gene profile should contain at least two columns. The first column holds the gene names, and the remaining columns hold the gene abundances in different genomes, which is also called the gene profile. Gene names must not be duplicated. A value of 0 indicates that the gene is absent, and a positive number indicates that the gene is present. The structure of the data is shown below.
gene_profile <- simple_demo$gene[30:40, 1:4]
rownames(gene_profile) <- NULL
knitr::kable(rbind(cbind(gene_profile, "..." = "..."), "..." = "..."))
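The presence and absence convention for the gene profile can be illustrated with a short base R sketch. The toy table and its column names are hypothetical, and the checks shown are an assumption about what a sensible validation might look like, not metaFunc's own validation.

```r
# Hypothetical gene profile: gene name, then one abundance column per genome.
gene <- data.frame(gene = c("gene_1", "gene_2", "gene_3"),
                   genome_A = c(0, 2.5, 1),
                   genome_B = c(0, 0, 4),
                   stringsAsFactors = FALSE)

abund <- as.matrix(gene[, -1, drop = FALSE])

# No duplicated gene names; abundances numeric and non-negative.
stopifnot(anyDuplicated(gene[[1]]) == 0,
          is.numeric(abund),
          all(abund >= 0))

# 0 means absent, any positive value means present.
present <- abund > 0
genes_per_column <- colSums(present)
absent_everywhere <- gene[[1]][rowSums(present) == 0]

print(genes_per_column)
print(absent_everywhere)
```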
The graphical interface contains three tabs: "Upload Data", "Overview", and "Combination Block Chart". Each is described in more detail below.
To begin the analysis, you need to upload three files for functional annotation, taxonomic classification, and gene profile, in comma-separated (.csv) or tab-separated (.txt) format.
Fig 1. Graphical user interface for uploading input data
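If you prefer the function-call mode, the same two file formats can be read with base R. The sketch below is an illustration only: the helper `read_table_file()` is hypothetical, and it is an assumption that data frames read this way can be passed to the `func_data`, `tax_data`, and `gene_data` arguments of blockPlot().

```r
# Hypothetical helper: choose the reader from the file extension.
read_table_file <- function(path) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
         csv = read.csv(path, stringsAsFactors = FALSE, check.names = FALSE),
         txt = read.delim(path, stringsAsFactors = FALSE, check.names = FALSE),
         stop("Unsupported file extension: ", ext))
}

# Round trip through a temporary .csv file so the sketch is self-contained.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(gene = c("gene_1", "gene_2"),
                     func = c("funcA", "funcA;funcB")),
          tmp, row.names = FALSE)
func_data <- read_table_file(tmp)
print(func_data)
```

`check.names = FALSE` keeps column names exactly as they appear in the file, which avoids silently renaming columns that contain spaces or other special characters.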
If you do not have these three files ready, you can use the demo data files by clicking the "Load Demo (Xiao L et al.)" button. The demo dataset contains genes in the non-redundant gene catalogs related to propionate metabolism. After the data files are uploaded and checked, they will be displayed, and the result tabs will automatically appear.
Fig 2. Successfully uploaded data files
You will first be greeted with a data summary section: a bar plot showing the functions and the number of corresponding genes. For each function, both the total number of genes and the number of genes with annotated taxa are shown. A PDF of this figure can be generated and downloaded by clicking "Download Plot".
Fig 3. Data summary section
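The two numbers shown for each function in the summary bar plot can be approximated in base R. This is a sketch under an assumption: a gene is treated as having its taxa annotated when its highest rank is not "Unknown", which may differ from the rule metaFunc actually applies. The toy tables are hypothetical.

```r
# Hypothetical long-format annotation (one row per gene-function pair).
func_long <- data.frame(gene = c("g1", "g2", "g3", "g3"),
                        func = c("funcA", "funcA", "funcA", "funcB"),
                        stringsAsFactors = FALSE)
tax <- data.frame(gene = c("g1", "g2", "g3"),
                  phylum = c("Firmicutes", "Unknown", "Bacteroidetes"),
                  stringsAsFactors = FALSE)

# Assumption: "annotated" means the highest rank is not "Unknown".
annotated <- tax$gene[tax$phylum != "Unknown"]

summary_tab <- data.frame(func = sort(unique(func_long$func)),
                          stringsAsFactors = FALSE)
# Total number of genes per function.
summary_tab$total <- as.vector(table(func_long$func)[summary_tab$func])
# Number of those genes with an annotated taxon.
summary_tab$with_taxa <- vapply(
  summary_tab$func,
  function(f) sum(func_long$gene[func_long$func == f] %in% annotated),
  integer(1)
)
print(summary_tab)
```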
If you select an area on the plot and double-click it, the chart zooms to the selected area. To reset the zoom, double-click anywhere on the plot while no area is selected.
Fig 4. Zooming the plot
If you select an area in the bar plot, the corresponding data will be shown in the table below the bar plot. Functions of interest can be further selected and displayed on the right.
Fig 5. Data selection
For the selected functions listed in the table, the data and taxonomic classification of the corresponding genes are extracted, and a page with a complex combination block chart is loaded. You can change the "Taxon Split Percentage" to adjust the taxonomic block. A PDF of this figure can be generated and downloaded by clicking "Download Plot".
Fig 6. Results for the same data under different "Taxon Split Percentage" settings: 10% in the upper panel and 60% in the lower panel.
You can hide the lollipop chart by adjusting the transparency (Fig 7).
Fig 7. Combination block chart
Besides selecting an area and double-clicking to zoom, you can also click a point in the taxon-function block (Fig 8).
Fig 8. Clicking a point
After clicking, detailed information on the genes corresponding to that point in the taxon-function block is displayed (Fig 9).
Fig 9. The detailed information of genes (pointed by the red arrow in Fig 8)
The detailed taxonomic annotations of all the corresponding genes can be viewed in a table (Fig 10).
Fig 10. The data of genes
More detailed taxonomic classification can also be shown in a Sankey plot (Fig 11).
Fig 11. A Sankey plot
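To give an idea of the data behind such a Sankey plot, the sketch below builds a links table from a toy taxonomy table by counting genes between adjacent ranks. The toy data and column names are hypothetical. networkD3 is among the packages metaFunc requires, but it is an assumption that the plot is drawn with `networkD3::sankeyNetwork()` in this way; the call is only attempted if networkD3 is installed.

```r
# Hypothetical taxonomy table with two ranks.
tax <- data.frame(
  gene   = c("g1", "g2", "g3", "g4"),
  phylum = c("Firmicutes", "Firmicutes", "Firmicutes", "Bacteroidetes"),
  genus  = c("Clostridium", "Clostridium", "Unknown", "Bacteroides"),
  stringsAsFactors = FALSE
)
ranks <- names(tax)[-1]

# One link per (higher rank, lower rank) pair, weighted by gene count.
links <- do.call(rbind, lapply(seq_len(length(ranks) - 1), function(i) {
  pairs <- data.frame(
    source = paste(ranks[i], tax[[ranks[i]]], sep = ": "),
    target = paste(ranks[i + 1], tax[[ranks[i + 1]]], sep = ": "),
    stringsAsFactors = FALSE
  )
  aggregate(list(value = rep(1, nrow(pairs))), by = pairs, FUN = sum)
}))

# sankeyNetwork() expects zero-based indices into the node table.
nodes <- data.frame(name = unique(c(links$source, links$target)),
                    stringsAsFactors = FALSE)
links$source_id <- match(links$source, nodes$name) - 1
links$target_id <- match(links$target, nodes$name) - 1

if (requireNamespace("networkD3", quietly = TRUE)) {
  widget <- networkD3::sankeyNetwork(
    Links = links, Nodes = nodes,
    Source = "source_id", Target = "target_id",
    Value = "value", NodeID = "name"
  )
}
print(links)
```

Prefixing each node with its rank keeps an "Unknown" genus distinct from an "Unknown" at any other rank.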
Xiao L, Sonne SB, Feng Q, et al. High-fat feeding rather than obesity drives taxonomical and functional changes in the gut microbiota in mice. Microbiome. 2017;5(1):43.