metaFunc: An R package for comprehensive visualization of functional annotations by combining taxonomy


2021-04

metaFunc is an R package for comprehensive visualization of the functional annotations of microbiomes, combined with their taxonomy information.

1. Introduction

metaFunc is mainly used to display and interpret the functional annotation of metagenomic data. It first sorts out the taxonomic profile of all functional genes in a microbiome to obtain the community structure. Then, for each function, the corresponding genes are grouped according to their taxonomic classification, and the number of genes in each sample is calculated. Finally, the community structure and the functions are combined and shown in a complex combination block chart. Viewing them together gives researchers a full picture of which taxa carry which functions. metaFunc provides two usage modes: a graphical interface and direct function calls. The graphical interface lets users manipulate the data and customize the charts.
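The grouping and counting steps described above can be illustrated with a small base R sketch. This is not metaFunc's internal code. The toy tables and their column names (`gene`, `func`, `phylum`, `S1`, `S2`) are assumptions made purely for illustration.

```r
# Toy inputs (hypothetical): one function and one phylum per gene,
# plus gene abundances in two samples.
func <- data.frame(gene = c("g1", "g2", "g3", "g4"),
                   func = c("F1", "F1", "F2", "F2"))
tax  <- data.frame(gene = c("g1", "g2", "g3", "g4"),
                   phylum = c("Firmicutes", "Bacteroidetes",
                              "Firmicutes", "Firmicutes"))
prof <- data.frame(gene = c("g1", "g2", "g3", "g4"),
                   S1 = c(2, 0, 1, 3),
                   S2 = c(0, 5, 0, 1))

# Join the three tables on the gene name.
merged <- merge(merge(func, tax, by = "gene"), prof, by = "gene")

# A gene counts as present in a sample when its abundance is positive.
present <- as.data.frame(merged[, c("S1", "S2")] > 0)

# For each function/taxon pair, count the genes present in each sample.
gene_counts <- aggregate(present,
                         by = list(func = merged$func, phylum = merged$phylum),
                         FUN = sum)
print(gene_counts)
```

Each row of `gene_counts` corresponds to one function/taxon pair, which is roughly the information a single cell of a taxon-function block would need.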

2. Installing R/RStudio

metaFunc is a package for the R software environment. R and RStudio can be freely downloaded as follows:

Install R
Install RStudio

3. Installation

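Check for the required packages and install any that are missing.

# Packages that metaFunc depends on
packages <- c("DT", "ggplot2", "ggrepel", "networkD3", "shiny")
# Install each package only if it is not already available
for (pkg in packages) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
        install.packages(pkg, dependencies = TRUE)
    }
}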

Install metaFunc from GitHub.

if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")
library(devtools)
install_github("xiaonui/metaFunc", build_vignettes = TRUE)

4. Quick Start

Load the library

library(metaFunc)

Run metaFunc using the graphical interface.

blockShiny()

You can also call the visualization function blockPlot() directly.

data(simple_demo)
blockPlot(func_data = simple_demo$func, tax_data = simple_demo$tax, gene_data = simple_demo$gene)

5. Input data format

Before starting, you need to prepare three files: one for functional annotation, one for taxonomic classification, and one for the gene profile.

The file for functional annotation should contain exactly two columns. The first column is the gene name, and the second column is the functional annotation. Gene names must not be duplicated. If a gene corresponds to multiple functions, join them with a separator. The separator can be a semicolon, a comma, or a slash; the default is a semicolon. The structure of the data is shown below.

library(metaFunc)
data(simple_demo)
functional_annotation <- simple_demo$func[21:30,]
rownames(functional_annotation) <- NULL
knitr::kable(rbind(functional_annotation, "..." = "..."))
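As a rough illustration of the annotation format, the sketch below checks for duplicated gene names and expands genes with several functions into one row per gene/function pair. The table, its column names (`gene`, `func`), and the function labels are hypothetical, and this is not necessarily how metaFunc parses the file internally.

```r
# Hypothetical annotation table in the two-column layout described above.
annotation <- data.frame(
    gene = c("g1", "g2", "g3"),
    func = c("F1", "F1;F2", "F2"),
    stringsAsFactors = FALSE
)

# Gene names must be unique.
stopifnot(!anyDuplicated(annotation$gene))

# Split each annotation on the separator (a semicolon by default).
func_list <- strsplit(annotation$func, ";", fixed = TRUE)

# Expand to one row per gene/function pair.
long <- data.frame(
    gene = rep(annotation$gene, lengths(func_list)),
    func = unlist(func_list),
    stringsAsFactors = FALSE
)
print(long)
```

If the file uses a comma or a slash instead, the second argument of `strsplit()` would change accordingly.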

The file for taxonomic classification should contain at least three columns. The first column holds the gene names, and the remaining columns hold the taxonomic classification. Gene names must not be duplicated. An unknown taxonomic classification is labeled Unknown. If a gene is labeled Unknown at a certain taxonomic rank, all lower ranks should be Unknown too. The structure of the data is shown below.

taxonomic_classification <- head(simple_demo$tax[, c(1,2,3,7)])
rownames(taxonomic_classification) <- NULL
knitr::kable(rbind(cbind(taxonomic_classification, "..." = "..."), "..." = "..."))
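The rule that Unknown must propagate to all lower ranks can be checked before loading the data. The following is a minimal sketch, assuming the rank columns are ordered from the highest rank to the lowest; the table and the helper `breaks_rule()` are hypothetical and not part of metaFunc.

```r
# Hypothetical taxonomy table: gene names first, then ranks from high to low.
tax <- data.frame(
    gene   = c("g1", "g2", "g3"),
    phylum = c("Firmicutes", "Firmicutes", "Unknown"),
    genus  = c("Clostridium", "Unknown", "Lactobacillus"),
    stringsAsFactors = FALSE
)

# TRUE when an "Unknown" rank is followed by a known lower rank.
breaks_rule <- function(ranks) {
    unknown <- ranks == "Unknown"
    # cummax() stays at 1 once the first "Unknown" has been seen
    any(cummax(unknown) == 1 & !unknown)
}

# Apply the check to the rank columns of every row.
bad <- apply(as.matrix(tax[, -1]), 1, breaks_rule)
offending <- tax$gene[bad]
print(offending)
```

Here `g3` is reported because its phylum is Unknown while its genus is not; `g2` is valid because only the lowest rank is Unknown.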

The file for the gene profile should contain at least two columns. The first column holds the gene names, and the remaining columns hold the gene abundances in different genomes, which together are called the gene profile. Gene names must not be duplicated. A value of 0 means the gene is absent, and a positive number means the gene is present. The structure of the data is shown below.

gene_profile <- simple_demo$gene[30:40, 1:4]
rownames(gene_profile) <- NULL

knitr::kable(rbind(cbind(gene_profile, "..." = "..."), "..." = "..."))
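The format rules for the gene profile translate into a few simple checks. The sketch below uses a hypothetical table with assumed column names (`gene`, `genomeA`, `genomeB`) and then derives a presence/absence matrix from the abundances; it is an illustration, not metaFunc's own validation code.

```r
# Hypothetical gene profile: gene names first, then one column per genome.
profile <- data.frame(
    gene    = c("g1", "g2", "g3"),
    genomeA = c(0, 2, 1),
    genomeB = c(3, 0, 0)
)

abund <- as.matrix(profile[, -1])

# Basic checks that follow from the format rules above.
stopifnot(
    !anyDuplicated(profile$gene),  # unique gene names
    is.numeric(abund),             # abundances are numbers
    all(abund >= 0)                # no negative values
)

# Presence/absence: any positive abundance counts as present.
present <- abund > 0
genes_per_genome <- colSums(present)
print(genes_per_genome)
```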

6. Run graphic visualization interface

The interface contains three tabs: "Upload Data", "Overview", and "Combination Block Chart". Each is described in detail below.

6.1 Upload Data

To begin the analysis, you need to upload three files: functional annotation, taxonomic classification, and gene profile. Each file can be in comma-separated (.csv) or tab-separated (.txt) format.
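When calling the functions directly rather than using the interface, the same two file formats can be read with base R. The helper `read_input()` below is hypothetical and not part of metaFunc; it assumes each file has a header row and picks the field separator from the file extension.

```r
# Hypothetical helper: choose the field separator from the file extension.
read_input <- function(path) {
    ext <- tolower(tools::file_ext(path))
    sep <- switch(ext,
                  csv = ",",
                  txt = "\t",
                  stop("unsupported file extension: ", ext))
    read.table(path, header = TRUE, sep = sep,
               stringsAsFactors = FALSE, check.names = FALSE)
}

# Demonstration with a temporary tab-separated file.
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(gene = c("g1", "g2"), func = c("F1", "F1;F2")),
            tmp, sep = "\t", quote = FALSE, row.names = FALSE)
func_data <- read_input(tmp)
print(func_data)
```

Data frames read this way could then presumably be passed to `blockPlot()` through the `func_data`, `tax_data`, and `gene_data` arguments shown in the Quick Start.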

If you do not have these three files ready, you can use the demo data files by clicking the "Load Demo (Xiao L et al.)" button. The demo dataset contains genes in the non-redundant gene catalogs related to propionate metabolism. After the data files are uploaded and checked, they will be displayed, and the result tabs will automatically appear.

6.2 Data Visualization Overview

The tab opens with a data summary: a bar plot showing the functions and the number of corresponding genes. For each function, both the total number of genes and the number of genes with an annotated taxon are shown. A PDF of this figure can be generated and downloaded by clicking "Download Plot".

If you select an area on the plot and double-click it, the chart zooms to the selected area. To reset the zoom, double-click anywhere on the plot while no area is selected.

When an area of the bar plot is selected, the corresponding data are shown in the table below the bar plot. Functions of interest can then be selected from this table and are displayed on the right.
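The two numbers behind each bar can be sketched in base R. The table below is hypothetical, and treating a gene as "annotated" when its highest rank is not Unknown is an assumption for this example, not a statement about metaFunc's exact criterion.

```r
# Hypothetical table: one row per gene/function pair,
# with the highest taxonomic rank of each gene.
genes <- data.frame(
    gene   = c("g1", "g2", "g3", "g4"),
    func   = c("F1", "F1", "F1", "F2"),
    phylum = c("Firmicutes", "Unknown", "Bacteroidetes", "Unknown"),
    stringsAsFactors = FALSE
)

# Total number of genes per function.
total <- table(genes$func)

# Genes per function with an annotated taxon (assumed: phylum is not "Unknown").
# Reusing the factor levels keeps functions that have zero annotated genes.
annotated <- table(factor(genes$func[genes$phylum != "Unknown"],
                          levels = names(total)))

summary_counts <- data.frame(
    func      = names(total),
    total     = as.vector(total),
    annotated = as.vector(annotated)
)
print(summary_counts)
```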

6.3 Combination Block Chart

For the selected functions listed in the table, the corresponding gene data and taxonomic classification are extracted, and a page is loaded with a complex combination block chart. You can change the "Taxon Split Percentage" to adjust the taxonomic block. A PDF of this figure can be generated and downloaded by clicking "Download Plot".

You can hide the lollipop chart by adjusting the transparency (Fig 7).

Besides selecting an area and double-clicking to zoom, you can also click a point in the taxon-function block (Fig 8).

After clicking, the detailed information on the genes corresponding to that point in the taxon-function block is displayed below the chart (Fig 9).

The detailed taxonomic annotations of all the corresponding genes can be viewed in a table (Fig 10).

More detailed taxonomic classification can also be shown in a Sankey plot (Fig 11).

Reference:

Xiao L, Sonne SB, Feng Q, et al. High-fat feeding rather than obesity drives taxonomical and functional changes in the gut microbiota in mice. Microbiome. 2017;5(1):43.

xiaonui/metaFunc documentation built on April 9, 2021, 9:50 a.m.
