VennDetail An R package for visualizing and extracting details of multi-sets intersection

1. Introduction

Visualizing and extracting unique (disjoint) or overlapping subsets of multiple gene datasets are a frequently performed task for bioinformatics. Although various packages and web applications are available, no R package offering functions to extract and combine details of these subsets with user datasets in data frame is available. Moreover, graphical visualization is usually limited to six or less gene datasets and a novel method is required to properly show the subset details.We have developed VennDetail, an R package to generate high-quality Venn-Pie charts and to allow extraction of subset details from input datasets.

2. Software Usage

2.1 Installation

The package can be installed as `` {r install,eval = FALSE} if (!requireNamespace("BiocManager")) install.packages("BiocManager")BiocManager::install("VennDetail")

### 2.2 Data Input
```r
library(VennDetail)
data(T2DM)

T2DM data include three sets of differentially expressed genes (DEGs) from the publication by Hinder et al [1]. The three DEG datasets were obtained in three different tissues, kidney Cortex, kidney glomerula, and sciatic nerve, by comparing db/db diabetic mice and db/db mice with pioglitazone treatment. Differential expression was determined by using Cuffdiff with a false discovery rate (FDR) < 0.05.

2.3 Quick Tour

``` {r quick} ven <- venndetail(list(Cortex = T2DM$Cortex$Entrez, SCN = T2DM$SCN$Entrez, Glom = T2DM$Glom$Entrez))

_VennDetail_ supports three different types of Venn diagram display formats
```  {r fig1, fig.width = 6, fig.height = 5, fig.align = "center"}
##traditional venn diagram
plot(ven)

``` {r fig2, fig.width = 6, fig.height = 5, fig.align = "center"}

Venn-Pie format

plot(ven, type = "vennpie")

```  {r fig3, fig.width = 6, fig.height = 5, fig.align = "center"}
##Upset format
plot(ven, type = "upset")

2.4 Main Functions

-- venndetail uses a list of vectors as input to construct the shared or disjoint subsets Venn object. venndetail accepts a list of vector as input and returns a Venn object for the following analysis. Users can also use merge function to merge two Venn objects together to save time.

-- plot generates figures with different layouts with type parameter. plot function also provides lots of parameters for users to modify the figures.

-- getSet function provides a way to extract subsets from the main result along with any available annotations. The parameter subset asks the users to give the subset names to extract. It accepts a vector of subset names. Here, we will show how the DEGs shared by all three tissues as well as those that are only included by SCN tissue can be extracted.

## List the subsets name
detail(ven) 
head(getSet(ven, subset = c("Shared", "SCN")), 10)

-- result function can be used to extract and export all of the subsets for further processing. We currently support two different formats of result (long and wide formats).

## long format: the first column lists the subsets name, and the second column
## shows the genes included in the subsets
head(result(ven))
## wide format: the first column lists all the genes, the following columns
## display the groups name (three tissues) and the last column is the total 
## number of the gene shared by groups.
head(result(ven, wide = TRUE))

-- vennpie creates a Venn-pie diagram with unique or common subsets in multiple ways such as highlighting unique or shared subsets. The following example illustrates how to show the unique subsets on the venn-pie plots.

vennpie(ven, any = 1, revcolor = "lightgrey")

The parameters any and group provide two different ways to highlight the subsets. any determines the subsets to show up in the number of groups (1: those included in just one group; 2: those shared by any two groups). group asks users to specify the subsets to be highlighted. Users may check the sets name by using detail function. Since the example datasets used in this vignette include only a small number of shared genes all across three sets (n=8), it may be a little hard to see the shared subset (grey), particularly in the Cortex group (the inner-most circle). .

vennpie(ven, log = TRUE)

When we have five datasets, we can use vennpie to show the sets include elements from at least four datasets. Below show the reults with five datasets as input.

set.seed(123)
A <- sample(1:1000, 400, replace = FALSE)
B <- sample(1:1000, 600, replace = FALSE)
C <- sample(1:1000, 350, replace = FALSE)
D <- sample(1:1000, 550, replace = FALSE)
E <- sample(1:1000, 450, replace = FALSE)
venn <- venndetail(list(A = A, B = B, C= C, D = D, E = E))
vennpie(venn, min = 4)

-- getFeature allows users to combine the details of any or all subsets from the main result with users’ other datasets, containing a list of data frames, and to export the combined data as a data frame. In the following example, we will demonstrate how to add other available annotation in the input data (T2DM) such as log2FC and FDR values for the shared genes among these three tissues.

head(getFeature(ven, subset = "Shared", rlist = T2DM))

-- dplot shows the details of these subsets with bar-plot.

dplot(ven, order = TRUE, textsize = 4)

2.5 Shiny web app

A shiny web application is here: VennDetail Note: Only support five input datasets now

3 Contact information

For any questions please contact [email protected]

4 Reference

[1] Hinder LM, Park M, Rumora AE, Hur J, Eichinger F, Pennathur S, Kretzler M, Brosius FC 3rd, Feldman EL.Comparative RNA-Seq transcriptome analyses reveal distinct metabolic pathways in diabetic nerve and kidney disease. J Cell Mol Med. 2017 Sep;21(9):2140-2152. doi: 10.1111/jcmm.13136. Epub 2017 Mar 8.



guokai8/VennDetail documentation built on Dec. 29, 2019, 6:04 a.m.