knitr::opts_chunk$set(echo = FALSE, fig.align="center", fig.show="hold" , out.width = "50%") 

Overview


FungiExpresZ is a browser-based user interface (developed in R Shiny) to analyze and visualize gene expression data. It allows users to visualize their own gene expression data as well as ~16,000 pre-processed SRA fungal gene expression samples. Users can also merge their data with SRA data to perform combined analysis and visualization. By uploading a gene expression matrix (a .txt file where rows are genes and columns are samples), a user can generate 12 different exploratory visualizations and 6 different GO visualizations. Optionally, a user can upload multiple gene groups and sample groups to compare between them. A user can select a set of genes directly from a scatter plot, line plot or heatmap and pass them on for GO analysis and GO visualization. GO analysis and GO visualization are available for more than 100 fungal species and are implemented through the popular R package clusterProfiler [@Yu2012].

Key features


About 16,000 NCBI-SRA samples from 18 different fungal species

FungiExpresZ provides normalized gene expression values (FPKM) for ~16,000 SRA samples. A user can select one or more SRA samples for visualization. SRA data can be searched by species, genotype, strain or free text, which is matched against several SRA columns to find relevant hits.

| Species                                         | SRA sample counts |
|:------------------------------------------------|------------------:|
| Aspergillus flavus NRRL3357                     |               348 |
| Aspergillus fumigatus Af293                     |               242 |
| Aspergillus nidulans FGSC A4                    |               151 |
| Aspergillus niger CBS 513.88                    |               253 |
| Botrytis cinerea B05.10                         |               142 |
| Candida albicans SC5314                         |               639 |
| Candida auris B8 441                            |                46 |
| Candida glabrata CBS 138                        |               126 |
| Cryptococcus gattii NT-10                       |                11 |
| Cryptococcus neoformans var. neoformans JEC21   |               385 |
| Fusarium fujikuroi IMI 58289                    |                74 |
| Fusarium graminearum PH-1                       |               344 |
| Fusarium oxysporum Fo47                         |               131 |
| Fusarium proliferatum ET1                       |                 4 |
| Fusarium verticillioides 7600                   |               179 |
| Saccharomyces cerevisiae                        |             11872 |
| Talaromyces marneffei ATCC 18224                |                26 |
| Ustilago maydis 521                             |                71 |

NOTE: We are continuously processing fungal SRA data. This table will be updated as we add new data.
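The free-text search described above can be pictured as a case-insensitive match of one query against several metadata columns. The following is only a minimal sketch in base R: the `sra_meta` table, its column names, the run identifiers and the `search_sra()` helper are all hypothetical and are not taken from the FungiExpresZ source code.

```r
# Hypothetical SRA metadata table; identifiers and values are made up.
sra_meta <- data.frame(
  run      = c("RUN_0001", "RUN_0002", "RUN_0003"),
  species  = c("Candida albicans SC5314",
               "Aspergillus niger CBS 513.88",
               "Candida albicans SC5314"),
  genotype = c("wild type", "wild type", "deletion mutant"),
  strain   = c("SC5314", "CBS 513.88", "SC5314")
)

# Hypothetical helper: keep rows where the query occurs in ANY of the
# chosen columns (literal, case-insensitive substring match).
search_sra <- function(meta, query, columns = names(meta)) {
  hit <- Reduce(`|`, lapply(meta[columns], function(col) {
    grepl(tolower(query), tolower(col), fixed = TRUE)
  }))
  meta[hit, , drop = FALSE]
}

print(search_sra(sra_meta, "deletion"))                   # free-text search
print(search_sra(sra_meta, "sc5314", columns = "strain")) # one column only
```

Restricting `columns` mimics a search by a single field such as strain, while the default scans every column as a free-text query would.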

Visualize user supplied gene expression data with or without integration of SRA data

Users can analyze and visualize their own data by uploading a .txt/.csv file (columns are samples and rows are genes). Optionally, user data can be integrated with selected SRA data for combined analysis and visualization.

Visualize multiple gene groups and sample groups in a single plot

Optionally, users can upload sample groups (e.g. replicates, control vs treatment, wild type vs deletion) and multiple gene groups to compare between them. Group information needs to be uploaded only once and can then be mapped to the fill and facet attributes of several plots to build more complex visualizations.

Twelve different data exploratory visualizations

FungiExpresZ provides a user-friendly, browser-based interface for generating 12 different publication-ready, ggplot2-based visualizations. Users can adjust common plot attributes such as plot title, axis titles, font size, plot theme, legend size and legend position, as well as a few plot-specific attributes. The currently available plots are ...

  1. Scatter Plot
  2. Multi-Scatter Plot
  3. Corr Heat Box
  4. Density Plot
  5. Histogram
  6. Joy Plot
  7. Box Plot
  8. Violin Plot
  9. Bar Plot
  10. PCA Plot
  11. Line Plot
  12. Heatmap

Supports Gene Ontology (GO) enrichment and visualizations for more than 100 different fungal species

FungiExpresZ allows users to define gene sets directly from a plot (scatter plot, line plot or heatmap) to perform gene ontology enrichment and visualization. The available GO visualizations are ...

  1. Emap plot
  2. Cnet plot
  3. Dot plot
  4. Bar plot
  5. Heat plot
  6. Upset plot

Installation


FungiExpresZ can also be installed locally as an R package or a Docker image. Please follow the instructions given on GitHub to install it on a local computer.

Example data


The plots in this document were generated from cartoon gene expression data.

Expression matrix

The expression matrix can be downloaded from the file given here [Download]. It contains 4 samples, each with 3 replicates. The column names and their descriptions are given in the table below.

| Column names  | Description                               |
|:--------------|------------------------------------------:|
| gene_id       | Gene id, unique to each row               |
| Control_Rep.A | Normalised FPKM values                    |
| Control_Rep.B | Normalised FPKM values                    |
| Control_Rep.C | Normalised FPKM values                    |
| Treat1_Rep.A  | Normalised FPKM values                    |
| Treat1_Rep.B  | Normalised FPKM values                    |
| Treat1_Rep.C  | Normalised FPKM values                    |
| Treat2_Rep.A  | Normalised FPKM values                    |
| Treat2_Rep.B  | Normalised FPKM values                    |
| Treat2_Rep.C  | Normalised FPKM values                    |
| Treat3_Rep.A  | Normalised FPKM values                    |
| Treat3_Rep.B  | Normalised FPKM values                    |
| Treat3_Rep.C  | Normalised FPKM values                    |
| Control_Mean  | Mean FPKM of control replicates A, B, C   |
| Treat1_Mean   | Mean FPKM of Treatment1 replicates A, B, C |
| Treat2_Mean   | Mean FPKM of Treatment2 replicates A, B, C |
| Treat3_Mean   | Mean FPKM of Treatment3 replicates A, B, C |
| fc_treat1     | log2FC(Treat1_Mean/Control_Mean)          |
| fc_treat2     | log2FC(Treat2_Mean/Control_Mean)          |
| fc_treat3     | log2FC(Treat3_Mean/Control_Mean)          |
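To make the column layout concrete, here is a minimal base R sketch that builds a toy matrix in the same shape, derives the mean and fold change columns, and writes a file suitable for upload. The expression values are random, only the control and one treatment are included for brevity, and the tab delimiter and output file name are assumptions rather than documented requirements.

```r
# Toy expression matrix: genes in rows, samples in columns (values are made up).
set.seed(1)
n_genes <- 5
expr <- data.frame(gene_id = paste0("gene_", seq_len(n_genes)))
for (cond in c("Control", "Treat1")) {
  for (r in c("A", "B", "C")) {
    expr[[paste0(cond, "_Rep.", r)]] <- round(runif(n_genes, 1, 100), 2)
  }
}

# Mean FPKM per condition across its three replicates.
expr$Control_Mean <- rowMeans(expr[, grep("^Control_Rep", names(expr))])
expr$Treat1_Mean  <- rowMeans(expr[, grep("^Treat1_Rep", names(expr))])

# log2 fold change of the treatment mean over the control mean.
expr$fc_treat1 <- log2(expr$Treat1_Mean / expr$Control_Mean)

# Write a tab-delimited .txt file (delimiter is an assumption of this sketch).
out_file <- file.path(tempdir(), "expression_matrix.txt")
write.table(expr, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
print(expr)
```

A positive `fc_treat1` means the gene is higher in the treatment than in the control; a value of 1 corresponds to a two-fold increase.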

Sample groups

The sample group file contains two columns.

| Columns       | Description                                                                                                                                         |
|:--------------|----------------------------------------------------------------------------------------------------------------------------------------------------:|
| group_name    | User-given name for each sample (column) group. Values in this column can be repeated.                                                              |
| group_members | Values in this column must be column names that identify samples in the expression matrix file. Each value must be unique in this column.          |

Here, we have grouped samples by replicates. The file can be downloaded from here [Download].
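As an illustration of the two-column layout, the sketch below derives a sample group table from the replicate column names of the example expression matrix and checks the two rules stated in the table. It uses base R only; taking the text before `_Rep.` as the group name is an assumption that fits the example column names, not a rule of the application.

```r
# Sample columns as named in the example expression matrix.
samples <- paste0(rep(c("Control", "Treat1", "Treat2", "Treat3"), each = 3),
                  "_Rep.", c("A", "B", "C"))

# One row per sample: group_name repeats, group_members is unique.
sample_groups <- data.frame(
  group_name    = sub("_Rep\\..*$", "", samples),
  group_members = samples
)

# Checks mirroring the rules in the table above.
stopifnot(!anyDuplicated(sample_groups$group_members))    # members are unique
stopifnot(all(sample_groups$group_members %in% samples))  # members are sample columns

print(sample_groups)
```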

Gene groups

The gene group file contains two columns.

| Columns       | Description                                                                                                                                          |
|:--------------|-----------------------------------------------------------------------------------------------------------------------------------------------------:|
| group_name    | User-given name for each gene (row) group. Values in this column can be repeated.                                                                    |
| group_members | Values in this column must come from the first column (the row identity) of the expression matrix file. Each value must be unique in this column.   |

Here, we have grouped genes by ...

1) Fold change comparison - Treatment(1,2 or 3)/Control.

2) Fold change status in two different comparisons.

Gene groups files can be downloaded from the links given in the table below.

| Gene group files             | Description                                                        |
|:----------------------------:|:------------------------------------------------------------------:|
| gene group file 1 [Download] | gene groups by fold change Treat1/control                          |
| gene group file 2 [Download] | gene groups by fold change Treat2/control                          |
| gene group file 3 [Download] | gene groups by fold change Treat3/control                          |
| gene group file 4 [Download] | gene groups by fold change status Treat1/control vs Treat2/control |
| gene group file 5 [Download] | gene groups by fold change status Treat2/control vs Treat3/control |
| gene group file 6 [Download] | gene groups by fold change status Treat1/control vs Treat3/control |
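A gene group file of the first kind (grouping by fold change) could be produced along the following lines. This is a hedged base R sketch: the fold change values are invented, and the cutoff of 1 and the group labels `up`, `down` and `no_change` are assumptions chosen for illustration, not the criteria used for the downloadable files.

```r
# Made-up log2 fold changes for six genes.
fc <- data.frame(
  gene_id   = paste0("gene_", 1:6),
  fc_treat1 = c(2.3, -1.8, 0.2, 1.1, -0.4, -3.0)
)

cutoff <- 1  # assumed threshold on |log2 fold change|

# Two-column gene group table: group_name may repeat, group_members is unique.
gene_groups <- data.frame(
  group_name    = ifelse(fc$fc_treat1 >= cutoff, "up",
                    ifelse(fc$fc_treat1 <= -cutoff, "down", "no_change")),
  group_members = fc$gene_id
)

print(table(gene_groups$group_name))
```

Grouping by fold change status in two comparisons would follow the same pattern, with the group name built from the labels of both comparisons.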

Exploratory example plots

The plots below can be generated by uploading the data files given above to FungiExpresZ.

Scatter plot

A scatter plot can be used to display the pairwise correlation between 2 samples. Users can color the dots either by density (default) or by gene groups.

knitr::include_graphics(c("_about_files/example_plots/scatter_1.png" ,
                        "_about_files/example_plots/scatter_3.png"))

Multi-Scatter plot

A multi-scatter plot can be used to display pairwise correlation between more than 2 samples (recommended for showing correlation between replicate samples). The lower half of the plot shows scatter plots, while the upper half shows the corresponding correlation values. The diagonal displays the distribution of each sample as a density plot. As the number of samples increases, the number of panels drawn in a single graphical device grows quadratically (n samples give an n × n grid), which makes the image busy and hard to interpret. Therefore, a multi-scatter plot is restricted to a maximum of 5 samples. The correlation heat-box (CorrHeatBox) is an alternative that shows correlation as a heatmap for more than 5 samples.

knitr::include_graphics(c("_about_files/example_plots/multi_scatter_plot.png"))

CorrHeatBox

CorrHeatBox is useful for displaying pairwise correlations in the form of a heatmap.

knitr::include_graphics(c("_about_files/example_plots/corr_heatbox_1.png",
                          "_about_files/example_plots/corr_heatbox_2.png"
                          ))
knitr::include_graphics(c("_about_files/example_plots/corr_heatbox_4.png",
                          "_about_files/example_plots/corr_heatbox_5.png"
                          ))
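The values behind such a heatmap are a sample-by-sample correlation matrix. The base R sketch below computes one on simulated data; the choice of Pearson correlation and the simulated noise levels are assumptions of this example and do not describe the settings used by FungiExpresZ.

```r
set.seed(42)
n_genes <- 200

# Shared expression signal plus sample-specific noise (all values simulated).
signal <- rnorm(n_genes, mean = 5, sd = 2)
expr <- data.frame(
  Control_Rep.A = signal + rnorm(n_genes, sd = 0.3),
  Control_Rep.B = signal + rnorm(n_genes, sd = 0.3),
  Treat1_Rep.A  = signal + rnorm(n_genes, sd = 1.5),
  Treat1_Rep.B  = signal + rnorm(n_genes, sd = 1.5)
)

# Symmetric matrix with one row and one column per sample.
corr <- cor(expr, method = "pearson")
print(round(corr, 2))
```

Each cell of the heatmap corresponds to one entry of `corr`; the diagonal is 1 because every sample correlates perfectly with itself.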

Density plot

A density plot can be used to display the distribution of individual samples, sample groups or gene groups.

knitr::include_graphics(c("_about_files/example_plots/density_plot_1.png","_about_files/example_plots/density_plot_3.png"))
knitr::include_graphics(c("_about_files/example_plots/density_plot_2.png",
                          "_about_files/example_plots/density_plot_5.png"
                          ))

Histogram

A histogram can be used to display the frequency counts of individual samples, sample groups or gene groups.

knitr::include_graphics(c("_about_files/example_plots/histogram_1.png" , 
                          "_about_files/example_plots/histogram_2.png"))
knitr::include_graphics(c("_about_files/example_plots/histogram_4.png" , 
                          "_about_files/example_plots/histogram_3.png"))

Joy plot

A joy plot can be used to display the distribution of individual samples, sample groups or gene groups. By separating multiple variables along the Y axis, it overcomes a limitation of the normal density plot.

knitr::include_graphics(c("_about_files/example_plots/joy_plot_1.png" , 
                          "_about_files/example_plots/joy_plot_4.png"))
knitr::include_graphics(c("_about_files/example_plots/joy_plot_2.png",
                          "_about_files/example_plots/joy_plot_3.png"
                          ))

Box plot

A box plot can be used to display the distribution of observations and the quantiles of individual samples, sample groups or gene groups.

knitr::include_graphics(c("_about_files/example_plots/box_plot_3.png" ,
                          "_about_files/example_plots/box_plot_2.png"
                          ))
knitr::include_graphics(c("_about_files/example_plots/box_plot_1.png",
                          "_about_files/example_plots/box_plot_4.png"))

Violin plot

Similar to a box plot, a violin plot can be used to display the distribution of observations and the quantiles of individual samples, sample groups or gene groups.

knitr::include_graphics(c("_about_files/example_plots/violin_plot_1.png" ,
                          "_about_files/example_plots/violin_plot_2.png" 
                          ))
knitr::include_graphics(c("_about_files/example_plots/violin_plot_3.png",
                          "_about_files/example_plots/violin_plot_4.png"))

Bar plot

A bar plot can be used to display the expression of individual genes across multiple samples, sample groups and gene groups.

knitr::include_graphics(c("_about_files/example_plots/barplot_1.png",
                          "_about_files/example_plots/barplot_3.png" 
                           ))

PCA plot

A PCA plot can be used to display similarities and differences between samples and sample groups using principal components.

knitr::include_graphics(c("_about_files/example_plots/pca_plot_1.png" ,
                          "_about_files/example_plots/pca_plot_2.png"
                          ))
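For readers who want to see how sample coordinates on principal components are obtained, here is a minimal base R sketch using `prcomp()`. The data are simulated, and centering without scaling is an assumption of this example; the preprocessing applied by FungiExpresZ is not described in this document.

```r
set.seed(7)
n_genes <- 100

# Simulated log-scale expression: the treatment is shifted relative to control.
control <- rnorm(n_genes, mean = 5)
treat   <- control + rnorm(n_genes, mean = 1)

mat <- cbind(
  Control_Rep.A = control + rnorm(n_genes, sd = 0.2),
  Control_Rep.B = control + rnorm(n_genes, sd = 0.2),
  Treat1_Rep.A  = treat + rnorm(n_genes, sd = 0.2),
  Treat1_Rep.B  = treat + rnorm(n_genes, sd = 0.2)
)

# prcomp() expects observations in rows, so transpose to samples x genes.
pca <- prcomp(t(mat), center = TRUE, scale. = FALSE)

# Fraction of total variance captured by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# PC1/PC2 coordinates, one row per sample; these are the points of a PCA plot.
scores <- pca$x[, 1:2]
print(round(var_explained, 3))
print(round(scores, 2))
```

Samples that lie close together in `scores` have similar expression profiles, so replicates of one condition are expected to cluster.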

Line plot

A line plot can be used to display the trend of genes across multiple samples. Users can group observations either by k-means clustering or by predefined gene groups.

knitr::include_graphics(c("_about_files/example_plots/line_plot_1.png" , 
                          "_about_files/example_plots/line_plot_2.png"))
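Grouping gene profiles by k-means can be sketched with base R's `kmeans()`. The profiles below are simulated (ten rising and ten falling genes), and the number of clusters and `nstart` value are assumptions of this example rather than the application's defaults.

```r
set.seed(123)

# Ten genes rising across samples and ten genes falling (simulated profiles).
up   <- matrix(rep(c(1, 2, 3, 4), each = 10), nrow = 10) + rnorm(40, sd = 0.1)
down <- matrix(rep(c(4, 3, 2, 1), each = 10), nrow = 10) + rnorm(40, sd = 0.1)

profiles <- rbind(up, down)
rownames(profiles) <- paste0("gene_", 1:20)
colnames(profiles) <- c("Control_Mean", "Treat1_Mean", "Treat2_Mean", "Treat3_Mean")

# Cluster genes (rows) by the shape of their profile across samples.
km <- kmeans(profiles, centers = 2, nstart = 10)

# Gene ids per cluster; each cluster would be drawn as one panel or color.
print(split(rownames(profiles), km$cluster))
```

Cluster numbers are arbitrary labels, so which group is called 1 or 2 can differ between runs even when the membership is the same.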

Heatmap

A heatmap can be used to display the trend of genes across multiple samples. Users can group genes and samples either by k-means clustering or by predefined gene groups or sample groups.

knitr::include_graphics(c("_about_files/example_plots/heatmap_1.png" , 
                          "_about_files/example_plots/heatmap_2.png"))
knitr::include_graphics(c("_about_files/example_plots/heatmap_3.png" , 
                          "_about_files/example_plots/heatmap_5.png"))
knitr::include_graphics(c("_about_files/example_plots/heatmap_4.png" , 
                          "_about_files/example_plots/heatmap_6.png"))

GO example plots

Users can select genes or gene clusters from a scatter plot, line plot or heatmap and pass them to GO enrichment followed by GO visualization. If the data are uploaded by the user, the gene IDs (first column of the file) must match the gene IDs of the selected species.
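Before running GO enrichment on uploaded data, it can help to check how many gene IDs match the annotation of the selected species. The base R sketch below shows one way to do that; the ID format, the `annotated_ids` vector and the exact, case-sensitive matching are all assumptions of this example and not a description of how FungiExpresZ performs the match.

```r
# Hypothetical gene IDs annotated for the selected species.
annotated_ids <- sprintf("GENE_%04d", 1:10)

# IDs taken from the first column of an uploaded file (made up).
user_ids <- c("GENE_0002", "GENE_0005", "gene_0007", "XYZ_1")

matched    <- intersect(user_ids, annotated_ids)  # usable for GO enrichment
unmatched  <- setdiff(user_ids, annotated_ids)    # would be ignored
match_rate <- length(matched) / length(user_ids)

print(matched)
print(unmatched)
print(match_rate)
```

A low match rate usually points to a different ID scheme or a different species or strain than the one selected.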

GO dot plot

knitr::include_graphics(c("_about_files/example_plots/go_dotplot.png"))

GO bar plot

knitr::include_graphics(c("_about_files/example_plots/go_barplot.png"))

GO heat plot

knitr::include_graphics(c("_about_files/example_plots/go_heatplot.png"))

GO EMAP plot

knitr::include_graphics(c("_about_files/example_plots/go_emapplot.png"))

GO CNET plot

knitr::include_graphics(c("_about_files/example_plots/go_cnetplot.png"))

GO UpSet plot

knitr::include_graphics(c("_about_files/example_plots/go_upsetplot.png"))

References



cparsania/FungiExpresZ documentation built on March 15, 2024, 5:48 p.m.