knitr::opts_chunk$set( collapse = TRUE, comment = "#>", dev = "CairoPNG", fig.path = "man/figures/README-", out.width = "100%" ) scholarUrl <- "https://scholar.google.com/scholar?oi=bibs&hl=en&cites=5439940108464463894" scholarPage <- rvest::read_html(scholarUrl) citeElem <- rvest::html_elements(scholarPage, "#gs_ab_md .gs_ab_mdw") citations <- paste0("~", stringr::str_extract(rvest::html_text(citeElem), "[0-9]{1,3}"))
:package: microViz
is an R package for analysis and visualization of microbiome sequencing data.
:hammer: microViz
functions are intended to be beginner-friendly but flexible.
:microscope: microViz
extends or complements popular microbial ecology packages, including phyloseq
, vegan
, & microbiome
.
:paperclip: This website is the best place for documentation and examples: https://david-barnett.github.io/microViz/
This ReadMe shows a few example analyses
The Getting Started guide shows more example analyses and gives advice on using microViz with your own data
The Reference page lists all functions and links to help pages and examples
The News page describes important changes in new microViz package versions
The Articles pages give tutorials and further examples
Post on GitHub discussions if you have questions/requests
microViz is not (yet) available from CRAN. You can install microViz from R Universe, or from GitHub.
I recommend you first install the Bioconductor dependencies using the code below.
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install(c("phyloseq", "microbiome", "ComplexHeatmap"), update = FALSE)
install.packages( "microViz", repos = c(davidbarnett = "https://david-barnett.r-universe.dev", getOption("repos")) )
I also recommend you install the following suggested CRAN packages.
install.packages("ggtext") # for rotated labels on ord_plot() install.packages("ggraph") # for taxatree_plots() install.packages("DT") # for tax_fix_interactive() install.packages("corncob") # for beta binomial models in tax_model()
# Installing from GitHub requires the remotes package install.packages("remotes") # Windows users will also need to have RTools installed! http://jtleek.com/modules/01_DataScientistToolbox/02_10_rtools/ # To install the latest version: remotes::install_github("david-barnett/microViz") # To install a specific "release" version of this package, e.g. an old version remotes::install_github("david-barnett/microViz@0.12.0")
:apple: macOS users - might need to install xquartz to make the heatmaps work (to do this with homebrew, run the following command in your mac's Terminal: brew install --cask xquartz
:package: I highly recommend using renv for managing your R package installations across multiple projects.
:whale: For Docker users an image with the main branch installed is available at: https://hub.docker.com/r/barnettdavid/microviz-rocker-verse
:date: microViz is tested to work with R version 4.* on Windows, MacOS, and Ubuntu 20. R version 3.6.* should probably work, but I don't fully test this.
library(microViz)
microViz provides a Shiny app for an easy way to start exploring your microbiome data: all you need is a phyloseq object.
# example data from corncob package pseq <- microViz::ibd %>% tax_fix() %>% phyloseq_validate()
ord_explore(pseq) # gif generated with microViz version 0.7.4 (plays at 1.75x speed)
library(phyloseq) library(dplyr) library(ggplot2)
# get some example data data("dietswap", package = "microbiome") # create a couple of numerical variables to use as constraints or conditions dietswap <- dietswap %>% ps_mutate( weight = recode(bmi_group, obese = 3, overweight = 2, lean = 1), female = if_else(sex == "female", true = 1, false = 0), african = if_else(nationality == "AFR", true = 1, false = 0) ) # add a couple of missing values to show how microViz handles missing data sample_data(dietswap)$african[c(3, 4)] <- NA
You have quite a few samples in your phyloseq object, and would like to visualize their compositions. Perhaps these example data differ by participant nationality?
dietswap %>% comp_barplot( tax_level = "Genus", n_taxa = 15, other_name = "Other", taxon_renamer = function(x) stringr::str_remove(x, " [ae]t rel."), palette = distinct_palette(n = 15, add = "grey90"), merge_other = FALSE, bar_outline_colour = "darkgrey" ) + coord_flip() + facet_wrap("nationality", nrow = 1, scales = "free") + labs(x = NULL, y = NULL) + theme(axis.text.y = element_blank(), axis.ticks.y = element_blank())
htmp <- dietswap %>% ps_mutate(nationality = as.character(nationality)) %>% tax_transform("log2", add = 1, chain = TRUE) %>% comp_heatmap( taxa = tax_top(dietswap, n = 30), grid_col = NA, name = "Log2p", taxon_renamer = function(x) stringr::str_remove(x, " [ae]t rel."), colors = heat_palette(palette = viridis::turbo(11)), row_names_side = "left", row_dend_side = "right", sample_side = "bottom", sample_anno = sampleAnnotation( Nationality = anno_sample_cat( var = "nationality", col = c(AAM = "grey35", AFR = "grey85"), box_col = NA, legend_title = "Nationality", size = grid::unit(4, "mm") ) ) ) ComplexHeatmap::draw( object = htmp, annotation_legend_list = attr(htmp, "AnnoLegends"), merge_legends = TRUE )
Ordination methods can also help you to visualize if overall microbial ecosystem composition differs markedly between groups, e.g. BMI.
Here is one option as an example:
tax_agg()
tax_transform()
ord_calc()
ord_plot()
# perform ordination unconstrained_aitchison_pca <- dietswap %>% tax_agg("Family") %>% tax_transform("clr") %>% ord_calc() # ord_calc will automatically infer you want a "PCA" here # specify explicitly with method = "PCA", or you can pick another method # create plot pca_plot <- unconstrained_aitchison_pca %>% ord_plot( plot_taxa = 1:6, colour = "bmi_group", size = 1.5, tax_vec_length = 0.325, tax_lab_style = tax_lab_style(max_angle = 90, aspect_ratio = 1), auto_caption = 8 ) # customise plot customised_plot <- pca_plot + stat_ellipse(aes(linetype = bmi_group, colour = bmi_group), linewidth = 0.3) + # linewidth not size, since ggplot 3.4.0 scale_colour_brewer(palette = "Set1") + theme(legend.position = "bottom") + coord_fixed(ratio = 1, clip = "off") # makes rotated labels align correctly # show plot customised_plot
You visualised your ordinated data in the plot above. Below you can see how to perform a PERMANOVA to test the significance of BMI's association with overall microbial composition. This example uses the Family-level Aitchison distance to correspond with the plot above.
# calculate distances aitchison_dists <- dietswap %>% tax_transform("identity", rank = "Family") %>% dist_calc("aitchison") # the more permutations you request, the longer it takes # but also the more stable and precise your p-values become aitchison_perm <- aitchison_dists %>% dist_permanova( seed = 1234, # for set.seed to ensure reproducibility of random process n_processes = 1, n_perms = 99, # you should use at least 999! variables = "bmi_group" ) # view the permanova results perm_get(aitchison_perm) %>% as.data.frame() # view the info stored about the distance calculation info_get(aitchison_perm)
You could visualise the effect of the (numeric/logical) variables in your permanova directly using the ord_plot
function with constraints (and conditions).
perm2 <- aitchison_dists %>% dist_permanova(variables = c("weight", "african", "sex"), seed = 321)
We'll visualise the effect of nationality and bodyweight on sample composition, after first removing the effect of sex.
perm2 %>% ord_calc(constraints = c("weight", "african"), conditions = "female") %>% ord_plot( colour = "nationality", size = 2.5, alpha = 0.35, auto_caption = 7, constraint_vec_length = 1, constraint_vec_style = vec_constraint(1.5, colour = "grey15"), constraint_lab_style = constraint_lab_style( max_angle = 90, size = 3, aspect_ratio = 0.8, colour = "black" ) ) + stat_ellipse(aes(colour = nationality), linewidth = 0.2) + scale_color_brewer(palette = "Set1", guide = guide_legend(position = "inside")) + coord_fixed(ratio = 0.8, clip = "off", xlim = c(-4, 4)) + theme(legend.position.inside = c(0.9, 0.1), legend.background = element_rect())
microViz heatmaps are powered by ComplexHeatmap
and annotated with taxa prevalence and/or abundance.
# set up the data with numerical variables and filter to top taxa psq <- dietswap %>% ps_mutate( weight = recode(bmi_group, obese = 3, overweight = 2, lean = 1), female = if_else(sex == "female", true = 1, false = 0), african = if_else(nationality == "AFR", true = 1, false = 0) ) %>% tax_filter( tax_level = "Genus", min_prevalence = 1 / 10, min_sample_abundance = 1 / 10 ) %>% tax_transform("identity", rank = "Genus") # randomly select 30 taxa from the 50 most abundant taxa (just for an example) set.seed(123) taxa <- sample(tax_top(psq, n = 50), size = 30) # actually draw the heatmap cor_heatmap( data = psq, taxa = taxa, taxon_renamer = function(x) stringr::str_remove(x, " [ae]t rel."), tax_anno = taxAnnotation( Prev. = anno_tax_prev(undetected = 50), Log2 = anno_tax_box(undetected = 50, trans = "log2", zero_replace = 1) ) )
:innocent: If you find any part of microViz useful to your work, please consider citing the JOSS article:
Barnett et al., (2021). microViz: an R package for microbiome data visualization and statistics. Journal of Open Source Software, 6(63), 3201, https://doi.org/10.21105/joss.03201
Bug reports, questions, suggestions for new features, and other contributions are all welcome. Feel free to create a GitHub Issue or write on the Discussions page.
This project is released with a Contributor Code of Conduct and by participating in this project you agree to abide by its terms.
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.