From the paper "Yeasts Dominate Soil Fungal Communities In Three Lowland Neotropical Rainforests"
The data can be downloaded at https://onlinelibrary.wiley.com/doi/abs/10.1111/1758-2229.12575
library(tidyverse) library(here) library(cowplot) theme_set(theme_cowplot()) devtools::load_all() doParallel::registerDoParallel(parallel::detectCores())
Downloading data is preferably automatic, but can be replaced with instructions on how to download the data yourself.
main_data_file <- here("local_data", "Dunthorn2017.txt") if(!file.exists(main_data_file)) { download.file("https://sfamjournals.onlinelibrary.wiley.com/action/downloadSupplement?doi=10.1111%2F1758-2229.12575&file=emi412575-sup-0001-suppinfo1.txt", main_data_file) }
The important part - read and format the data so that rows are observations (biological samples) and columns are OTUs. Also gather basic information about the sampples (I call this "mapping" but maybe not a great name).
otus_raw <- read.csv(main_data_file, stringsAsFactors = FALSE, sep = "\t") otu_tab <- otus_raw %>% select(B005_B006:T199_T200) %>% as.matrix() %>% t() otu_tab <- otu_tab[rowSums(otu_tab) > 2000 & rownames(otu_tab) != "B005_B006",] #Mapping hardcoded, could not bother parsing their file mapping <- tibble(Sample = rownames(otu_tab)) %>% mutate(Origin = if_else(grepl("^B", Sample), "Panama", if_else(grepl("^L", Sample), "Costa Rica", "Ecuador")))
A custom plot function that wraps plot_mds
and fixes some aesthetics that are reused - completely unnecessary, just convenient.
plot_mds_dunthorn <- function(base_mds, mapping, aligned_samples = NULL, show_paths = 0.5, ...) { plot_mds(base_mds, mapping, aligned_samples = aligned_samples, color_aes = Origin, shape_aes = Origin, show_paths = show_paths, sample_point_alpha = 0.2) }
Basic MDS redoing the figure from the paper
set.seed(20190514) base_mds <- otu_tab %>% metaMDS(trymax = 400, maxit = 500, trace = 0) #base_mds$points plot_mds_dunthorn(base_mds, mapping)
Running the Dirichlet-multinomial bootstrap
#Here, group_column needs to be filled, it is needed for the connectivity metric (not directly using right now, but it is of interest). group_column should be the main factor in the data. sens <- mds_sensitivity_check(otu_tab, mapping, group_column = Origin, trace = 0) plot.mds_sensitivity(sens,color_aes = Origin, shape_aes = Origin, sample_point_alpha = 0.2) sens$connectivity_stats sens$consistency_stats
Additional analyses that are not needed now
sens_jackknife <- mds_sensitivity_check(20, otu_tab, mapping, group_column = Origin, trace = 0, trymax = 50, sampling_func = sample_posterior_jackknife_observations) plot(sens_jackknife,color_aes = Origin, shape_aes = Origin, sample_point_alpha = 0.2)
mds_samples_rarefy <- sample_posterior_rarefy(20, otu_tab) %>% metaMDS_per_sample(try = 200, maxit = 500) %>% purrr::map(vegan::procrustes, X = base_mds) plot_mds_dunthorn(base_mds, mapping, aligned_samples = mds_samples_rarefy)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.