user=Sys.info()[["user"]]
require(rmarkdown)
require(markdown)
library(ade4)
library(RColorBrewer)
library(BiotypeR)
library(gridExtra)
library(reshape2)
library(vegan)
library(knitr)
library(ggplot2)
library(scales)
library(dplyr)
#source("src/mclapply.hack.R")
opts_chunk$set(fig.width=12, fig.height=8, warning=FALSE, 
message=FALSE, echo=FALSE, dev=c('png', 'pdf', 'win.metafile', 'tiff'), 
fig.cap=NULL, cache=TRUE, dpi=300)

Abstract

Gut microbiota richness and stability seems to be an important parameter in host-microbe symbiosis. Diet might be one way to maintain or restore high richness in the gut microbiota notably using dietary fibers. However, the impact of diet on the stability and richness of human gut microbiota is not fully characterized. In this work, through a six week nutritional trial, 19 healthy adults consumed a basal diet supplemented with 10 or 40 g dietary fiber per day for five days, followed by 15-day washout periods. Fecal samples were collected at four time points and were analyzed by a combination of 16S rRNA gene targeted pyrosequencing, intestinal cell genotoxicity and short chain fatty analysis. Metatranscriptomic sequencing approach was also done for a subgroup of subjects. Whatever the taxonomic levels studied, changing dietary fiber amount did not have the same impact for all individuals but remained significant within each individual gut microbiota at genus, family and phylum levels. Higher gut microbiota richness is associated with higher gut microbiota stability to marked changes in dietary fiber intake which is also accompanied by the modulation of the expression of numerous gut microbiota metabolic pathways like glycan and methanogenesis metabolism. Higher microbial richness is also associated with higher proportions of Prevotella, Coprococcus and Dorea species and higher levels of caproate and valerate. Microbiota richness was not associated with intestinal cell genotoxicity and this was not modulated by diet. This study provides new insights on the impact of gut microbiota on dietary modulation of host-microbe symbiosis.

Study design

The aim of this study was to revaluate the diet change using different fibers intake on human health, using an integrated approach to characterize gut microbiome composition and activity. We undertook a randomized nutritional intervention in 19 healthy adults, with controlled normal diets providing 10 then 40g or 40 then 10g natural fibers per day. Four time points were collected per individual and random duplicates were added to the dataset. The microbiota composition was assessed by quantitative PCR and 454 pyrosequencing. The microbial activity was measured by metatranscriptomics. The bacterial fermentation was evaluated by short chain fatty acid analysis. In addition, the impact of the diet on host DNA damage was analysed using the standard assay, the Comet assay on HT29 cell line.


**Fig. S1.**

Fig. S1: Cross over nutritional intervention of this study. Each disc represents a collection point. The blue group was composed of nine individuals, who first had 40g fiber/day; the orange group of 10 individuals who started with 10g fiber/day. During the 15-day washout period, the individuals were on their usual personal diet without any constraint.


Read and Preprocessing data

Replicates sequencing analysis

library(AlimIntest)
data(sanger_454)
data(replicat_454)

sanger_454_meta             = unique(sanger_454[1:3])[order(unique(sanger_454[1:3])$samples),]
rownames(sanger_454_meta)   = 1:dim(sanger_454_meta)[1]
replicat_454_meta           = unique(replicat_454[1:2])[order(unique(replicat_454[1:2])$samples),]
rownames(replicat_454_meta) = 1:dim(replicat_454_meta)[1]


sanger_454$tax[which(sanger_454$conf < 0.8)]     <- "Unassigned"
replicat_454$tax[which(replicat_454$conf < 0.8)] <- "Unassigned"


replicat_454.freq = replicat_454 %>% 
                    group_by(samples,manufacturer,tax) %>% 
                    summarise (n = n()) %>% 
                    mutate(freq = n / sum(n))


sanger_454.freq   = sanger_454 %>% 
                    group_by(samples,technology,run,tax) %>% 
                    summarise (n = n()) %>% 
                    mutate(freq = n / sum(n))

sanger_454.cast   = dcast(sanger_454.freq, tax ~ samples, fill = 0)
sanger_454.melt   = melt(sanger_454.cast, id.vars=c("tax","A"))

replicat_454.cast = dcast(replicat_454.freq, tax + samples ~ manufacturer, fill=0 )

sanger_454.plot = ggplot(sanger_454.melt, aes(x=A+10^-5, y=value+10^-5, group=variable)) + 
                  geom_point(aes(color=variable), size=2, alpha=0.8) + 
                  scale_x_log10("Sanger method",limits=c(10^-5,1)) + 
                  scale_y_log10("454 GS FLX Ti method",limits=c(10^-5,1)) + 
                  theme_bw() + geom_abline(slope=1, linetype = 2) + 
                  scale_color_discrete("454 method run",labels=sanger_454_meta$run[-1] )



replicat_454.plot = ggplot(replicat_454.cast, aes(x=A, y=B)) + 
                    geom_point(aes(color=samples), size=2, alpha=0.8) +
                    scale_x_log10("454 GS FLX Ti Manufacturer A",limits=c(10^-5,1)) + 
                    scale_y_log10("454 GS FLX Ti Manufacturer B",limits=c(10^-5,1)) + 
                    theme_bw() + geom_abline(slope=1, linetype = 2)

grid.arrange(replicat_454.plot, sanger_454.plot, ncol=2)

Fig. S4: Technical replicates comparisons based on dominant genera relative log10 proportion. (Left panel) Eight replicates pyrosequenced by two different companies were compared and show high significant correlation based on genera distribution (rho=0.8, p<0.05, spearman method). (Right panel) A sample sequenced by both Sanger and pyrosequencing methods with different lane and MID configurations shows the same trend.


Whole 454 reads dataset processing

Sequences reads data were processed with Lotus with at least 200 nucleotides as quality filter to build OTU and 100 pb to remap reads against OTU representatives sequences. More information about Lotus documentation could be found here : http://psbweb05.psb.ugent.be/lotus/documentation.html


In order to have a overview, here is some samples of metadata used which include study design parameters like "Diet run" with also SCFA and quantitative PCR assays.

study design

data(alimintestData)
kable(format(head(alimintestData$metadata[1:6])))

SCFA assays

kable(format(head(alimintestData$metadata[7:14])))

qPCR and comet assays

kable(format(head(alimintestData$metadata[15:23])))

Numerical Ecology

Here we calculated the microbial richness (alpha-diversity) at the same sequencing depth for each sample (n = 1000 reads). Rarefaction curves analysis was also be done for each sample.


alimintestData$metadata$richness       = rep(NA,76)
alimintestData$metadata$richness = round(rarefy(t(alimintestData$otu), 1000))
richness_diet_run = ggplot(alimintestData$metadata, aes(x=Diet_run, y=richness)) + geom_boxplot(aes(fill=Diet_run)) + facet_grid(.~Time.point) + scale_fill_manual(values=c("orange","blue")) + theme_bw()
richness_diet_run

Fig. S2 (top panel). Gut microbiota richness and composition assessed by 454 and qPCR as function of diet over time. 454 pyrosequencing was used to assess the microbial richness and qPCR was used to quantify major microbial group overtime including the basal time. The major microbial group quantification was normalized using the all bacteria qPCR assay.


There is no significant changes of gut microbial richness overtime.


res  = round(rarefy(t(alimintestData$otu), seq(10,1000,10)))
colnames(res) = seq(10,1000,10)
res  = cbind(res,alimintestData$metadata[,1:5])
res2 = melt(res, id.vars=c(colnames(alimintestData$metadata[,1:5])))

res2$variable   = as.numeric(as.character(res2$variable))
res2$Time.point = paste('Time point',res2$Time.point, sep="")

ggplot(res2, aes(x=variable, y=value, group=Sample_ID)) +
 geom_line(aes(color=Diet_run)) + scale_color_manual("",values=c("orange","blue")) +
 facet_wrap(~ Time.point, ncol=2) + theme_bw() + xlab("Number of sequences sampled") + ylab("nb of OTUs")

Fig. S5: Rarefaction analysis of the pyrosequencing data by diet time point. Number of unique OTUs identified in the 76 samples relative to the sequencing depth. For time points and nutritional study scheme see Fig. S1.


Quantification of individuality and diet factor


a = alimintestData$metadata[c(3,5,15:21)]
b = alimintestData$basalqpcr[c(3,5,6:12)]

colnames(a) = colnames(b)

df = rbind(a,b)


qPCR_names=c("log_Cleptum_group","log_Bcoccoides_group","log_Bacteroides_Prevotella","log_Bifidobacteria","log_Ecoli","log_Lactobacillus")

for(i in qPCR_names) {

  df[,i] = df[,i] - df$log_all_Bacteria
}

colnames(df)[4:9] = c("Cleptum_group","Bcoccoides_group","Bacteroides_Prevotella","Bifidobacteria","Ecoli","Lactobacillus")

df = melt(df, id.vars=c("Diet_run","Time.point"))

qPCR_raw_boxplot=ggplot(df, id.vars=c("Time.point","Diet_run")) + geom_boxplot(aes(y=value,x=Diet_run, fill=Diet_run)) + facet_grid(variable~Time.point, scales="free") + scale_fill_manual(values=c("orange","blue")) + theme_bw()

qPCR_raw_boxplot

Fig. S2 (bottom panel). Gut microbiota richness and composition assessed by 454 and qPCR as function of diet over time.


grid.arrange(richness_diet_run + theme(legend.position = "none") + xlab(""), qPCR_raw_boxplot + xlab("") + theme(legend.position = "none") + ylab("qPCR assays"), heights=c(2/8, 6/8))

A between class analysis is done to quantify individual and diet effect on global variation. Microbiota summaries were done after normalisation at phylum, class, order, family, genus and OTU levels using 454 pyrosequencing sequencing. Inertia attributed to diet and host was then challenged against Monte Carlo simulated dataset.

otu_norm = prop.table(as.matrix(alimintestData$otu),2)

microbiota_tax = list(
    phylum_tax = apply(otu_norm, 2, tapply, alimintestData$tax$phylum, sum ),
    class_tax  = apply(otu_norm, 2, tapply, alimintestData$tax$class, sum ),
    order_tax  = apply(otu_norm, 2, tapply, alimintestData$tax$order, sum ),
    family_tax = apply(otu_norm, 2, tapply, alimintestData$tax$family, sum ),
    genus_tax  = apply(otu_norm, 2, tapply, alimintestData$tax$genus, sum ),
    otu_norm   = noise.removal(otu_norm, percent=1)
)

metadata76            = alimintestData$metadata
metadata76$Time.point = as.factor(metadata76$Time.point)

microbiota_tax_variations=alimintest_metadata_pca=sapply(microbiota_tax, function(x) {

x                  = as.data.frame(t(log10(x + 10^-5)))
x.pca              = dudi.pca(x,    scannf=FALSE, nf=2)
x.bca.ind          = bca(x.pca,     scannf=FALSE, nf=2, fac=as.factor(metadata76$Subject_Id))
x.wca.ind          = bca(x.pca,     scannf=FALSE, nf=2, fac=as.factor(metadata76$Subject_Id))
x.bca.diet         = bca(x.pca,     scannf=FALSE, nf=2, fac=as.factor(metadata76$Diet_run:metadata76$Time.point))
#x.wca.ind.bca.diet = bca(x.wca.ind, scannf=FALSE, nf=2, fac=as.factor(metadata76$Diet_run:metadata76$Time.point))
x.wca.ind.bca.diet = bca(wca(dudi.pca(x, scannf=F, nf=2), scannf=F, nf=2, fac=as.factor(metadata76$Subject_Id)), scannf=F, nf=2, fac=as.factor(metadata76$Diet_run:metadata76$Time.point) )

x.bca.ind.res          = randtest(x.bca.ind) 
x.bca.diet.res         = randtest(x.bca.diet) 
x.wca.ind.bca.diet.res = randtest(x.wca.ind.bca.diet)

res = data.frame(ind.obs        = x.bca.ind.res$obs*100,
                 ind.sim        = quantile(x.bca.ind.res$sim, 0.95)[[1]]*100,
                 diet.obs       = x.bca.diet.res$obs*100,
                 diet.sim       = quantile(x.bca.diet.res$sim,0.95)[[1]]*100,
                 diet.w.ind.obs = x.wca.ind.bca.diet.res$obs*100,
                 diet.w.ind.sim = quantile(x.wca.ind.bca.diet.res$sim, 0.95)[[1]]*100
                 )
return(res)
}
)

m           = data.frame(matrix(unlist(microbiota_tax_variations), nrow = nrow(microbiota_tax_variations), byrow=F))
colnames(m) = colnames(microbiota_tax_variations)
rownames(m) = rownames(microbiota_tax_variations)

microbiota_tax_variations = m

m          = melt(data.frame(microbiota_tax_variations, id=rownames(microbiota_tax_variations)), id="id", value.name = "Inertia")
m$DataType = rep(c("observed","MC simulated\n(95th percentile)"),18)
m$Test     = rep(rep(c("Between Subject","Between Diet","Between Diet\nwithin Subject"),each=2),6)


ggplot(m, aes(x=variable, y=Inertia, shape=Test, col=DataType)) + geom_point(size=3, alpha=0.7) + xlab("") + ylab("Inertia attributed (%)") + scale_x_discrete(label=c("phylum","class","order","family","genus","OTU")) + theme_bw()

Fig. 1: Gut microbiota inertia variations according to diet and subjects at different taxonomic levels. Between class analysis was used to measure the inertia using diet and individual alternatively as instrumental variable. Monte Carlo (MC) simulated inertia (n=1000 simulation) was used to assess the significance of inertia attributed to different factors. When observed inertia is higher to the 95th percentile of MC simulated inertia then it is considered as significant.


When the observed inertia is above the 95th percentile of Monte Carlo (MC) simulated inertia, the effect was considered as significant.

Richness and comet vs Microbial changes

We tested if microbial changes over-time is associated with microbial richness and genotoxicity (comet assays). We stratified observations into groups corresponding to samples taking after 10g fiber diet, 40g fiber diet and usual diet (corresponding to the washout period). Microbial changes was evaluated using the JSD distance metrics.

genus.jsd = melt(as.matrix(BiotypeR::dist.JSD(microbiota_tax$genus_tax)))

microbial.change=NULL

metadata76$Subject_Id = as.factor(as.character(metadata76$Subject_Id))


for (s in levels(metadata76$Subject_Id)) {

        t2 = metadata76$Sample_ID[which(metadata76$Subject_Id==s & metadata76$Time.point=="2")]
        t3 = metadata76$Sample_ID[which(metadata76$Subject_Id==s & metadata76$Time.point=="4")]
        t4 = metadata76$Sample_ID[which(metadata76$Subject_Id==s & metadata76$Time.point=="3")]
        t5 = metadata76$Sample_ID[which(metadata76$Subject_Id==s & metadata76$Time.point=="5")]


        jsd=c(
        genus.jsd[which(genus.jsd[,1]==t2 & genus.jsd[,2]==t3),3],
        genus.jsd[which(genus.jsd[,1]==t3 & genus.jsd[,2]==t4),3],
        genus.jsd[which(genus.jsd[,1]==t4 & genus.jsd[,2]==t5),3]
        )

        richness.before=c(metadata76$richness[metadata76$Sample_ID==t2],
        metadata76$richness[metadata76$Sample_ID==t3],
        metadata76$richness[metadata76$Sample_ID==t4])

        richness.after=c(
        metadata76$richness[metadata76$Sample_ID==t3],
        metadata76$richness[metadata76$Sample_ID==t4],
        metadata76$richness[metadata76$Sample_ID==t5]
        )

        comet.after=c(
        metadata76$Comet_assay[metadata76$Sample_ID==t3],
        metadata76$Comet_assay[metadata76$Sample_ID==t4],
        metadata76$Comet_assay[metadata76$Sample_ID==t5]
        )


        diet=metadata76$Diet_run[which(metadata76$Subject_Id==s)][1]

        tmp=data.frame(Subject_ID=rep(s,3),change=c("2_3","3_4","4_5"),diet=rep(diet,3),jsd=jsd, richness.before, richness.after, comet.after)

        microbial.change=rbind(microbial.change,tmp)

}

METHOD="spearman"

cor.test(log10(microbial.change$jsd), microbial.change$richness.before, method=METHOD)
cor.test(log10(microbial.change$jsd), microbial.change$comet.after, method=METHOD)

There was a significant link between microbial change and the estimated microbial richness before. In others words, microbial richness was predictive of microbial change over time. There is no significant association between comet assays and microbial richness. Microbial richness seems independent of the microbial change during the 10g fiber/day diet.


group         = as.factor(microbial.change$change:microbial.change$diet)
levels(group) = c("after 10g fiber/day","after 40g fiber/day", "usual diet (washout)", 
                  "usual diet (washout)", "after 40g fiber/day","after 10g fiber/day")

richness_vs_jsd = ggplot(data.frame(microbial.change,group), aes(x=log10(jsd), y = richness.before)) +
                  geom_point(aes(colour = diet), cex = 4) + stat_smooth(method = "lm") + 
                  scale_colour_manual(values = c("orange","blue")) +
                  facet_grid(.~group) + 
                  xlab("microbiota change\n(estimated by log10 JSD distance)") +
                  ylab("Microbial richness") +
                  theme_bw()

richness_vs_jsd

Fig. 2: Gut microbiota change over time associated with OTU richness. Microbial change within subject was assessed using the JSD distance metrics after 10g fiber per day, 40g fiber per day and the usual diet washout period. Microbial richness account for the number of OTUs rarefied at the same levels of sequences for each sample before the diet. The two diet runs, 10 to 40g and 40 to 10g, were pictured by red and blue dots respectively. Spearman correlation showed significant association between richness before diet and microbiota change after 40g fiber per day.


Microbiota variations measured by JSD metrics is not associated with diet change

pairwise.wilcox.test(data.frame(microbial.change,group)$jsd,group,
 p.adjust="none", paired=TRUE)
pairwise.wilcox.test(data.frame(microbial.change,group)$jsd,data.frame(microbial.change,group)$change,
 p.adjust="none", paired=TRUE)
ggplot(data.frame(microbial.change,group), aes(x=log10(jsd), y=comet.after)) +
  geom_point(aes(colour=diet), cex=4) + stat_smooth(method = "lm") +
  xlab("microbiota change\n(estimated by log10 JSD distance)")  +
  ylab("comet assays") + theme_bw() + facet_grid(.~group)
x1 = cor.test(log10(microbial.change$jsd[which(group=="after 10g fiber/day")]),  
microbial.change$richness.before[which(group=="after 10g fiber/day")], method=METHOD)
y1 = cor.test(log10(microbial.change$jsd[which(group=="after 40g fiber/day")]),  
microbial.change$richness.before[which(group=="after 40g fiber/day")], method=METHOD)
z1 = cor.test(log10(microbial.change$jsd[which(group=="usual diet (washout)")]),  
microbial.change$richness.before[which(group=="usual diet (washout)")], method=METHOD)

x2 = cor.test(log10(microbial.change$jsd[which(group=="after 10g fiber/day")]),  
microbial.change$comet.after[which(group=="after 10g fiber/day")], method=METHOD)
y2 = cor.test(log10(microbial.change$jsd[which(group=="after 40g fiber/day")]),  
microbial.change$comet.after[which(group=="after 40g fiber/day")], method=METHOD)
z2 = cor.test(log10(microbial.change$jsd[which(group=="usual diet (washout)")]),  
microbial.change$comet.after[which(group=="usual diet (washout)")], method=METHOD)

x1 = paste(round(x1$estimate[[1]],2), " (",round(x1$p.value,3),")", sep="")
y1 = paste(round(y1$estimate[[1]],2), " (",round(y1$p.value,3),")", sep="")
z1 = paste(round(z1$estimate[[1]],2), " (",round(z1$p.value,3),")", sep="")

x2 = paste(round(x2$estimate[[1]],2), " (",round(x2$p.value,3),")", sep="")
y2 = paste(round(y2$estimate[[1]],2), " (",round(y2$p.value,3),")", sep="")
z2 = paste(round(z2$estimate[[1]],2), " (",round(z2$p.value,3),")", sep="")

res = t(matrix(c(x1,x2,y1,y2,z1,z2), ncol=3))

colnames(res) = c("Richness before diet","comet assays after diet")
rownames(res) = c("10g","40g","usual")

res

Associations between Microbial change and richness before diet and comet assays after diet. Associations were assessed by Spearman rho correlation and p.value were indicated in parenthesis.

kable(format(res))

Gut microbiota richness and SCFA

Spearman rho correlation between microbiota richness, bacterial genera proportions and SCFAs

cor(metadata76$richness[-c(50,70)], metadata76[-c(50,70),7:14], method="spearman")

cor(metadata76$richness,as.data.frame(t(log10((microbiota_tax$genus_tax[-1,]) + 10^-5))), method="spearman")

Using univariate statistics, correlation are quite low and non significant. Coinertia analysis allow us to to test the multivariate associations between SCFAs profile and microbiota genera.

genus.scfa.coi = coinertia(
                 dudi.pca(na.omit(metadata76[7:14]), scannf=F, nf=7), 
                 dudi.pca(as.data.frame(t(log10(noise.removal(microbiota_tax$genus_tax, percent=1)[,-attr(na.omit(metadata76[7:14]),"na.action")] + 10^-5))), scannf=F, nf=7),
                 scannf=F, nf=7
        )

test=randtest(genus.scfa.coi,9999)


dd=data.frame(genus.scfa.coi$lX, metadata76[-c(50,70),], as.data.frame(t(log10(noise.removal(microbiota_tax$genus_tax, percent=1)[,-attr(na.omit(metadata76[7:14]),"na.action")] + 10^-5))))

SCFA and microbiota composition at genus levels shared a co-structured inertia (RV = r paste(round(test$obs*100,2), " %"), p = r round(test$pvalue,3)) supported by a significant Monte Carlo test.


scfa.richness.cor = data.frame(genus.scfa.coi$co[1:2], richness.cor=t(cor(metadata76$richness[-c(50,70)], metadata76[-c(50,70),7:14], method="spearman")))

genus.richness.cor = data.frame(genus.scfa.coi$li[-1,1:2],

richness.cor=cor(metadata76$richness,as.data.frame(t(log10((microbiota_tax$genus_tax[-1,]) + 10^-5))), method="spearman")[,row.names(genus.scfa.coi$li[-1,1:2])])

colnames(genus.richness.cor) = colnames(scfa.richness.cor)

genus.scfa.richness.cor = rbind(scfa.richness.cor, genus.richness.cor)

p3 = ggplot(genus.scfa.richness.cor) + geom_text(aes(x=Comp1, y=Comp2, label=row.names(genus.scfa.richness.cor), col=richness.cor)) + xlab("PC1 loadings") + ylab("PC2 loadings")

p3 + scale_colour_gradient("Spearman correlation\nwith microbiota richness", limits=c(-0.6, 0.6), low="red", high="green", space="Lab")

Fig. S7: Global association between microbial OTU richness, SCFA and bacterial genera using co-inertia analysis. The two first principal component loading from the co-inertia analysis between SCFA and microbial genera are plotted. Color gradient account for spearman correlation strengh between SCFA and bacterial genera with microbial OTU richness.


From the co-inertia analysis, we observed that i) SCFAs profile are significantly associated with bacteria genera proportions ii) and this associations is correlated with bacterial richness. For example, caproate, valerate, Prevotella, Coprococcus are positively associated with richness while butyrate, propionate, acetate and Bacteroides are negatively associated with richness. We choose those observations to build the main figure below. The first component of the co-inertia analysis (PC1) represent well those associations.


p1 = ggplot(dd, aes(x = AxcX1, y = richness)) +
     geom_point(aes(size = richness, colour = caproate+valerate), alpha = 0.8) +
     scale_colour_gradient("Caproate and\nValerate (mM)",low = "blue", high = "red", space = "Lab") +
     xlab("PC1") + ylab("Microbial richness") #+ xlim( -4,3) 

p2 = ggplot(dd, aes(x = AxcX1, y = richness)) +
     geom_point(aes(size = richness, colour = Prevotella-Bacteroides), alpha = 0.8) +
     scale_colour_gradient("log10 of\nPrevotella/\nBacteroides\nratio",low = "blue", high = "red", space = "Lab") +
     xlab("PC1") + ylab("Microbial richness") #+ xlim( -4,3) 

grid.arrange(p1+theme_bw(),p2+theme_bw(), ncol=2)

Fig. 3: Association between microbial OTU richness, SCFA and bacterial genera. The first principal component (PC1) from the co-inertia analysis between SCFA and microbial genera was associated significantly with microbial richness. Circle size account for OTU richness and color gradient account for caproate and valerate amount (left panel) and Prevotella/Bacteroides ratio (right panel)


Conclusions

This study provided the following information:

We conclude that gut bacterial richness have to be taken in account for future nutritional intervention.



tapj/AlimIntest documentation built on May 31, 2019, 3:48 a.m.