BEEM is an approach to infer models for microbial community dynamics based on metagenomic sequencing data (16S or shotgun-metagenomics). It is based on the commonly used generalized Lotka-Volterra modelling (gLVM) framework. BEEM uses an iterative EM algorithm to simultaneously infer scaling factors (microbial biomass) and model parameters (microbial growth rate and interaction terms) from longitudinal data and can thus work directly with the relative abundance values that are obtained with metagenomic sequencing.
Note: BEEM stands for Biomass Estimation and model inference with an Expectation Maximization algorithm. We have now extended the BEEM framework to be able to work with cross-sectional data (BEEM-static, check out our R package here).
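To make the role of the scaling factor concrete, here is a minimal sketch of the generalized Lotka-Volterra equations in their standard form, dx_i/dt = x_i * (mu_i + sum_j beta_ij * x_j). The function name `glv_rhs`, the convention that `beta[i, j]` is the effect of taxon j on taxon i, and all numbers are illustrative assumptions and not part of the BEEM package.

```r
# Standard gLV right-hand side (hypothetical helper, not a BEEM function):
# dx_i/dt = x_i * (mu_i + sum_j beta[i, j] * x_j)
glv_rhs <- function(x, mu, beta) {
  as.vector(x * (mu + beta %*% x))
}

# Made-up parameters for two taxa
mu <- c(1.0, 0.5)                          # growth rates
beta <- matrix(c(-1.0,  0.2,
                 -0.3, -0.5),
               nrow = 2, byrow = TRUE)     # interaction terms

# Absolute abundances are what the gLV equations operate on
x <- c(0.8, 1.2)
dxdt <- glv_rhs(x, mu, beta)

# Sequencing only yields relative abundances, so the total biomass is lost
biomass <- sum(x)
rel <- x / biomass

# With the scaling factor known, absolute abundances can be recovered
x_recovered <- rel * biomass
```

Because sequencing reports only `rel`, the scaling factor `biomass` has to be estimated alongside `mu` and `beta`, which is the problem the EM procedure addresses.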
BEEM was written in R (>=3.3.1) and requires the following packages:
- foreach
- doMC (currently works only on macOS and Linux)
- lokern
- pspline
- monomvn
You can install BEEM as an R package using devtools:
devtools::install_github('csb5/beem')
The input files for BEEM should have the same format as described in the manual for MDSINE. The following two files are required by BEEM:
The OTU table should be a tab-delimited text file whose first row contains the sample IDs and whose first column contains the OTU IDs (or taxonomic annotations). Each row should then contain the relative abundance of one OTU across all samples, and each column should contain the relative abundances of all OTUs in that sample.
The metadata file should be a tab-delimited text file with the following columns:
sampleID isIncluded subjectID measurementID
- sampleID: sample IDs matching the first row of the OTU table
- isIncluded: whether the sample should be included in the analysis (1 = include, 0 = exclude)
- subjectID: indicator for which biological replicate the sample belongs to
- measurementID: time in standardized units from the start of the experiment

We have provided several sample input files that were also analyzed in our manuscript:
vignettes/props_et_al_analysis/counts.sel.txt
vignettes/props_et_al_analysis/metadata.sel.txt
vignettes/gibbons_et_al_analysis/{DA,DB,M3,F4}.counts.txt
vignettes/gibbons_et_al_analysis/{DA,DB,M3,F4}.metadata.txt
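Before running the analysis it can help to confirm that the two inputs agree with the layout described above. The following is a rough sketch on made-up toy data. The helper `check_inputs` is hypothetical (it is not a BEEM function), and it assumes the OTU table is held in memory with the sample IDs as column names.

```r
# Toy OTU table: rows are OTUs, columns are samples (made-up values)
counts <- data.frame(
  s1 = c(0.5, 0.3, 0.2),
  s2 = c(0.4, 0.4, 0.2),
  s3 = c(0.6, 0.1, 0.3),
  row.names = c("OTU1", "OTU2", "OTU3")
)

# Toy metadata with the four documented columns
metadata <- data.frame(
  sampleID      = c("s1", "s2", "s3"),
  isIncluded    = c(1, 1, 0),
  subjectID     = c(1, 1, 1),
  measurementID = c(0, 1, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stops with a message if the inputs are inconsistent
check_inputs <- function(counts, metadata) {
  required <- c("sampleID", "isIncluded", "subjectID", "measurementID")
  missing_cols <- setdiff(required, colnames(metadata))
  if (length(missing_cols) > 0) {
    stop("metadata is missing columns: ", paste(missing_cols, collapse = ", "))
  }
  unmatched <- setdiff(metadata$sampleID, colnames(counts))
  if (length(unmatched) > 0) {
    stop("sample IDs not found in OTU table: ", paste(unmatched, collapse = ", "))
  }
  if (!all(metadata$isIncluded %in% c(0, 1))) {
    stop("isIncluded must be 0 or 1")
  }
  invisible(TRUE)
}

ok <- check_inputs(counts, metadata)
```

A check like this catches mismatched sample IDs early, before they surface as harder-to-read errors during model fitting.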
## Load functions
library(beem)
## Read inputs
counts <- read.table('counts.txt', head=F, row.names=1)
metadata <- read.table('metadata.txt', head=T)
## Run BEEM
res <- EM(dat=counts, meta=metadata)
## Estimate parameters
biomass <- biomassFromEM(res)
write.table(biomass, 'biomass.txt', col.names=F, row.names=F, quote=F)
gLVparameters <- paramFromEM(res, counts, metadata)
write.table(gLVparameters, 'gLVparameters.txt', col.names=T, row.names=F, sep='\t' , quote=F)
The parameters estimated by BEEM are returned as an R data.frame (a table) with the following columns, in order:
- parameter_type: growth_rate or interaction
- source_taxon: source taxon for the interaction (NA if parameter_type is growth_rate)
- target_taxon: target taxon for the interaction or growth rate
- value: parameter value
- significance: confidence level of the inferred interaction (only meaningful for interactions)

The commands for reproducing the analysis reported in the manuscript are presented as Jupyter notebooks: (1) a demo of the gLVM simulation, (2) the analysis of Props et al. and (3) the analysis of Gibbons et al.
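As an illustration of working with the parameter table described above, the sketch below splits a table into growth rates and interactions and reshapes the interactions into a taxon-by-taxon matrix. The table is built by hand with made-up values (including placeholder significance entries), so it only mirrors the documented column layout and is not actual BEEM output.

```r
# Hand-made table that follows the documented column order (values are invented)
gLVparameters <- data.frame(
  parameter_type = c("growth_rate", "growth_rate", "interaction", "interaction"),
  source_taxon   = c(NA, NA, "OTU1", "OTU2"),
  target_taxon   = c("OTU1", "OTU2", "OTU2", "OTU1"),
  value          = c(0.8, 0.5, -0.3, 0.1),
  significance   = c(NA, NA, 0.95, 0.40),   # placeholder numbers
  stringsAsFactors = FALSE
)

# Separate the two parameter types
growth       <- subset(gLVparameters, parameter_type == "growth_rate")
interactions <- subset(gLVparameters, parameter_type == "interaction")

# Build an interaction matrix: rows are target taxa, columns are source taxa
taxa <- unique(gLVparameters$target_taxon)
beta <- matrix(0, nrow = length(taxa), ncol = length(taxa),
               dimnames = list(target = taxa, source = taxa))
beta[cbind(interactions$target_taxon, interactions$source_taxon)] <- interactions$value

# Named vector of growth rates, in the same taxon order as the matrix
mu <- setNames(growth$value, growth$target_taxon)[taxa]
```

The matrix form is convenient for plotting interaction networks or for feeding the parameters into a gLV simulation.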
C Li, K R Chng, J S Kwah, T V Av-Shalom, L Tucker-Kellogg & N Nagarajan. (2019). An expectation-maximization algorithm enables accurate ecological modeling using longitudinal metagenome sequencing data. Microbiome.
Please direct any questions or feedback to Chenhao Li (cli40@mgh.harvard.edu) and Niranjan Nagarajan (nagarajann@gis.a-star.edu.sg).