R/mbmixture User Guide
In mbmixture: Microbiome Mixture Analysis

library(knitr)
opts_chunk$set(fig.width=7, fig.height=4.5,
               dev.args=list(pointsize=16))
scipen <- getOption("scipen")
options(scipen=10)

R/mbmixture is an R package for evaluating whether a microbiome sample is the mixture of two source samples. We are thinking of shotgun sequencing data on the microbiome sample plus dense SNP genotype data on the two potential source samples. We assume that the data has been reduced to a three-dimensional array of read counts: the 3 possible SNP genotypes for the first sample × the 3 possible SNP genotypes of the second sample × the 2 possible SNP alleles on the reads.

The data are very specific, and it fits a very specific model.

It comes with a single example data set, mbmixdata. Let's load the package and the data.

library(mbmixture)
data(mbmixdata)

The data set is a three-dimensional array, 3×3×2, with read counts at SNPs broken down by the SNP genotype for the first sample, the SNP genotype for the second sample, and the SNP allele in the read. It seeks to fit a multinomial model with two parameters: p is the "contaminant probability" (the proportion of the microbiome reads coming from the second sample), and e is the sequencing error rate.

Here's a view of the data:

mbmixdata

The function mbmix_loglik is not generally called by the user, but calculates the log likelihood for particular values of p and e.

mbmix_loglik(mbmixdata, p=0.74, e=0.002)

The function mle_pe is the key one. It obtains the MLEs of p and e.

(mle <- mle_pe(mbmixdata))

So for these data, the estimate of p is r round(mle["p"], 2), meaning that r round(mle["p"]*100)% of the microbiome sample came from the second sample (and r round(100-mle["p"]*100)% came from the first sample). The estimate of the sequencing error rate is r round(mle["e"], 3).

The mle_pe() function also returns likelihood ratio test statistics (twice the natural log of the likelihood ratio) for tests of p=0 and p=1. Here, they're both extremely large.

If you use the argument SE=TRUE, you'll get estimated standard errors as an attribute.

(mle_w_SE <- mle_pe(mbmixdata, SE=TRUE))

You can grab the SEs using the attr() function.

attr(mle_w_SE, "SE")

There's also a function to derive estimated standard errors via a parametric bootstrap.

bootstrap_se <- bootstrapSE(mbmixdata, 1000)

c(p = 0.000351360951177482, err = 0.0000272304773558442)

There are also functions to calculate the MLE of p for a fixed value for e, and vice versa. They return the estimate, with the log likelihood as an attribute.

mle_p(mbmixdata, e=0.002)
mle_e(mbmixdata, p=0.74)

options(scipen=scipen)

Any scripts or data that you put into this service are public.

mbmixture documentation built on April 4, 2025, 1:44 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

mbmixture
Microbiome Mixture Analysis

R/mbmixture User Guide
In mbmixture: Microbiome Mixture Analysis

Try the mbmixture package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

mbmixture Microbiome Mixture Analysis

R/mbmixture User Guide In mbmixture: Microbiome Mixture Analysis

Try the mbmixture package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

mbmixture
Microbiome Mixture Analysis

R/mbmixture User Guide
In mbmixture: Microbiome Mixture Analysis