mbecPN: Percentile Normalization (PN)


View source: R/mbecs_corrections.R


This method was developed to facilitate the integration of microbiome data from different studies or experimental set-ups. The problem is similar to the mitigation of batch effects (BEs): when two or more data-sets are analyzed together, every study is effectively a batch of its own (notwithstanding probable BEs within studies). The algorithm iterates over the unique batches and converts the relative abundances of control samples into their percentiles. The relative abundances of case samples within the respective batch are then transformed into percentiles of the associated control distribution. The procedure assumes that the control group is unaffected by any effect of interest, e.g., treatment or sickness, but that both groups within a batch are affected by the same BE. Switching to percentiles flattens the batch-related differences in abundance values between batches. This also introduces the two limitations of percentile normalization: it can only be applied to case/control designs because it requires a reference group, and the transformation into percentiles removes information from the data.
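The procedure described above can be sketched in base R. This is a minimal illustration and not the MBECS implementation: the function name `percentile_normalize` and its arguments are hypothetical, samples are assumed to be in rows, and ties are resolved with the plain empirical CDF (share of control values less than or equal to the sample value), which may differ from the tie and zero handling used by the package.

```r
# Hypothetical helper for illustration only; not part of MBECS.
# abund:      numeric matrix of relative abundances, samples in rows
# batch:      vector of batch labels, one per sample
# is_control: logical vector, TRUE for control samples
percentile_normalize <- function(abund, batch, is_control) {
  stopifnot(nrow(abund) == length(batch),
            nrow(abund) == length(is_control))
  batch <- as.character(batch)
  out <- matrix(NA_real_, nrow = nrow(abund), ncol = ncol(abund),
                dimnames = dimnames(abund))

  for (b in unique(batch)) {
    in_batch <- batch == b
    ctrl <- in_batch & is_control
    if (!any(ctrl)) stop("batch '", b, "' has no control samples")

    for (j in seq_len(ncol(abund))) {
      # Empirical CDF of the control samples for this feature and batch.
      ctrl_cdf <- stats::ecdf(abund[ctrl, j])
      # Controls and cases alike are scored against the control distribution.
      out[in_batch, j] <- 100 * ctrl_cdf(abund[in_batch, j])
    }
  }
  out
}

# Toy data: two batches, each with two controls and one case.
abund <- matrix(c(0.1, 0.3, 0.2, 0.5, 0.7, 0.9,
                  0.4, 0.2, 0.1, 0.3, 0.1, 0.2), ncol = 2)
batch <- c("A", "A", "A", "B", "B", "B")
is_control <- c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE)

pn <- percentile_normalize(abund, batch, is_control)
```

After the transformation every value lies between 0 and 100 and is only comparable relative to the controls of its own batch, which is why the method needs a reference group and loses the original abundance scale.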


mbecPN(input.obj, model.vars, type = "tss")



input.obj
phyloseq object or numeric matrix (correct orientation is handled internally)


model.vars
Vector of covariate names. First element relates to batch.


type
Which abundance matrix to use, one of 'otu', 'tss', 'clr'. DEFAULT is 'tss'.


The input for this function should be an MbecData object that contains total sum-scaled (TSS) and centered log-ratio (CLR) transformed abundance matrices. The output is a matrix of corrected abundances.
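For orientation, the two transformations named above can be written in a few lines of base R. This is a sketch under stated assumptions, not the code MBECS uses to build these matrices: the toy `counts` matrix is invented, samples are assumed to be in rows, and the pseudocount of 1 added before taking logarithms is an assumption made here to avoid log(0).

```r
# Toy count matrix (invented for illustration), samples in rows.
counts <- matrix(c(10, 0, 30,
                    5, 15, 20),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("s1", "s2"), c("otu1", "otu2", "otu3")))

# TSS: divide each sample by its total so that every row sums to 1.
tss <- counts / rowSums(counts)

# CLR: log of each value relative to the geometric mean of its sample.
# The pseudocount of 1 is an assumption, not necessarily the package default.
log_counts <- log(counts + 1)
clr <- log_counts - rowMeans(log_counts)
```

Rows of `tss` sum to 1 and rows of `clr` sum to 0, which is a quick way to check that a matrix is in the expected orientation.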


A matrix of batch-effect corrected counts

buschlab/MBECS documentation built on Jan. 21, 2022, 1:27 a.m.