adjust_batch: Zero-inflated empirical Bayes adjustment of batch effect in...


adjust_batch R Documentation

Zero-inflated empirical Bayes adjustment of batch effect in compositional feature abundance data

Description

adjust_batch takes a feature-by-sample matrix of microbial abundances and performs batch effect adjustment using the provided batch variable and optional covariates. It returns a list containing the batch-adjusted abundance matrix. Additional options can be passed as a list through the control parameter (see Details).

Usage

adjust_batch(feature_abd, batch, covariates = NULL, data, control)

Arguments

feature_abd

feature-by-sample matrix of abundances (proportions or counts).

batch

name of the batch variable. This variable in data should be a factor; if it is not, it is converted to one with a warning.

covariates

name(s) of covariates to adjust for in the batch correction model.

data

data frame of metadata. Its columns must include batch and covariates (if specified).

control

a named list of additional control parameters. See details.
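As a rough illustration of the requirements on data, the sketch below prepares a toy metadata frame so that the batch column is already a factor (which should avoid the conversion warning) and the named columns are present. The sample names and values are hypothetical; only the column names studyID and study_condition are borrowed from the Examples section.

```r
# Toy metadata (hypothetical values, for illustration only).
meta <- data.frame(
  studyID = c("s1", "s1", "s2", "s2"),
  study_condition = c("control", "CRC", "control", "CRC"),
  row.names = paste0("sample", 1:4),
  stringsAsFactors = FALSE
)

# Convert the batch column to a factor up front rather than relying on
# adjust_batch to do it with a warning.
if (!is.factor(meta$studyID)) {
  meta$studyID <- factor(meta$studyID)
}

# Confirm that the batch and covariate columns exist in the metadata.
required <- c("studyID", "study_condition")
missing_cols <- setdiff(required, colnames(meta))
stopifnot(length(missing_cols) == 0)
```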

Details

control should be a named list containing any subset of the following components.

zero_inflation

logical. Indicates whether a zero-inflated model should be run. Defaults to TRUE (zero-inflated model). If set to FALSE, the correction is similar to ComBat as provided in the sva package.

pseudo_count

numeric. Pseudo count to add to feature_abd before the method's log transformation. Defaults to NULL, in which case adjust_batch sets the pseudo count automatically to half of the minimal non-zero value in feature_abd.

diagnostic_plot

character. Name of the generated diagnostic figure file. Defaults to "adjust_batch_diagnostic.pdf". Can be set to NULL, in which case no figure is generated.

conv

numeric. Convergence threshold for the method's iterative algorithm for shrinking batch effect parameters. Defaults to 1e-4.

maxit

integer. Maximum number of iterations allowed for the method's iterative algorithm. Defaults to 1000.

verbose

logical. Indicates whether or not verbose information will be printed.
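The automatic pseudo_count rule described above can be sketched in base R as follows. This is a sketch of the documented rule on a toy matrix, not the package's internal code, and it assumes the minimum is taken over all non-zero entries of the whole matrix rather than per feature or per sample.

```r
# Toy feature-by-sample count matrix (hypothetical values).
feature_abd <- matrix(
  c(0, 4, 10,
    2, 0, 6,
    8, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("feature", 1:3), paste0("sample", 1:3))
)

# Half of the smallest non-zero entry (assumed to be a global minimum).
nonzero <- feature_abd[feature_abd > 0]
pseudo_count <- min(nonzero) / 2

# Adding the pseudo count keeps the log transformation finite for zeros.
log_abd <- log(feature_abd + pseudo_count)
```

Supplying pseudo_count explicitly in control would replace this automatic choice.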

Value

a list, with the following components:

feature_abd_adj

feature-by-sample matrix of batch-adjusted abundances, normalized to the same per-sample total abundance as feature_abd.

control

list of additional control parameters used in the function call.
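Because feature_abd_adj is described as normalized to the same per-sample totals as feature_abd, one possible sanity check is to compare column sums before and after adjustment. The helper same_sample_totals below is hypothetical (it is not part of the package), and the matrices are toy stand-ins for real input and output.

```r
# Hypothetical helper: TRUE if both matrices have matching per-sample
# (column) totals, up to a numeric tolerance.
same_sample_totals <- function(abd, abd_adj, tolerance = 1e-8) {
  isTRUE(all.equal(colSums(abd), colSums(abd_adj),
                   tolerance = tolerance, check.attributes = FALSE))
}

# Toy stand-ins: columns of both matrices sum to 10.
abd     <- matrix(c(2, 8, 5, 5), nrow = 2)
abd_adj <- matrix(c(4, 6, 3, 7), nrow = 2)
totals_match <- same_sample_totals(abd, abd_adj)

# A matrix whose second column sums to 11 should fail the check.
abd_bad <- matrix(c(4, 6, 3, 8), nrow = 2)
totals_mismatch <- same_sample_totals(abd, abd_bad)
```

With real data, the same call would take the original feature_abd and the returned feature_abd_adj.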

Author(s)

Siyuan Ma, siyuanma@g.harvard.edu

Examples

data("CRC_abd", "CRC_meta")
CRC_abd_adj <- adjust_batch(feature_abd = CRC_abd, 
                            batch = "studyID", 
                            covariates = "study_condition",
                            data = CRC_meta)$feature_abd_adj
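A variation on the example above passes a control list and keeps the full return value, so that both feature_abd_adj and the control settings actually used can be inspected. The particular control values are illustrative choices rather than recommendations, and the call is guarded so that the snippet still runs where MMUPHin is not installed.

```r
# Override only the components that should differ from their defaults.
ctrl <- list(
  zero_inflation = TRUE,   # keep the zero-inflated model
  diagnostic_plot = NULL,  # do not write a diagnostic figure
  verbose = FALSE          # suppress progress messages
)

if (requireNamespace("MMUPHin", quietly = TRUE)) {
  data("CRC_abd", "CRC_meta", package = "MMUPHin")
  fit <- MMUPHin::adjust_batch(feature_abd = CRC_abd,
                               batch = "studyID",
                               covariates = "study_condition",
                               data = CRC_meta,
                               control = ctrl)
  CRC_abd_adj <- fit$feature_abd_adj  # batch-adjusted abundance matrix
  str(fit$control)                    # control parameters used in the call
}
```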

biobakery/MMUPHin documentation built on March 30, 2024, 4:50 a.m.