cb.correct.matching_cComBat: Matching Conditional ComBat

View source: R/causal_ccombat.R

cb.correct.matching_cComBat    R Documentation

Matching Conditional ComBat

Description

A function implementing the matching conditional ComBat (matching cComBat) algorithm, which removes batch effects (in each dimension) while adjusting for known confounding variables. To derive causal conclusions, this function must be used in conjunction with domain expertise (e.g., to ensure that the covariates are not colliders, and that the system could be argued to satisfy the ignorability condition). See the references for details on the conditions under which the derived conclusions are causal.

Usage

cb.correct.matching_cComBat(
  Ys,
  Ts,
  Xs,
  match.form,
  covar.out.form = NULL,
  prop.form = NULL,
  reference = NULL,
  match.args = list(method = "nearest", exact = NULL, replace = FALSE, caliper = 0.1),
  retain.ratio = 0.05,
  apply.oos = FALSE
)

Arguments

Ys

an [n, d] matrix, for the outcome variables with n samples in d dimensions.

Ts

[n] the labels of the samples, with K < n levels, as a factor variable.

Xs

[n, r] the r covariates/confounding variables, for each of the n samples, as a data frame with named columns.

match.form

A formula of columns from Xs, to be passed directly to matchit for subsequent matching. See the formula argument of matchit for details.

covar.out.form

A covariate model, given as a formula. Used in the outcome regression step of the ComBat algorithm. Defaults to NULL, which re-uses match.form for the covariate/outcome model.

prop.form

A propensity model, given as a formula. Used to estimate propensities in the propensity trimming step. Defaults to NULL, which re-uses match.form for the propensity model.

reference

the name of the reference/control batch, against which to match. Defaults to NULL, which uses the smallest batch as the reference batch.

match.args

A named list of arguments for the matchit function, used to specify the matching strategy: the list names are matchit argument names, and the corresponding values are the values passed to matchit. Defaults to inexact nearest-neighbor caliper (width 0.1) matching without replacement.

retain.ratio

If the number of samples retained is less than retain.ratio*n, throws a warning. Defaults to 0.05.

apply.oos

A boolean that indicates whether or not to apply the learned batch effect correction to non-matched samples that are still within a region of covariate support. Defaults to FALSE.
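The retain.ratio check described above amounts to a single comparison between the number of retained samples and retain.ratio*n. A minimal base R sketch of that comparison, using hypothetical sample counts; the helper name below.retain.threshold is illustrative and is not part of causalBatch:

```r
# Illustrative helper (not part of causalBatch): TRUE when the number of
# retained samples m falls below retain.ratio * n, i.e. the documented
# condition under which a warning is thrown.
below.retain.threshold <- function(m, n, retain.ratio = 0.05) {
  m < retain.ratio * n
}

n <- 100  # hypothetical number of input samples
below.retain.threshold(m = 4, n = n)   # fewer than 0.05 * 100 retained: warning case
below.retain.threshold(m = 40, n = n)  # well above the threshold: no warning
```

Raising retain.ratio makes the warning fire earlier, which may be useful when heavy sample loss after matching should be flagged.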

Value

a list, containing the following:

  • Ys.corrected an [m, d] matrix, for the m retained samples in d dimensions, after correction.

  • Ts [m] the labels of the m retained samples, with K < n levels.

  • Xs the r covariates/confounding variables for each of the m retained samples.

  • Model the fit batch effect correction model. See ComBat for details.

  • InSample.Ids the ids which were used to fit the batch effect correction model.

  • Corrected.Ids the ids to which batch effect correction was applied. Differs from InSample.Ids if apply.oos is TRUE.
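The elements of the returned list can be inspected by name. The following is a sketch that assumes the causalBatch package is installed and reuses the simulation call from the Examples section; the variable names sim and res are illustrative:

```r
# Runs only if causalBatch is available; otherwise the block is skipped.
if (requireNamespace("causalBatch", quietly = TRUE)) {
  library(causalBatch)
  sim <- cb.sims.sim_linear(a = -1, n = 100, err = 1/8, unbalancedness = 2)
  res <- cb.correct.matching_cComBat(sim$Ys, sim$Ts,
                                     data.frame(Covar = sim$Xs), "Covar")

  dim(res$Ys.corrected)      # [m, d]: retained samples by dimensions
  length(res$Ts)             # m: one batch label per retained sample
  length(res$InSample.Ids)   # number of samples used to fit the model
  # With the default apply.oos = FALSE, the documentation above suggests
  # these two sets of ids should coincide.
  setequal(res$InSample.Ids, res$Corrected.Ids)
}
```

Because matching discards samples, m is generally smaller than the n samples passed in, so downstream code should index by the returned ids and not assume the original row order.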

Details

For more details see the help vignette: vignette("causal_ccombat", package = "causalBatch")

Author(s)

Eric W. Bridgeford

References

Eric W. Bridgeford, et al. "A Causal Perspective for Batch Effects: When is no answer better than a wrong answer?" bioRxiv (2024).

Daniel E. Ho, et al. "MatchIt: Nonparametric Preprocessing for Parametric Causal Inference" JSS (2011).

W Evan Johnson, et al. "Adjusting batch effects in microarray expression data using empirical Bayes methods" Biostatistics (2007).

Leek JT, Johnson WE, Parker HS, Fertig EJ, Jaffe AE, Zhang Y, Storey JD, Torres LC (2024). sva: Surrogate Variable Analysis. R package version 3.52.0.

Examples

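library(causalBatch)
# simulate n = 100 samples with the package's linear simulation
sim <- cb.sims.sim_linear(a=-1, n=100, err=1/8, unbalancedness=2)
# match on the covariate "Covar", then apply conditional ComBat
cb.correct.matching_cComBat(sim$Ys, sim$Ts, data.frame(Covar=sim$Xs), "Covar")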
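A variation on the example above with non-default matching settings, passed through match.args. This is a sketch, not a recommended configuration: it assumes causalBatch is installed, the caliper width of 0.2 is an arbitrary illustrative choice, and the names my.match.args and res.wide are hypothetical.

```r
# Non-default matching settings; the entries mirror the documented default
# list, with only the caliper width changed (0.1 -> 0.2).
my.match.args <- list(method = "nearest", exact = NULL,
                      replace = FALSE, caliper = 0.2)

# Runs only if causalBatch is available; otherwise the call is skipped.
if (requireNamespace("causalBatch", quietly = TRUE)) {
  library(causalBatch)
  sim <- cb.sims.sim_linear(a = -1, n = 100, err = 1/8, unbalancedness = 2)
  res.wide <- cb.correct.matching_cComBat(
    sim$Ys, sim$Ts, data.frame(Covar = sim$Xs), "Covar",
    match.args = my.match.args
  )
  nrow(res.wide$Ys.corrected)  # number of samples retained with these settings
}
```

How many samples are retained under a different caliper depends on the data, so the count is worth comparing against a run with the default settings.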


causalBatch documentation built on April 3, 2025, 8:38 p.m.