ConQuR: Remove batch effects from a taxa read count table

View source: R/ConQuR_main_tune.R

ConQuR R Documentation

Remove batch effects from a taxa read count table

Description

Remove batch effects from a taxa read count table

Usage

ConQuR(
  tax_tab,
  batchid,
  covariates,
  batch_ref,
  logistic_lasso = F,
  quantile_type = "standard",
  simple_match = F,
  lambda_quantile = "2p/n",
  interplt = F,
  delta = 0.4999,
  taus = seq(0.005, 0.995, by = 0.005),
  num_core = 2
)

Arguments

tax_tab

The taxa read count table, samples (row) by taxa (col).

batchid

The batch indicator, must be a factor.

covariates

A data.frame containing the key variable of interest and other covariates, e.g., data.frame(key, x1, x2).

batch_ref

A character, the name of the reference batch, e.g., “2”.

logistic_lasso

A logical value, TRUE for L1-penalized logistic regression, FALSE for standard logistic regression; default is FALSE.

quantile_type

A character, “standard” for standard quantile regression, “lasso” for L1-penalized quantile regression, “composite” for composite quantile regression; default is “standard”.

simple_match

A logical value, TRUE for using the simple quantile-quantile matching, FALSE for not; default is FALSE.

lambda_quantile

A character, the penalization parameter in quantile regression if quantile_type=“lasso” or “composite”; the only two choices are “2p/n” and “2p/logn”, where p is the number of expanded covariates and n is the number of non-zero read counts; default is “2p/n”.

interplt

A logical value, TRUE for using the data-driven linear interpolation between zero and non-zero quantiles to stabilize border estimates, FALSE for not; default is FALSE.

delta

A real constant in (0, 0.5), determining the size of the interpolation window if interplt=TRUE; a larger delta leads to a narrower interpolation window; default is 0.4999.

taus

A sequence of quantile levels, determining the “precision” of estimating conditional quantile functions; default is seq(0.005, 0.995, by=0.005).

num_core

A real constant, the number of cores used for computing; default is 2.
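As a rough illustration of two of the arguments above, the following base-R sketch counts the quantile levels in the default taus grid and evaluates both lambda_quantile choices. The values of p and n and the coarser grid are made up for illustration, and reading “logn” as the natural logarithm is an assumption, not something this page states.

```r
# Default grid of quantile levels, as given in the Usage section
taus_default <- seq(0.005, 0.995, by = 0.005)
n_taus_default <- length(taus_default)   # number of quantile levels in the grid

# A coarser, hypothetical grid that might be considered for a small data set
taus_coarse <- seq(0.05, 0.95, by = 0.05)
n_taus_coarse <- length(taus_coarse)

# Hypothetical sizes: p expanded covariates, n non-zero read counts
p <- 6
n <- 200
lambda_2p_n    <- 2 * p / n        # the "2p/n" choice
lambda_2p_logn <- 2 * p / log(n)   # the "2p/logn" choice, assuming natural log
```

With these made-up sizes, the “2p/logn” value is far larger than the “2p/n” value, so under the natural-log assumption it would penalize more heavily.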

Details

  • Choose batch_ref based on prior knowledge, or try several options; there is no default.

  • The option “composite” of quantile_type is aggressive; use it with caution.

  • If simple_match=TRUE, the arguments logistic_lasso, quantile_type, lambda_quantile, interplt and delta have no effect.

  • Always use a fine grid of taus if the sample size is adequate.

Value

The corrected taxa read count table, samples (row) by taxa (col).
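A minimal sketch of preparing inputs in the shapes described under Arguments and then calling the function. The toy data, the variable names and the input checks are all hypothetical. The call is guarded so that it runs only when the package is installed, and attaching doParallel alongside it is a precaution assumed here for the parallel back end, not a requirement stated on this page.

```r
set.seed(1)
n_samples <- 60

# Hypothetical count table: samples in rows, taxa in columns
tax_tab <- matrix(rpois(n_samples * 4, lambda = 3), nrow = n_samples,
                  dimnames = list(paste0("sample", seq_len(n_samples)),
                                  paste0("taxon", 1:4)))

# batchid must be a factor; batch_ref names one of its levels
batchid <- factor(rep(c("1", "2"), each = n_samples / 2))
batch_ref <- "2"

# Key variable of interest plus one other covariate
covariates <- data.frame(
  key = factor(rep(c("case", "control"), times = n_samples / 2)),
  x1  = rnorm(n_samples)
)

# Sanity checks on the inputs before calling ConQuR
inputs_ok <- is.factor(batchid) &&
  batch_ref %in% levels(batchid) &&
  nrow(tax_tab) == length(batchid) &&
  nrow(covariates) == nrow(tax_tab)

# Run only if the packages are installed; unlisted arguments keep their defaults
if (inputs_ok &&
    requireNamespace("ConQuR", quietly = TRUE) &&
    requireNamespace("doParallel", quietly = TRUE)) {
  library(doParallel)
  library(ConQuR)
  corrected <- ConQuR(tax_tab = tax_tab, batchid = batchid,
                      covariates = covariates, batch_ref = batch_ref)
  print(dim(corrected))   # documented layout: samples (row) by taxa (col)
}
```

Because there is no default for batch_ref, the same call could be repeated over levels(batchid) to compare reference choices.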

References

  • Ling, W. et al. (2021+). ConQuR: batch effects removal for microbiome data in large-scale epidemiology studies via conditional quantile regression.

  • Ling, W. et al. (2020+). Statistical inference in quantile regression for zero-inflated outcomes. Statistica Sinica.

  • Machado, J.A.F., Silva, J.S. (2005). Quantiles for counts. Journal of the American Statistical Association 100(472), 1226–1237.

  • Koenker, R. & Bassett Jr, G. (1978). Regression quantiles. Econometrica: journal of the Econometric Society, 33-50.

  • Koenker, R. (2005). Econometric Society Monographs: Quantile Regression. New York: Cambridge University.

  • Zou, H. & Yuan, M. (2008). Composite quantile regression and the oracle model selection theory. The Annals of Statistics 36, 1108-1126.


wdl2459/ConQuR documentation built on Aug. 28, 2022, 6:08 a.m.