adjust_batch generates biomarker levels for the variable(s)
markers in the dataset data that are corrected (adjusted)
for batch effects, i.e., differential measurement
error between levels of batch.
markers: Variable name(s) to batch-adjust. Select
multiple variables with tidy evaluation, e.g.,
batch: Categorical variable indicating batch.
method: Method for batch effect correction:
confounders: Optional: Confounders, i.e. determinants of
biomarker levels that differ between batches. Only used if
suffix: Optional: What string to append to variable names
after batch adjustment. Defaults to
Optional and used for
Optional and used for
Optional and used for
If no true differences between batches are expected, because
samples have been randomized to batches, then a method
that returns adjusted values with equal means
(method = simple) or with equal rank values
(method = quantnorm) for all batches is appropriate.
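To make the two ideas concrete, the following base R sketch shows what "equal means" and "equal rank values" across batches can look like. It is an illustration only, not the package's implementation: the helper `rank_to_reference`, the column names, and the choice of the pooled data as the reference distribution are assumptions made for this example.

```r
# Illustrative sketch in base R; not the batchtma implementation.
set.seed(1)
df <- data.frame(
  tma = rep(1:2, times = 10),
  biomarker = rep(1:2, times = 10) + runif(n = 20, max = 5)
)

# Equal means: shift each batch so that its mean matches the overall mean.
batch_means <- ave(df$biomarker, df$tma, FUN = mean)
df$biomarker_centered <- df$biomarker - batch_means + mean(df$biomarker)

# Equal rank values: map within-batch ranks onto a shared reference
# distribution (here the pooled data, which is an assumption of this sketch).
rank_to_reference <- function(x, reference) {
  p <- (rank(x, ties.method = "average") - 0.5) / length(x)
  as.numeric(quantile(reference, probs = p, type = 7))
}
df$biomarker_qn <- ave(
  df$biomarker, df$tma,
  FUN = function(x) rank_to_reference(x, reference = df$biomarker)
)

# Batch means of the centered values now coincide.
tapply(df$biomarker_centered, df$tma, mean)
```

Because both batches have the same size here, the rank-based values form the same set of numbers in each batch; only their assignment to observations differs.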
If the distribution of determinants of biomarker values
(confounders) differs between batches, then a
method that retains these "true" differences
between batches while adjusting for batch effects
may be appropriate:
method = standardize and
method = ipw address means;
method = quantreg
addresses lower values and dynamic range separately.
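As a rough illustration of a mean-only, confounder-aware adjustment, the sketch below uses marginal standardization with a linear model in base R. It assumes a simple additive model and simulated data; it is meant to convey the idea behind standardization and is not the package's code.

```r
# Illustrative sketch in base R; not the batchtma implementation.
set.seed(2)
df <- data.frame(
  tma = factor(rep(1:2, times = 10)),
  confounder = rep(0:1, times = 10) + runif(n = 20, max = 10)
)
# Simulated biomarker: depends on the confounder, plus a batch effect of 1
# in batch 2 (both values are assumptions for this example).
df$biomarker <- 0.3 * df$confounder + (df$tma == "2") + runif(n = 20, max = 5)

# Model the confounder-biomarker association together with batch.
fit <- lm(biomarker ~ tma + confounder, data = df)

# Standardized mean per batch: predict every observation as if it had been
# measured in that batch, then average over the whole study population.
std_means <- sapply(levels(df$tma), function(b) {
  newdata <- df
  newdata$tma <- factor(b, levels = levels(df$tma))
  mean(predict(fit, newdata = newdata))
})

# Batch effect = deviation of each standardized batch mean from their average.
batch_effect <- std_means - mean(std_means)
df$biomarker_std <- df$biomarker - unname(batch_effect[as.character(df$tma)])

# After adjustment, the confounder-adjusted batch coefficient is zero,
# while differences explained by the confounder are retained.
coef(lm(biomarker_std ~ tma + confounder, data = df))["tma2"]
```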
Which method to choose depends on the properties of
batch effects (affecting means or also variance?) and
the presence and strength of confounding. For the two
mean-only confounder-adjusted methods, the choice may depend
on whether the confounder–batch association
(method = ipw)
or the confounder–biomarker association
(method = standardize) can be modeled better.
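For contrast with standardization, the following base R sketch models the confounder–batch association instead and uses inverse-probability weights. It assumes exactly two batches (so a logistic regression suffices), unstabilized and untruncated weights, and simulated data; the package's own weighting details may differ.

```r
# Illustrative sketch in base R; not the batchtma implementation.
set.seed(3)
df <- data.frame(
  tma = rep(1:2, times = 10),
  confounder = rep(0:1, times = 10) + runif(n = 20, max = 10)
)
df$biomarker <- 0.3 * df$confounder + (df$tma == 2) + runif(n = 20, max = 5)

# Model the probability of being in batch 2 given the confounder.
df$in_batch2 <- as.integer(df$tma == 2)
ps_fit <- glm(in_batch2 ~ confounder, family = binomial(), data = df)
p_batch2 <- predict(ps_fit, type = "response")

# Weight = 1 / probability of the batch that was actually observed.
p_own <- ifelse(df$tma == 2, p_batch2, 1 - p_batch2)
df$w <- 1 / p_own

# Weighted batch means approximate the means expected if the confounder
# were balanced between batches.
ipw_means <- sapply(
  split(df, df$tma),
  function(d) weighted.mean(d$biomarker, d$w)
)
batch_effect <- ipw_means - mean(ipw_means)
df$biomarker_ipw <- df$biomarker - unname(batch_effect[as.character(df$tma)])
```

Both sketches subtract a single mean shift per batch; they differ only in which association is modeled to estimate that shift.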
Generally, if batch effects are present, any adjustment
method tends to perform better than no adjustment in
reducing bias and increasing between-study reproducibility.
All adjustment approaches except
method = quantnorm
are based on linear models. It is recommended that variables
for markers and confounders first be transformed
as necessary (e.g.,
log transformations or
splines). Scaling and mean centering are not necessary,
and adjusted values are returned on the original scale.
The arguments markers, batch, and confounders support tidy evaluation.
Observations with missing values for the
confounders will be ignored in the estimation of adjustment
parameters, as are empty batches. Batch effect-adjusted values
for observations with existing marker values but missing
confounders are based on adjustment parameters derived from the
other observations in a batch with non-missing confounders.
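The handling of missing values described above can be sketched as follows in base R: adjustment parameters are estimated from complete observations only and then applied to every observation that has a marker value. The standardization step, the simulated data, and the positions of the missing values are assumptions for illustration, not the package's code.

```r
# Illustrative sketch in base R; not the batchtma implementation.
set.seed(4)
df <- data.frame(
  tma = factor(rep(1:2, times = 10)),
  confounder = rep(0:1, times = 10) + runif(n = 20, max = 10)
)
df$biomarker <- 0.3 * df$confounder + (df$tma == "2") + runif(n = 20, max = 5)
df$confounder[3] <- NA  # marker present, confounder missing
df$biomarker[4] <- NA   # marker missing

# Estimate adjustment parameters from complete observations only.
complete <- complete.cases(df[, c("biomarker", "confounder")])
fit <- lm(biomarker ~ tma + confounder, data = df[complete, ])

std_means <- sapply(levels(df$tma), function(b) {
  newdata <- df[complete, ]
  newdata$tma <- factor(b, levels = levels(df$tma))
  mean(predict(fit, newdata = newdata))
})
batch_effect <- std_means - mean(std_means)

# Apply the per-batch shift to all observations with a marker value,
# including the one whose confounder is missing.
df$biomarker_adj <- df$biomarker - unname(batch_effect[as.character(df$tma)])

df[3:4, c("tma", "biomarker", "confounder", "biomarker_adj")]
```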
The data dataset with batch effect-adjusted
variable(s) added at the end. Model diagnostics, using
.batchtma of this dataset, are available
Konrad H. Stopsack
Stopsack KH, Tyekucheva S, Wang M, Gerke TA, Vaselkiv JB, Penney KL, Kantoff PW, Finn SP, Fiorentino M, Loda M, Lotan TL, Parmigiani G+, Mucci LA+ (+ equal contribution). Extent, impact, and mitigation of batch effects in tumor biomarker studies using tissue microarrays. bioRxiv 2021.06.29.450369; doi: https://doi.org/10.1101/2021.06.29.450369 (This R package, all methods descriptions, and further recommendations.)
Rosner B, Cook N, Portman R, Daniels S, Falkner B.
Determination of blood pressure percentiles in
normal-weight children: some methodological issues.
Am J Epidemiol 2008;167(6):653-66. (Basis for
method = standardize)
Bolstad BM, Irizarry RA, Åstrand M, Speed TP.
A comparison of normalization methods for high density
oligonucleotide array data based on variance and bias.
Bioinformatics 2003;19:185–193. (Basis for
method = quantnorm)
library(batchtma)

# Data frame with two batches
# Batch 2 has higher values of biomarker and confounder
df <- data.frame(
  tma = rep(1:2, times = 10),
  biomarker = rep(1:2, times = 10) + runif(max = 5, n = 20),
  confounder = rep(0:1, times = 10) + runif(max = 10, n = 20)
)

# Adjust for batch effects
# Using simple means, ignoring the confounder:
adjust_batch(
  data = df,
  markers = biomarker,
  batch = tma,
  method = simple
)
# Returns data set with new variable "biomarker_adj2"

# Use quantile regression, include the confounder,
# change suffix of returned variable:
adjust_batch(
  data = df,
  markers = biomarker,
  batch = tma,
  method = quantreg,
  confounders = confounder,
  suffix = "_batchadjusted"
)
# Returns data set with new variable "biomarker_batchadjusted"