geo_batch_correction: geo_batch_correction

View source: R/geo_batch_correction.R

geo_batch_correction    R Documentation

geo_batch_correction

Description

Helper function that corrects for batch effects and maps affy probe IDs to Entrez IDs.

Usage

geo_batch_correction(mergedset, batch, diagnosis, idtype)

Arguments

mergedset

A large ExpressionSet, the output of the function 'geo_merge'. Note that mergedset holds data that are not yet batch corrected.

batch

The result of create_batch().

diagnosis

A vector of integers, holding the binary outcome variable (0 = "Control", 1 = "Target").

idtype

A character string giving the type of ID used to name the genes. One of 'entrez' or 'affy', selecting Entrez IDs or affy IDs respectively. Caution: when using Entrez IDs, missing and duplicated IDs are removed!
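
The effect of removing missing and duplicated IDs can be illustrated with a minimal base R sketch. The probe names, Entrez IDs, and the lookup vector below are invented for illustration, and keeping the first occurrence of a duplicated ID is an assumption; the package may resolve duplicates differently.

```r
# Hypothetical probe-to-Entrez lookup (all names and IDs are invented).
probe_to_entrez <- c(
  probe_1 = "1001",
  probe_2 = "1002",
  probe_3 = "",      # no Entrez ID available for this probe
  probe_4 = "1003",
  probe_5 = "1001"   # same Entrez ID as probe_1
)

# Toy expression matrix: probes in rows, samples in columns.
expr <- matrix(
  seq_len(15),
  nrow = 5,
  dimnames = list(names(probe_to_entrez), c("s1", "s2", "s3"))
)

# Look up the Entrez ID for every row of the matrix.
ids <- unname(probe_to_entrez[rownames(expr)])

# Keep rows whose ID is present, non-empty, and a first occurrence
# (assumption: duplicates are resolved by keeping the first row).
keep <- !is.na(ids) & nzchar(ids) & !duplicated(ids)
expr_entrez <- expr[keep, , drop = FALSE]
rownames(expr_entrez) <- ids[keep]
```

In this sketch two of the five rows are dropped, so a matrix returned with Entrez IDs can have fewer rows than the same data returned with affy IDs.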

Details

This function takes a Bioconductor ExpressionSet object (the output of the function 'geo_merge') and returns a batch-corrected matrix of expression data. To correct for batch effects and other unwanted variation in high-throughput experiments, the 'ComBat' function from the sva package is applied. When Entrez IDs are requested, the affy probes are mapped to their Entrez IDs; in the process, empty and duplicated IDs are removed.
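
The batch correction step can be sketched as a direct call to sva::ComBat on simulated data. This is not the function's actual implementation: the simulated matrix, the batch layout, and the use of diagnosis as a model covariate are assumptions made for illustration. With real data, the expression matrix would typically be extracted from the ExpressionSet with Biobase::exprs().

```r
# Sketch only: runs when the Bioconductor package 'sva' is installed.
if (requireNamespace("sva", quietly = TRUE)) {
  set.seed(1)

  n_genes   <- 50
  batch     <- rep(c(1, 2), each = 6)    # two batches of six samples
  diagnosis <- rep(c(0, 1), times = 6)   # balanced outcome within each batch

  # Simulated expression matrix: genes in rows, samples in columns.
  expr <- matrix(
    rnorm(n_genes * length(batch)),
    nrow = n_genes,
    dimnames = list(
      paste0("gene_", seq_len(n_genes)),
      paste0("s", seq_along(batch))
    )
  )

  # Add an artificial shift to the second batch.
  expr[, batch == 2] <- expr[, batch == 2] + 2

  # Assumption: the outcome is passed as a covariate so that ComBat
  # preserves the diagnosis effect while removing the batch effect.
  mod <- stats::model.matrix(~ diagnosis)

  corrected <- sva::ComBat(dat = expr, batch = batch, mod = mod)
}
```

The corrected matrix has the same dimensions and dimnames as the input; only the expression values are adjusted.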

References

W.E. Johnson, C. Li, and A. Rabinovic. Adjusting batch effects in microarray data using empirical Bayes methods. Biostatistics, 8(1):118–127, 2007.

Jeffrey T. Leek, W. Evan Johnson, Hilary S. Parker, Elana J. Fertig, Andrew E. Jaffe, John D. Storey, Yuqing Zhang and Leonardo Collado Torres (2019). sva: Surrogate Variable Analysis. R package version 3.30.1.


miracum/clearly-sigident.preproc documentation built on June 28, 2022, 3:17 p.m.