CorrectBatchEffect: The CorrectBatchEffect function

View source: R/TCGA_Download_Preprocess.R

CorrectBatchEffect    R Documentation

The CorrectBatchEffect function

Description

Top-level wrapper function for batch correction.

Usage

CorrectBatchEffect(
  GEN_Data,
  BatchData,
  batch.correction.method,
  MinInBatch = 5,
  featurePerSet = 50000
)

Arguments

GEN_Data

matrix with methylation.data or gene.expression.data with genes in rows and samples in columns

BatchData

data frame with two columns: the first column gives the sample names, and the second column gives the batch ids.

batch.correction.method

character string. Should be either 'Seurat' or 'Combat'.

MinInBatch

integer indicating the batch size threshold. Batches smaller than this threshold will be removed. Default: 5

featurePerSet

integer indicating the number of rows (features) in each subset when GEN_Data is split into smaller subsets. Default: 50,000
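
As an illustration of the expected input shapes, the sketch below builds toy versions of GEN_Data and BatchData and then applies the sample-overlap and small-batch filters by hand. It is a minimal base R sketch, not the package's own code: the toy values, the column names Sample and Batch, the helper variable names, and the threshold of 3 (chosen so the toy data stay small) are all assumptions made for the example.

```r
set.seed(1)

# Toy molecular matrix: 4 features (rows) x 8 samples (columns).
GEN_Data <- matrix(rnorm(32), nrow = 4,
                   dimnames = list(paste0("gene", 1:4), paste0("S", 1:8)))

# Toy batch table: first column = sample names, second column = batch ids.
# S9 has no molecular data, S8 has no batch entry, batch "C" has one sample.
BatchData <- data.frame(
  Sample = paste0("S", c(1:7, 9)),
  Batch  = c("A", "A", "A", "B", "B", "B", "C", "B"),
  stringsAsFactors = FALSE
)

# Keep only the samples present in both inputs.
shared <- intersect(colnames(GEN_Data), BatchData[[1]])
batch_shared <- BatchData[match(shared, BatchData[[1]]), ]

# Drop batches with fewer than MinInBatch samples (3 here, for the toy data).
MinInBatch <- 3
batch_sizes <- table(batch_shared[[2]])
keep_batches <- names(batch_sizes)[batch_sizes >= MinInBatch]
batch_kept <- batch_shared[batch_shared[[2]] %in% keep_batches, ]
data_kept <- GEN_Data[, batch_kept[[1]], drop = FALSE]

dim(data_kept)
```

In this toy case batch "C" is dropped because it is below the threshold, and only the samples in batches "A" and "B" remain in data_kept.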

Details

The function proceeds in four steps. (1) It filters the batch data and the molecular data to keep only the samples present in both. (2) It removes batches smaller than MinInBatch. (3) If the molecular data have more than featurePerSet features (rows; 50,000 by default), it splits the data into subsets of that many features each and performs batch correction on each subset. (4) It identifies the samples shared by all batch-corrected subsets and merges the subsets into one matrix.
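
The split, correct, and merge steps can be sketched in base R as follows. This is only an illustration under stated assumptions: center_by_batch is a hypothetical stand-in that centres each feature within each batch, not the Seurat or ComBat correction the function dispatches to, and correct_in_subsets is a hypothetical helper name, not part of the package.

```r
# Hypothetical stand-in for a real batch-correction method: centre each
# feature within each batch, then restore the feature's overall mean.
center_by_batch <- function(mat, batch) {
  overall <- rowMeans(mat)
  for (b in unique(batch)) {
    cols <- batch == b
    mat[, cols] <- mat[, cols, drop = FALSE] -
      rowMeans(mat[, cols, drop = FALSE])
  }
  mat + overall
}

correct_in_subsets <- function(mat, batch, featurePerSet = 50000) {
  # Assign each row to a consecutive chunk of at most featurePerSet rows.
  chunk_id <- ceiling(seq_len(nrow(mat)) / featurePerSet)
  chunks <- split(seq_len(nrow(mat)), chunk_id)

  # Correct each chunk independently.
  corrected <- lapply(chunks, function(rows) {
    center_by_batch(mat[rows, , drop = FALSE], batch)
  })

  # Keep the samples present in every corrected chunk, then stack the chunks.
  shared <- Reduce(intersect, lapply(corrected, colnames))
  do.call(rbind, unname(lapply(corrected, function(m) {
    m[, shared, drop = FALSE]
  })))
}

set.seed(1)
toy <- matrix(rnorm(60), nrow = 10,
              dimnames = list(paste0("f", 1:10), paste0("S", 1:6)))
batch <- rep(c("A", "B"), each = 3)

# A small featurePerSet forces the toy matrix into three chunks (4, 4, 2 rows).
out <- correct_in_subsets(toy, batch, featurePerSet = 4)
dim(out)
```

Splitting by rows keeps each correction call small in memory; because every chunk contains the same samples in the same order, the corrected chunks can be stacked back into a matrix with the original feature order.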

Value

matrix with corrected data


gevaertlab/EpiMix documentation built on July 20, 2023, 9:28 a.m.