Preprocess_DNAMethylation: The Preprocess_DNAMethylation function
In gevaertlab/EpiMix: EpiMix: an integrative tool for the population-level analysis of DNA methylation

View source: R/GEO_Download_Preprocess.R

Preprocess_DNAMethylation

R Documentation

The Preprocess_DNAMethylation function

Description

Preprocess DNA methylation data from the GEO database.

Usage

Preprocess_DNAMethylation(
  methylation.data,
  met.platform = "EPIC",
  genome = "hg38",
  sample.info = NULL,
  group.1 = NULL,
  group.2 = NULL,
  sample.map = NULL,
  rm.chr = c("chrX", "chrY"),
  MissingValueThresholdGene = 0.2,
  MissingValueThresholdSample = 0.2,
  doBatchCorrection = FALSE,
  BatchData = NULL,
  batch.correction.method = "Seurat",
  cores = 1
)

Arguments

`methylation.data`	matrix of DNA methylation data with CpG in rows and sample names in columns.
`met.platform`	character string indicating the type of the Illumina Infinium BeadChip for collecting the methylation data. Should be either 'HM450' or 'EPIC'. Default: 'EPIC'
`genome`	character string indicating the genome build version for retrieving the probe annotation. Should be either 'hg19' or 'hg38'. Default: 'hg38'.
`sample.info`	dataframe that maps each sample to a study group. Should contain two columns: the first column (named: 'primary') indicating the sample names, and the second column (named: 'sample.type') indicating which study group each sample belongs to (e.g., “Experiment” vs. “Control”, “Cancer” vs. “Normal”). Sample names in the 'primary' column must coincide with the column names of the methylation.data. Please see details for more information. Default: NULL.
`group.1`	character vector indicating the name(s) of the experiment group. The values must coincide with the values in the 'sample.type' column of the sample.info dataframe. Please see details for more information. Default: NULL.
`group.2`	character vector indicating the name(s) of the control group. The values must coincide with the values in the 'sample.type' column of the sample.info dataframe. Please see details for more information. Default: NULL.
`sample.map`	dataframe for mapping the GEO accession ID (column names) to the actual sample names. Can be the output from the GEO_getSampleMap function. Default: NULL.
`rm.chr`	character vector indicating the chromosomes whose probes should be removed. Default: 'chrX', 'chrY'.
`MissingValueThresholdGene`	threshold for missing values per gene. Genes with a fraction of NAs greater than this threshold are removed. Default: 0.2.
`MissingValueThresholdSample`	threshold for missing values per sample. Samples with a fraction of NAs greater than this threshold are removed. Default: 0.2.
`doBatchCorrection`	logical indicating whether to perform batch correction. If TRUE, the batch data must be provided through 'BatchData'. Default: FALSE.
`BatchData`	dataframe with batch information. Should contain two columns: the first column indicating the actual sample names, the second column indicating the batch. Users are expected to retrieve the batch information from GEO on their own, but this can also be done using the GEO_getSampleInfo function with 'group.column' set to the column indicating the batch for each sample. Default: NULL.
`batch.correction.method`	character string indicating the method that will be used for batch correction. Should be either 'Seurat' or 'Combat'. Default: 'Seurat'.
`cores`	number of CPU cores to be used for batch effect correction. Default: 1.
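The requirements on 'sample.info', 'group.1' and 'group.2' can be checked before calling the function. The following is a minimal sketch in base R, using a made-up toy matrix and made-up sample names (S1 to S4); it only illustrates the expected shape of the inputs and is not part of the package.

```r
# Toy methylation matrix: CpG probes in rows, samples in columns.
# All values and names here are invented for illustration.
set.seed(1)
met <- matrix(runif(12), nrow = 3,
              dimnames = list(c("cg0001", "cg0002", "cg0003"),
                              c("S1", "S2", "S3", "S4")))

# sample.info: first column 'primary' (sample names),
# second column 'sample.type' (study group).
sample.info <- data.frame(
  primary = c("S1", "S2", "S3", "S4"),
  sample.type = c("Cancer", "Cancer", "Normal", "Normal"),
  stringsAsFactors = FALSE
)

group.1 <- "Cancer"  # experiment group
group.2 <- "Normal"  # control group

# Sample names in 'primary' must coincide with the matrix column names.
names_match <- all(sample.info$primary %in% colnames(met))

# Group labels must occur in the 'sample.type' column.
groups_match <- all(c(group.1, group.2) %in% sample.info$sample.type)
```

If either check is FALSE, the sample annotation should be corrected before preprocessing.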

Details

The data preprocessing pipeline includes: (1) eliminating samples and genes with too many NAs, then imputing the remaining NAs; (2) (optional) mapping the column names of the DNA methylation data to the actual sample names based on the information from 'sample.map'; (3) (optional) removing CpG probes on the sex chromosomes or on user-defined chromosomes; (4) (optional) performing batch correction.

If sample.info, group.1 and group.2 are all provided, the function performs missing value estimation and batch correction on group.1 and group.2 separately. This ensures that the true difference between group.1 and group.2 is not obscured by missing value estimation and batch correction.
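To make step (1) concrete, the sketch below filters a toy matrix by the fraction of missing values per row and per column. It is an illustration in base R only: the helper name `filter_by_missing`, the order of the two filters (rows first, then columns) and the toy data are assumptions, and the package's actual implementation may differ.

```r
# Hypothetical helper, not part of EpiMix: drop rows and columns whose
# fraction of NAs exceeds the given thresholds.
filter_by_missing <- function(mat, gene_thr = 0.2, sample_thr = 0.2) {
  # Fraction of NAs per row; keep rows at or below the threshold.
  row_na <- rowMeans(is.na(mat))
  mat <- mat[row_na <= gene_thr, , drop = FALSE]
  # Fraction of NAs per column, computed on the remaining rows.
  col_na <- colMeans(is.na(mat))
  mat[, col_na <= sample_thr, drop = FALSE]
}

# Toy data: cg2 is mostly missing, and S2 is missing in cg1.
toy <- matrix(c(0.1, NA,  0.3, 0.4,
                NA,  NA,  NA,  0.8,
                0.2, 0.5, 0.6, 0.7),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("cg1", "cg2", "cg3"),
                              c("S1", "S2", "S3", "S4")))

filtered <- filter_by_missing(toy, gene_thr = 0.5, sample_thr = 0.4)
```

With these thresholds, cg2 (75% missing) is dropped first, and S2 (50% missing among the remaining rows) is dropped afterwards. When the study groups are supplied, the documentation states that missing value estimation is done per group, so a step like this would be followed by imputation within each group separately.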

Value

DNA methylation data matrix with probes in rows and samples in columns.

Examples

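library(EpiMix)

# Load the example methylation matrix and sample annotation
data(MET.data)
data(LUAD.sample.annotation)

# Preprocess the data, handling the 'Cancer' and 'Normal' groups separately
Preprocessed_Data <- Preprocess_DNAMethylation(MET.data,
                                               met.platform = 'HM450',
                                               sample.info = LUAD.sample.annotation,
                                               group.1 = 'Cancer',
                                               group.2 = 'Normal')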
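When 'doBatchCorrection = TRUE', a 'BatchData' table is also required. The following is a minimal sketch of how such a table might be laid out in base R. The sample names, batch labels and the column names 'primary' and 'batch' are assumptions for illustration; the documentation only fixes the column order (sample names first, batch second).

```r
# Hypothetical sample names and batch labels, for illustration only.
sample.names <- c("S1", "S2", "S3", "S4")

BatchData <- data.frame(
  primary = sample.names,                              # column name is an assumption
  batch = c("plate1", "plate1", "plate2", "plate2"),   # column name is an assumption
  stringsAsFactors = FALSE
)

# Basic sanity checks before passing the table on:
# each sample should appear once, and each batch should have members.
one_batch_per_sample <- !anyDuplicated(BatchData[[1]])
batch_sizes <- table(BatchData[[2]])
```

A table of this shape would then be supplied through the 'BatchData' argument together with 'doBatchCorrection = TRUE' and the chosen 'batch.correction.method'.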

gevaertlab/EpiMix documentation built on July 20, 2023, 9:28 a.m.
