In kaiqiong/SOMNiBUS: Smooth modeling of bisulfite sequencing

splitDataByRegion

R Documentation

Split methylation data into regions based on the spacing of CpGs

Description

This function splits the methylation data into regions based on the spacing of CpGs.
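
As a rough illustration of what splitting on CpG spacing means, the sketch below groups a vector of positions wherever two consecutive CpGs are further apart than a threshold. It is not the package's implementation: the positions and the `gap` value are invented, and the `min.cpgs` and `max.cpgs` constraints are ignored.

```r
# Toy CpG positions in base pairs; values are made up for illustration.
pos <- c(100, 150, 220, 5000, 5060, 5100, 20000)
gap <- 1000  # hypothetical threshold, much smaller than the 1e+06 default

# A new region starts wherever the distance to the previous CpG exceeds `gap`.
region_id <- cumsum(c(TRUE, diff(pos) > gap))

# Group the positions by region label.
regions <- split(pos, region_id)
print(regions)
```

Here the two large jumps (220 to 5000 and 5100 to 20000) each start a new region, so `regions` has three elements.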

Usage

splitDataByRegion(
  dat,
  gap = 1e+06,
  min.cpgs = 50,
  max.cpgs = 2000,
  verbose = TRUE
)

Arguments

`dat`	a data frame with rows as individual CpGs appearing in all the samples. The first 4 columns should contain `Meth_Counts` (methylated counts), `Total_Counts` (read depths), `Position` (genomic position of the CpG site) and `ID` (sample ID). Covariate information, such as disease status or cell type composition, is listed in column 5 onwards.
`gap`	positive integer defining the gap width beyond which two regions are considered independent. Odd and decimal values are rounded to an even number (e.g. 8.2 and 8.7 become gaps of 8 and 10, respectively). The default value is `1e+6` (1 Mb).
`min.cpgs`	positive integer defining the minimum number of CpGs within a region for the algorithm to perform optimally. The default value is 50.
`max.cpgs`	positive integer defining the maximum number of CpGs within a region for the algorithm to perform optimally. The default value is 2000.
`verbose`	logical indicating whether the algorithm should report progress information. The default value is TRUE.
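
The layout expected for `dat` can be sketched with a small hand-built data frame. Everything below is invented for illustration: the counts, the positions, the sample IDs and the `disease` covariate are assumptions, and only the names and order of the first four columns come from the description above.

```r
# Toy input in the documented column order; all values are invented.
toy_dat <- data.frame(
  Meth_Counts  = c(3, 0, 5, 2, 0, 4),     # methylated counts
  Total_Counts = c(10, 4, 8, 6, 0, 9),    # read depths
  Position     = rep(c(1000, 1050, 1200), times = 2),
  ID           = rep(c("S1", "S2"), each = 3),
  disease      = rep(c(0, 1), each = 3)   # hypothetical covariate (column 5)
)

# Check the first four column names and that counts never exceed depth.
stopifnot(identical(names(toy_dat)[1:4],
                    c("Meth_Counts", "Total_Counts", "Position", "ID")))
stopifnot(all(toy_dat$Meth_Counts <= toy_dat$Total_Counts))

# Drop rows with zero read depth or missing values before splitting.
toy_dat.f <- na.omit(toy_dat[toy_dat$Total_Counts != 0, ])
```

Each row is one CpG in one sample, so a CpG observed in two samples appears twice with the same `Position` and different `ID` values.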

Value

A named list of data frames, each containing the data for one independent region.
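
A named list of data frames can be inspected with base R alone. The sketch below uses a hand-built stand-in rather than real output: the element names `region_a` and `region_b` and all values are hypothetical, and the names actually assigned by the package may look different.

```r
# Hypothetical stand-in for the return value: a named list of data frames.
# Element names and values are invented; the package sets the real names.
results_demo <- list(
  region_a = data.frame(Position = c(100, 150, 220), ID = "S1"),
  region_b = data.frame(Position = c(5000, 5060), ID = "S1")
)

# Region labels.
print(names(results_demo))

# Number of rows in each region.
n_rows <- vapply(results_demo, nrow, integer(1))

# Genomic span (min and max position) covered by each region.
ranges <- lapply(results_demo, function(d) range(d$Position))

# Extract a single region as a data frame.
first_region <- results_demo[[1]]
```

The same calls should apply to any list with this shape, for example to check how many rows each region holds before fitting a model to it.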

Author(s)

Audrey Lemaçon

Examples

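#------------------------------------------------------------#
library(SOMNiBUS)
# Load the example data set
data(RAdat)
# Drop CpGs with zero read depth, then rows with missing values
RAdat.f <- na.omit(RAdat[RAdat$Total_Counts != 0, ])
# Split into regions using a gap of 1e6 and a minimum of 5 CpGs per region
results <- splitDataByRegion(dat = RAdat.f, gap = 1e6, min.cpgs = 5,
                             verbose = FALSE)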

kaiqiong/SOMNiBUS documentation built on Feb. 24, 2023, 5:38 a.m.