splitDataByGene | R Documentation |
This function splits the methylation data into regions
based on genes. The annotations come from the Bioconductor
package annotatr.
splitDataByGene(
  dat,
  chr,
  organism = "human",
  build = "hg38",
  types = "promoter",
  gap = -1,
  min.cpgs = 50,
  max.cpgs = 2000,
  verbose = TRUE
)
dat |
a data frame with rows as individual CpGs appearing
in all the samples. The first 4 columns should contain the information of
|
chr |
character vector containing the chromosome information. Its length
should be equal to the number of rows in |
organism |
character defining the organism of interest.
Only Homo sapiens ( |
build |
character defining the version of the genome build on which the
methylation data have been mapped. By default, the build is set to
|
types |
character vector defining the type of genic annotations to use among the following options:
|
gap |
integer defining the maximum gap allowed between two regions
for them to be considered overlapping.
According to the |
min.cpgs |
positive integer defining the minimum number of CpGs within a region for the algorithm to perform optimally. The default value is 50. |
max.cpgs |
positive integer defining the maximum number of CpGs within a region for the algorithm to perform optimally. The default value is 2000. |
verbose |
logical indicating whether the algorithm should report progress information. The default value is TRUE. |
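The length requirement on chr can be checked before the call. The following is a minimal sketch in base R, assuming a toy data frame whose column names are illustrative only (it is not a valid input for splitDataByGene and does not reproduce the required column layout); it shows one way to build one chromosome label per CpG and confirm that the lengths agree.

```r
# Toy stand-in for the methylation data. The columns here are
# hypothetical and only serve to give the data frame some rows.
dat <- data.frame(
  Position     = c(1001L, 1040L, 1102L, 1250L),
  Total_Counts = c(10L, 4L, 9L, 6L)
)

# One chromosome label per CpG (row), matching what `chr` expects.
chr <- rep(x = "chr1", times = nrow(dat))

# Pre-flight check: stop early if the lengths do not agree.
stopifnot(is.character(chr), length(chr) == nrow(dat))
```

When the data cover several chromosomes, the same check applies to whatever vector is passed as chr.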
A named list of data.frame objects, each containing the data of one
independent region.
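A returned list of this shape can be inspected with base R. The sketch below uses a hand-built, hypothetical list (the region names geneA and geneB and the Position column are assumptions, not output of splitDataByGene) to count the CpGs per region and compare the counts with the default min.cpgs and max.cpgs values. Whether the function itself enforces these bounds on its output is not stated here, so treat the comparison as a sanity check only.

```r
# Hypothetical result: a named list with one data.frame per region.
regions <- list(
  geneA = data.frame(Position = seq_len(60)),
  geneB = data.frame(Position = seq_len(30))
)

# Number of CpGs (rows) in each region, named by region.
n_cpgs <- vapply(regions, nrow, integer(1))

# Compare the region sizes with the default bounds from the usage.
min.cpgs <- 50
max.cpgs <- 2000
in_range <- n_cpgs >= min.cpgs & n_cpgs <= max.cpgs
```

Here in_range is a named logical vector, so in_range[["geneA"]] can be used to look up a single region.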
Audrey Lemaçon
#------------------------------------------------------------#
data(RAdat)
# Add a column containing the chromosome information
RAdat$Chr <- "chr4"
RAdat.f <- na.omit(RAdat[RAdat$Total_Counts != 0, ])
results <- splitDataByGene(dat = RAdat.f,
    chr = rep(x = "chr1", times = nrow(RAdat.f)), verbose = FALSE)