unite-methods: unite methylRawList to a single table

unite    R Documentation

unite methylRawList to a single table

Description

This function unites methylRawList and methylRawListDB objects so that only bases with coverage in all samples are retained. The resulting object is of class methylBase or methylBaseDB, depending on the input.

Usage

unite(
  object,
  destrand = FALSE,
  min.per.group = NULL,
  chunk.size = 1e+06,
  mc.cores = 1,
  save.db = FALSE,
  ...
)

## S4 method for signature 'methylRawList'
unite(
  object,
  destrand = FALSE,
  min.per.group = NULL,
  chunk.size = 1e+06,
  mc.cores = 1,
  save.db = FALSE,
  ...
)

## S4 method for signature 'methylRawListDB'
unite(
  object,
  destrand = FALSE,
  min.per.group = NULL,
  chunk.size = 1e+06,
  mc.cores = 1,
  save.db = TRUE,
  ...
)

Arguments

object

a methylRawList or methylRawListDB object to be merged by common locations covered by reads

destrand

If TRUE, reads covering both strands of a CpG dinucleotide will be merged. Do not set this to TRUE unless you are interested only in CpGs (default: FALSE). If the methylRawList object contains regions rather than bases, setting destrand to TRUE has no effect.

min.per.group

An integer denoting the minimum number of samples per group that must cover a region/base. By default, only regions/bases covered in all samples are united into the methylBase object. By supplying an integer for this argument, users can control how many samples per group must cover a region/base for it to be included. For example, if min.per.group is set to 2 and there are 3 replicates per condition, the bases/regions covered in at least 2 replicates of each condition will be united, and missing data for uncovered bases/regions will appear as NAs.

chunk.size

Number of rows to be taken as a chunk for processing the methylRawListDB objects, default: 1e6

mc.cores

Number of cores to use when processing methylRawListDB objects (default: 1; always 1 on Windows)

save.db

A logical value that determines whether the resulting object is saved as a flat file database. The default depends on the class of the input and is explained in the Details section.

...

Optional arguments used when save.db is TRUE:

suffix A character string to append to the name of the output flat file database; only used if save.db is TRUE. By default, a 13-character random string is appended to the fixed prefix “methylBase”, e.g. “methylBase_16d3047c1a254.txt.bgz”.

dbdir The directory where the flat file database(s) should be stored. Defaults to the working directory (getwd()) for newly stored databases and to the same directory for already existing databases.

dbtype The type of the flat file database; currently the only option is "tabix" (only used for newly stored databases)

Value

a methylBase or methylBaseDB object depending on input
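
As a rough illustration of the returned object, the sketch below unites the example data shipped with the package and inspects the result. It assumes methylKit is installed; the names `meth` and `united.table` are only illustrative.

```r
library(methylKit)
data(methylKit)  # loads the example object methylRawList.obj

# In-memory input (methylRawList) yields an in-memory methylBase object
meth <- unite(methylRawList.obj)
is(meth, "methylBase")

# getData() returns the united table as a plain data.frame:
# chr, start, end, strand, then coverage/numCs/numTs for each sample
united.table <- getData(meth)
head(united.table)
```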

Details

The parameter chunk.size is only used when working with methylRawDB or methylRawListDB objects, which are read in chunk by chunk so that large objects stored as flat file databases can be processed. By default, chunk.size is set to 1M rows, which should work for most systems. If you encounter memory problems, decrease chunk.size; if you have plenty of memory available, you can increase it.
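
The chunked processing described above might be exercised as in the following sketch. It assumes methylKit is installed and uses the example call files bundled with the package; the directory name `methylDB_chunks` and the chunk size of 1e5 are arbitrary choices for illustration, not recommendations.

```r
library(methylKit)

# Example methylation call files shipped with methylKit
file.list <- list(
  system.file("extdata", "test1.myCpG.txt", package = "methylKit"),
  system.file("extdata", "test2.myCpG.txt", package = "methylKit"),
  system.file("extdata", "control1.myCpG.txt", package = "methylKit"),
  system.file("extdata", "control2.myCpG.txt", package = "methylKit")
)

# Store the samples as tabix flat file databases in a temporary directory
db.dir <- file.path(tempdir(), "methylDB_chunks")
dir.create(db.dir, showWarnings = FALSE)
myobjDB <- methRead(file.list,
                    sample.id = list("test1", "test2", "ctrl1", "ctrl2"),
                    assembly = "hg18",
                    treatment = c(1, 1, 0, 0),
                    dbtype = "tabix",
                    dbdir = db.dir)

# Process 100,000 rows at a time instead of the default 1e6;
# save.db defaults to TRUE for methylRawListDB input
methDB <- unite(myobjDB, chunk.size = 1e5)
is(methDB, "methylBaseDB")
```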

The parameter save.db defaults to TRUE for methylDB objects such as methylRawListDB and to FALSE for methylRawList objects. Change its value if you wish to save the result of an in-memory calculation as a flat file database, or if the database is small enough to allow the calculation in memory.
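
A minimal sketch of overriding the save.db default for in-memory input follows. It assumes methylKit is installed; the suffix "united" and the temporary directory are hypothetical choices made only for this example.

```r
library(methylKit)
data(methylKit)  # loads the in-memory example object methylRawList.obj

# Hypothetical output location for the flat file database
out.dir <- file.path(tempdir(), "methylDB_united")
dir.create(out.dir, showWarnings = FALSE)

# save.db = TRUE writes the united table to a flat file database;
# suffix and dbdir are passed through '...'
methDB <- unite(methylRawList.obj,
                save.db = TRUE,
                suffix = "united",
                dbdir = out.dir)

# The result is a methylBaseDB object backed by a file on disk
is(methDB, "methylBaseDB")
getDBPath(methDB)
```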

Examples


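 library(methylKit)
 data(methylKit)
 ## Unite samples, keeping only bases covered in all samples
 my.methylBase=unite(methylRawList.obj) 
 ## Same, but merge reads from both strands of each CpG
 my.methylBase=unite(methylRawList.obj,destrand=TRUE)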
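
The min.per.group argument can be explored in the same way. The sketch below assumes methylKit is installed and passes the value as an integer literal (1L), since the argument is documented as an integer; the names `strict` and `relaxed` are illustrative.

```r
library(methylKit)
data(methylKit)

# Default: a base must be covered in every sample
strict <- unite(methylRawList.obj)

# Relaxed: a base needs coverage in at least one sample per group
relaxed <- unite(methylRawList.obj, min.per.group = 1L)

# The relaxed table retains at least as many bases as the strict one
nrow(getData(strict))
nrow(getData(relaxed))

# Bases not covered in every sample carry NAs in the relaxed table
sum(!complete.cases(getData(relaxed)))
```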
 
 

al2na/methylKit documentation built on Jan. 12, 2025, 7:56 a.m.