pool-methods: Pool replicates within groups to a single sample per group

pool    R Documentation

Pool replicates within groups to a single sample per group

Description

The function sums the coverage, numCs and numTs values within each group, so that each group is represented by a single pooled sample in a new methylBase or methylBaseDB object.
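The summation described above can be sketched in base R on a toy table. This is only an illustration of the arithmetic, not methylKit's implementation: the data values, the treatment vector and the helper sum_group() are invented for this sketch, and the column naming (coverage1, numCs1, numTs1, ...) is assumed to mirror the layout of a methylBase object.

```r
# Toy stand-in for the data of a methylBase object with four samples.
# All values are invented for illustration.
toy <- data.frame(
  chr = c("chr1", "chr1"),
  start = c(100L, 200L),
  end = c(100L, 200L),
  strand = c("+", "+"),
  coverage1 = c(10L, 20L), numCs1 = c(4L, 15L), numTs1 = c(6L, 5L),
  coverage2 = c(12L, 18L), numCs2 = c(6L, 9L),  numTs2 = c(6L, 9L),
  coverage3 = c(8L, 30L),  numCs3 = c(1L, 3L),  numTs3 = c(7L, 27L),
  coverage4 = c(14L, 25L), numCs4 = c(2L, 5L),  numTs4 = c(12L, 20L)
)
treatment <- c(1, 1, 0, 0)  # samples 1-2 form one group, samples 3-4 the other

# Hypothetical helper: sum one kind of column over the samples of one group.
sum_group <- function(df, prefix, samples) {
  rowSums(df[, paste0(prefix, samples), drop = FALSE])
}

groups <- unique(treatment)  # groups in order of first appearance: 1, then 0
pooled <- toy[, c("chr", "start", "end", "strand")]
for (i in seq_along(groups)) {
  samples <- which(treatment == groups[i])
  for (prefix in c("coverage", "numCs", "numTs")) {
    pooled[[paste0(prefix, i)]] <- sum_group(toy, prefix, samples)
  }
}
print(pooled)
```

After pooling, each group contributes exactly one coverage/numCs/numTs triplet per row, and coverage still equals numCs + numTs because all three columns are summed over the same samples.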

Usage

pool(obj, sample.ids, chunk.size = 1e+06, save.db = FALSE, ...)

## S4 method for signature 'methylBase'
pool(obj, sample.ids, chunk.size = 1e+06, save.db = FALSE, ...)

## S4 method for signature 'methylBaseDB'
pool(obj, sample.ids, chunk.size = 1e+06, save.db = TRUE, ...)

Arguments

obj

A methylBase or methylBaseDB object with two or more groups, each of which should contain multiple samples

sample.ids

A character vector of new sample IDs, e.g. c("test","control"). It should follow the same order as the unique treatment vector and have the same length as the unique treatment vector

chunk.size

Number of rows to be taken as a chunk when processing methylBaseDB objects, default: 1e6

save.db

A logical to decide whether the resulting object should be saved as a flat file database or not; the default depends on the class of obj, as explained in the Details section

...

Optional arguments used when save.db is TRUE:

suffix: A character string to append to the name of the output flat file database, only used if save.db is TRUE. By default, the suffix is a 13-character random string appended to the fixed prefix “methylBase”, e.g. “methylBase_16d3047c1a254.txt.bgz”.

dbdir: The directory where the flat file database(s) should be stored. Defaults to getwd(), the working directory, for newly stored databases and to the same directory for already existing databases.

dbtype: The type of the flat file database; currently the only option is "tabix" (only used for newly stored databases).
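The ordering requirement on sample.ids can be checked before calling pool. The base R sketch below assumes that "unique treatment vector" means unique() applied to the per-sample treatment vector, so groups are taken in order of first appearance; the treatment labels and the original sample IDs are made up for illustration.

```r
# Assumed per-sample group labels and original sample IDs.
treatment <- c(1, 1, 0, 0)
old.ids   <- c("test1", "test2", "ctrl1", "ctrl2")

groups  <- unique(treatment)      # order of first appearance: 1, then 0
new.ids <- c("test", "control")   # one new ID per group, in that same order

# The number of new IDs must match the number of groups.
stopifnot(length(new.ids) == length(groups))

# Show which original samples each new ID would stand for.
mapping <- split(old.ids, factor(treatment, levels = groups))
names(mapping) <- new.ids
print(mapping)
```

Listing the IDs in the opposite order, c("control","test"), would pass the length check but attach the wrong label to each pooled group, so inspecting the mapping first is a cheap safeguard.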

Value

A methylBase or methylBaseDB object, depending on the class of the input object

Details

The parameter chunk.size is only used when working with methylBaseDB objects, as they are read in chunk by chunk to enable processing of large objects that are stored as flat file databases. By default, chunk.size is set to 1M rows, which should work for most systems. If you encounter memory problems, decrease chunk.size; if you have a large amount of memory available, you can increase it.
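Chunked processing works for pooling because each row is summed independently of all other rows. The base R sketch below illustrates that idea only; it does not read a flat file database, and the matrix values and the helper pool_rows() are invented for this example.

```r
# Toy counts for two replicates of one group, ten rows.
numCs <- matrix(1:20, ncol = 2)
n <- nrow(numCs)

# Hypothetical helper: pool replicates by summing across columns.
pool_rows <- function(m) rowSums(m)

# One pass over all rows.
full <- pool_rows(numCs)

# The same computation, chunk.size rows at a time.
chunk.size <- 4
starts <- seq(1, n, by = chunk.size)
chunked <- unlist(lapply(starts, function(s) {
  e <- min(s + chunk.size - 1, n)          # last row of this chunk
  pool_rows(numCs[s:e, , drop = FALSE])
}))

# Both approaches give the same pooled values.
stopifnot(identical(full, chunked))
```

Only chunk.size rows need to be held in memory at once, which is why a smaller value trades speed for a lower memory footprint.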

The parameter save.db defaults to TRUE for methylDB objects such as methylBaseDB, and to FALSE for methylBase. If you wish to save the result of an in-memory calculation as a flat file database, or if the database is small enough to allow the calculation in memory, you may want to change the value of this parameter.
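As a hedged illustration of switching save.db, the sketch below pools the packaged example object twice, once in memory and once with the result written to a flat file database. It uses only arguments documented on this page; tempdir() and the suffix "pooled_demo" are arbitrary choices made for this sketch, and the class of the second result is printed rather than asserted.

```r
library(methylKit)
data(methylKit)

# In-memory input, result kept in memory (save.db defaults to FALSE here).
pooled.mem <- pool(methylBase.obj, sample.ids = c("test", "control"))

# Same input, but the result is written as a tabix flat file database.
# dbdir and suffix are passed through "..."; both values are arbitrary.
pooled.db <- pool(methylBase.obj, sample.ids = c("test", "control"),
                  save.db = TRUE, dbdir = tempdir(), suffix = "pooled_demo")

# Inspect what each call returned.
print(class(pooled.mem))
print(class(pooled.db))
```

Writing to tempdir() keeps the example from leaving files in the working directory; for real analyses a persistent dbdir is usually more appropriate.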

Author(s)

Altuna Akalin

Examples


library(methylKit)
data(methylKit)

# methylBase.obj has two groups, and each group has two samples.
# pool() sums the samples within each group so that
# each group is represented by one pooled sample
pooled.methylBase <- pool(methylBase.obj, sample.ids = c("test", "control"))


al2na/methylKit documentation built on Nov. 30, 2024, 5:44 p.m.