readCounts2GRangesList: Read files of methylation count tables

readCounts2GRangesList R Documentation

Read files of methylation count tables

Description

This function reads files containing methylation count tables, such as those commonly generated after the alignment of BS-seq data or found in the GEO database.

Usage

readCounts2GRangesList(
  filenames = NULL,
  sample.id = NULL,
  pattern = NULL,
  remove = FALSE,
  columns = c(seqnames = NULL, start = NULL, end = NULL, strand = NULL, fraction = NULL,
    percent = NULL, mC = NULL, uC = NULL, coverage = NULL, context = NULL, signal = NULL),
  chromosome.names = NULL,
  chromosomes = NULL,
  verbose = TRUE,
  ...
)

Arguments

filenames

Character vector with the file names

sample.id

Character vector with the names of the samples corresponding to each file

pattern

Chromosome name pattern. Users working on Linux can supply a regular expression so that only the matching lines of each file are read.

remove

Logical (default: FALSE). Supplementary files from GEO datasets are usually 'gz' compressed and must be decompressed before they can be read. If set to TRUE, the decompressed files are removed after they have been read.

columns

Named vector of integers denoting the table columns that must be read. The numbers for the 'seqnames' (chromosomes), 'start', and 'end' (if different from 'start') columns must be given. The possible fields are: 'seqnames' (chromosomes), 'start', 'end', 'strand', 'fraction', 'percent' (methylation percentage), 'mC' (methylated cytosine), 'uC' (non-methylated cytosine), 'coverage', and 'context' (methylation context). These column headers are not required to be present in the files. An optional column named 'signal' can be used to include relevant information about the methylation signal.

chromosome.names

If provided, the chromosome names of each GRanges object will be changed to those given in 'chromosome.names' by applying 'seqlevels(x) <- chromosome.names'. This option gives access to all the functionality of the function 'seqlevels' from package 'GenomeInfoDb', which can rename, add, and reorder the seqlevels all at once (see ?seqlevels).

chromosomes

If provided, it must be a character vector with the names of the chromosomes that you want to include in the final GRanges objects.

verbose

If TRUE, prints the function log to stdout

...

Additional parameters for 'fread' function from 'data.table' package
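
As an illustration of the 'columns' argument described above, the following is a minimal sketch of how a named column vector could be checked against the documented field names before calling the function. The helper 'check_columns' and the vector 'valid.fields' are hypothetical names introduced here for illustration; they are not part of MethylIT, and the package may validate its input differently.

```r
## Field names taken from the description of the 'columns' argument
valid.fields <- c("seqnames", "start", "end", "strand", "fraction",
                  "percent", "mC", "uC", "coverage", "context", "signal")

## Hypothetical helper (not part of MethylIT): stops on unknown or
## missing field names, otherwise returns TRUE invisibly
check_columns <- function(columns) {
    unknown <- setdiff(names(columns), valid.fields)
    if (length(unknown) > 0)
        stop("Unknown field name(s): ", paste(unknown, collapse = ", "))
    if (!all(c("seqnames", "start") %in% names(columns)))
        stop("'seqnames' and 'start' must be given")
    invisible(TRUE)
}

## A valid specification and one with a misspelled field name
good <- c(seqnames = 1, start = 2, strand = 3, fraction = 4,
          context = 5, coverage = 6)
bad <- c(seqnames = 1, start = 2, strand = 3, fractions = 4,
         context = 5, coverage = 6)

ok <- check_columns(good)
res <- try(check_columns(bad), silent = TRUE)
```

Here 'res' holds a "try-error" object because 'fractions' is not one of the documented field names.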

Details

Reads methylation count tables from files using the function 'fread' from the package 'data.table' and yields a list of GRanges objects with the information provided.
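
The workflow described above can be sketched roughly as follows. This is not the package's implementation, only an illustration under stated assumptions: the input file, its column layout, and all object names are made up for this sketch, and it assumes that the packages 'data.table' and 'GenomicRanges' are installed.

```r
library(data.table)
library(GenomicRanges)

## Hypothetical tab-separated input file, written only for this sketch
tmp <- tempfile(fileext = ".cov")
tab <- data.frame(chr = c("chr1", "chr1", "chr2"), pos = c(1, 2, 5),
                  strand = c("+", "-", "+"), ratio = c(0.9, 0.5, 0.1))
write.table(tab, file = tmp, col.names = TRUE, row.names = FALSE,
            quote = FALSE, sep = "\t")

## Column positions labelled with the field names they stand for
columns <- c(seqnames = 1, start = 2, strand = 3, fraction = 4)

## Read the table, keep the requested columns, and rename them
df <- as.data.frame(fread(tmp))
df <- df[, columns]
names(df) <- names(columns)

## Single-base positions: 'end' is taken equal to 'start'
df$end <- df$start
gr <- makeGRangesFromDataFrame(df, keep.extra.columns = TRUE)

## Rename the chromosomes (compare the 'chromosome.names' argument)
seqlevels(gr) <- sub("chr", "", seqlevels(gr))

## Keep only selected chromosomes (compare the 'chromosomes' argument)
gr.sub <- gr[as.character(seqnames(gr)) %in% "1"]

## One GRanges object per sample, collected in a named list
LR.sketch <- list(test = gr.sub)
unlink(tmp)
```

The extra column 'fraction' is carried along as a metadata column of the GRanges object.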

Value

A list of GRanges objects

Examples

## Create a small '.cov' file (plain text, not 'gz' compressed)
filename <- './file.cov'
gr1 <- data.frame(chr = c('chr1', 'chr1'), post = c(1,2),
                strand = c('+', '-'), ratio = c(0.9, 0.5),
                context = c('CG', 'CG'), CT = c(20, 30))
write.table(gr1, file = filename,
            col.names = TRUE, row.names = FALSE, quote = FALSE)

## Read the file. The call fails because of a typing mistake in
## 'columns': 'fractions' instead of 'fraction'
LR <- try(readCounts2GRangesList(filenames = filename, remove = FALSE,
                                 sample.id = 'test',
                                 columns = c(seqnames = 1, start = 2,
                                             strand = 3, fractions = 4,
                                             context = 5, coverage = 6)),
          silent = TRUE)
file.remove(filename) # Remove the file
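
For comparison, the same call with the field name spelled 'fraction', as listed under the 'columns' argument, might look like the sketch below. It assumes that MethylIT is installed and that the corrected field name is all the call needs; the result is not shown because it has not been verified here. The temporary file and the object names are specific to this sketch.

```r
## Re-create the example file in a temporary location
filename <- tempfile(fileext = ".cov")
tab <- data.frame(chr = c("chr1", "chr1"), post = c(1, 2),
                  strand = c("+", "-"), ratio = c(0.9, 0.5),
                  context = c("CG", "CG"), CT = c(20, 30))
write.table(tab, file = filename,
            col.names = TRUE, row.names = FALSE, quote = FALSE)

## Run the call only if MethylIT is available (assumption: the
## corrected field name 'fraction' is accepted by the function)
if (requireNamespace("MethylIT", quietly = TRUE)) {
    LR <- try(MethylIT::readCounts2GRangesList(
                  filenames = filename, remove = FALSE,
                  sample.id = "test",
                  columns = c(seqnames = 1, start = 2,
                              strand = 3, fraction = 4,
                              context = 5, coverage = 6)),
              silent = TRUE)
    if (!inherits(LR, "try-error")) print(LR)
}

removed <- file.remove(filename) # Remove the file
```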


genomaths/MethylIT documentation built on Feb. 3, 2024, 1:24 a.m.