preprocess_bs_seq: Pre-process BS-Seq data in any given format


Description

preprocess_bs_seq is a general function for reading and preprocessing BS-Seq data. If a vector of files is given, the files are treated as replicates and pooled together. Finally, CpG sites whose read coverage falls below min_bs_cov or above max_bs_cov are discarded as noise.
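To make the idea of pooling replicates concrete, here is a minimal base R sketch. It assumes that pooling means summing read counts at sites that share the same chromosome and position; the package's own pooling routine may differ. The data frames and the column names chr, pos and meth_reads are hypothetical and chosen only for illustration.

```r
# Hypothetical toy replicates, one row per CpG site (not package data).
rep1 <- data.frame(chr = c("chr1", "chr1"), pos = c(100L, 200L),
                   total_reads = c(10L, 4L), meth_reads = c(6L, 1L))
rep2 <- data.frame(chr = c("chr1", "chr2"), pos = c(100L, 50L),
                   total_reads = c(5L, 8L), meth_reads = c(2L, 8L))

# Stack the replicates, then sum the read counts of rows that share
# the same chromosome and position (assumed pooling rule).
stacked <- rbind(rep1, rep2)
pooled <- aggregate(cbind(total_reads, meth_reads) ~ chr + pos,
                    data = stacked, FUN = sum)
```

In this sketch, a site seen in only one replicate keeps its own counts, while a site seen in both gets the summed counts.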

Usage

preprocess_bs_seq(files, file_format = "encode_rrbs", chr_discarded = NULL,
  min_bs_cov = 2, max_bs_cov = 1000)

Arguments

files

A vector of filenames containing replicate experiments. This can also be just a single replicate.

file_format

A string denoting the file format in which the BS-Seq data are stored. The current version supports the "encode_rrbs" and "bismark_cov" formats.

chr_discarded

A vector with chromosome names to be discarded.

min_bs_cov

The minimum number of reads mapping to each CpG site. CpGs with fewer reads will be considered as noise and will be discarded.

max_bs_cov

The maximum number of reads mapping to each CpG site. CpGs with more reads will be considered as noise and will be discarded.
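The two coverage thresholds can be illustrated with a short base R sketch. This is not the package's implementation: the helper filter_by_coverage and the toy data frame are hypothetical, and the sketch assumes that sites with exactly min_bs_cov or max_bs_cov reads are kept, as the wording "fewer" and "more" suggests.

```r
# Hypothetical helper: keep only sites whose coverage lies within
# [min_bs_cov, max_bs_cov]; everything else is treated as noise.
filter_by_coverage <- function(sites, min_bs_cov = 2, max_bs_cov = 1000) {
  keep <- sites$total_reads >= min_bs_cov & sites$total_reads <= max_bs_cov
  sites[keep, , drop = FALSE]
}

# Toy data (not package data): one site below the minimum,
# one within range, and one above the maximum.
sites <- data.frame(chr = "chr1",
                    pos = c(10L, 20L, 30L),
                    total_reads = c(1L, 50L, 5000L))

filtered <- filter_by_coverage(sites)
```

With the default thresholds, only the site at position 20 remains in this toy example.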

Value

A GRanges object. The GRanges object contains two additional metadata columns:

These columns can be accessed as follows: granges_object$total_reads

Additional Info

Information about the file formats can be found at the following links:

Encode RRBS format: http://rohsdb.cmb.usc.edu/GBshape/cgi-bin/hgTables?db=hg19&hgta_group=regulation&hgta_track=wgEncodeHaibMethylRrbs&hgta_table=wgEncodeHaibMethylRrbsBcbreast0203015BiochainSitesRep2&hgta_doSchema=describe+table+schema

Bismark Cov format: http://rnbeads.mpi-inf.mpg.de/data/RnBeads.pdf

Author(s)

C.A.Kapourani C.A.Kapourani@ed.ac.uk

See Also

read_bs_bismark_cov, read_bs_encode_haib, pool_bs_seq_rep

Examples

# Pool two replicates in the default "encode_rrbs" format
bs_file1 <- system.file("extdata", "rrbs.bed", package = "processHTS")
bs_file2 <- system.file("extdata", "rrbs.bed", package = "processHTS")
bs_files <- c(bs_file1, bs_file2)
pool_data <- preprocess_bs_seq(files=bs_files)

# Pool two replicates in the "bismark_cov" format
bs_file1 <- system.file("extdata", "bism_rep1.bed", package = "processHTS")
bs_file2 <- system.file("extdata", "bism_rep2.bed", package = "processHTS")
bs_files <- c(bs_file1, bs_file2)
pool_data <- preprocess_bs_seq(files=bs_files, file_format="bismark_cov")

andreaskapou/processHTS documentation built on May 12, 2019, 3:33 a.m.