processBismarkAln-methods: Get methylation percentage from sorted Bismark alignments

Description Usage Arguments Value Note Examples

Description

The function calls methylation percentage per base from sorted Bismark SAM or BAM files and reads methylation information as methylKit objects. Bismark is a popular aligner for high-throughput bisulfite sequencing experiments and it outputs its results in SAM format by default, and can be converted to BAM. Bismark SAM/BAM format contains aligner specific tags which are absolutely necessary for methylation percentage calling using processBismarkAln. SAM/BAM files from other aligners will not work with this function.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
processBismarkAln(
  location,
  sample.id,
  assembly,
  save.folder = NULL,
  save.context = c("CpG"),
  read.context = "CpG",
  nolap = FALSE,
  mincov = 10,
  minqual = 20,
  phred64 = FALSE,
  treatment = NULL,
  save.db = FALSE,
  verbose = 1
)

## S4 method for signature 'character,character,character'
processBismarkAln(
  location,
  sample.id,
  assembly,
  save.folder = NULL,
  save.context = c("CpG"),
  read.context = "CpG",
  nolap = FALSE,
  mincov = 10,
  minqual = 20,
  phred64 = FALSE,
  treatment = NULL,
  save.db = FALSE,
  verbose = 1
)

## S4 method for signature 'list,list,character'
processBismarkAln(
  location,
  sample.id,
  assembly,
  save.folder = NULL,
  save.context = c("CpG"),
  read.context = "CpG",
  nolap = FALSE,
  mincov = 10,
  minqual = 20,
  phred64 = FALSE,
  treatment = NULL,
  save.db = FALSE,
  verbose = 1
)

Arguments

location

location of sam or bam file(s). If multiple files are given this argument must be a list.

sample.id

the id(s) of samples in the same order as file. If multiple sam files are given this arugment must be a list.

assembly

string that determines the genome assembly. Ex: mm9,hg18 etc. This is just a string for book keeping. It can be any string. Although, when using multiple files from the same assembly, this string should be consistent in each object.

save.folder

The folder which will be used to save methylation call files, if set to NULL no methylation call file will be saved as a text file. The files saved can be read into R in less time using methRead function in methylKit

save.context

A character vector consisting following strings: "CpG","CHG","CHH". The methylation percentages for these methylation contexts will be saved to save.folder

read.context

One of the 'CpG','CHG','CHH' or 'none' strings. Determines what type of methylation context will be read-in to the memory which can be immediately used for analysis. If given as 'none', processBismarkAln will not return any object, but if a save.folder argument given it will save the methylation percentage call files.

nolap

if set to TRUE and the SAM/BAM file has paired-end reads, the one read of the overlapping paired-end read pair will be ignored for methylation calling.

mincov

minimum read coverage to call a methylation status for a base.

minqual

minimum phred quality score to call a methylation status for a base.

phred64

logical (default: FALSE) you will not need to set this TRUE, Currently bismark gives only phred33 scale

treatment

treatment vector only to be used when location and sample.id parameters are lists and you are trying to read-in multiple samples that are related to eachother in down-stream analysis.

save.db

A Logical to decide whether the resulting object should be saved as flat file database or not ( default: FALSE). With the default value, a text file containing methylation values will be saved. If TRUE, database will either be saved to location save.folder or if this is NULL, to a new folder in the current working directory named after this scheme: "methylDB <Date> <3randomlettersornumbers>"

verbose

logical set verbosity of methCall (default: TRUE)

Value

methylRaw or methylRawList object

Note

SAM files should be sorted with samtools sort or unix sort. Other sorting methods can alter the order of fields(columns) in the SAM file and that will result in an error when using processBismarkAln().

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
# reading one bismark file:
my.file=system.file("extdata", "test.fastq_bismark.sorted.min.sam", 
                              package = "methylKit")
obj=processBismarkAln(my.file,"test",assembly="hg18",save.folder=NULL,
                 save.context="CpG",read.context="CpG")
 
# reading multiple files
file.list2=list(system.file("extdata", "test.fastq_bismark.sorted.min.sam", 
package = "methylKit"),
               system.file("extdata", "test.fastq_bismark.sorted.min.sam", 
               package = "methylKit"),
              system.file("extdata", "test.fastq_bismark.sorted.min.sam", 
              package = "methylKit"),
               system.file("extdata", "test.fastq_bismark.sorted.min.sam",
                package = "methylKit"))

 objs=processBismarkAln(location=file.list2
             ,sample.id=list("test1","test2","ctrl1","ctrl1"),assembly="hg18",
             save.folder=NULL,save.context=NULL,read.context="CpG",
             nolap=FALSE,mincov=10,minqual=20,phred64=FALSE,
             treatment=c(1,1,0,0))

methylKit documentation built on Jan. 30, 2021, 2 a.m.