read-methods: read file(s) to a methylRawList or methylRaw object


Description

The function reads a single file or a list of files containing methylation information for bases or regions in the genome and creates a methylRaw or methylRawList object.

Usage

  read(location,sample.id,assembly,pipeline="amp",header=T,skip=0,sep="",
    context="CpG",resolution="base",treatment)

Arguments

location

file location(s), either a list of locations (each a character string) or one location string

sample.id

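sample identifier(s): a list of character strings when location is a list of files, or a single character string when location is one file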

assembly

a string that defines the genome assembly, such as "hg18" or "mm9"

header

whether the input file has a header line (default: TRUE)

skip

number of lines to skip when reading. Can be set to 1 for BED files that start with a track line (default: 0)

sep

separator between fields, same as the read.table argument (default: "")

pipeline

name of the alignment pipeline, either "amp" or "bismark". Methylation text files generated by other pipelines can be read as generic methylation text files by supplying a named list as the pipeline argument. The named list should contain the column numbers that indicate which columns of the text file hold the methylation values and the genomic location of each methylation event. See Details for more.

resolution

designates whether the methylation information is at base-pair resolution or regional resolution. Allowed values are "base" and "region" (default: "base")

treatment

a vector containing 0 and 1, denoting which samples are controls (0) and which are tests (1)

context

methylation context string, e.g. CpG, CpH, CHH (default: "CpG")
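
The header, skip and sep arguments can be tried out on a file with base R before handing it to read(). The sketch below is only an illustration: the file content is made up, and it assumes that read() treats skip and header the same way read.table does (the documentation above only states this explicitly for sep).

```r
# Hypothetical BED-like file with a leading track line (made-up data)
bed.file <- tempfile(fileext = ".bed")
writeLines(c(
  'track name="example" description="made-up methylation calls"',
  "chr1\t100\t101\t25",
  "chr1\t200\t201\t75"
), bed.file)

# skip=1 drops the track line, header=FALSE because no column names follow it,
# sep="" (the default) splits fields on any whitespace
bed.df <- read.table(bed.file, header = FALSE, skip = 1, sep = "",
                     stringsAsFactors = FALSE)
print(bed.df)
```

If the data frame printed here has the expected number of rows and columns, the same header, skip and sep values are a reasonable starting point for the read() call.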

Value

returns a methylRaw object when a single file is read, or a methylRawList object when a list of files is read

Details

When the pipeline argument is a list, it is expected to be a named list with the following names:

fraction

a logical value denoting whether the column holding the frequency of Cs ranges over [0-1] or [0-100]. If TRUE, the range is assumed to be [0-1].

chr.col

the number of the column that holds the chromosome string

start.col

the number of the column that holds the start coordinate of the base/region of the methylation event

end.col

the number of the column that holds the end coordinate of the base/region of the methylation event

coverage.col

the number of the column that holds the read coverage values

strand.col

the number of the column that holds the strand information. The strand in the file has to be given as '+' or '-'.

freqC.col

the number of the column that holds the frequency of Cs

See Examples for how to read a generic methylation text file.
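
As a rough sketch of how such a named list might be assembled, the base R code below writes a small made-up table, looks up the column numbers by name, and checks the strand and frequency columns against the rules described here. The data, the file, and the names generic.df, my.file, col.spec and check.df are hypothetical; only the list element names come from this page, and the final read() call is left commented out because it requires methylKit.

```r
# Made-up generic methylation calls; freqC is given in percent, so fraction=FALSE
generic.df <- data.frame(
  chr      = c("chr21", "chr21", "chr21"),
  base     = c(9764513L, 9764539L, 9820622L),
  strand   = c("+", "-", "+"),
  coverage = c(14L, 20L, 13L),
  freqC    = c(100, 50, 0),
  stringsAsFactors = FALSE
)
my.file <- tempfile(fileext = ".txt")
write.table(generic.df, my.file, sep = "\t", quote = FALSE, row.names = FALSE)

# Look up column numbers by name instead of counting columns by hand;
# start and end point at the same column because each row is a single base
col.spec <- list(
  fraction     = FALSE,
  chr.col      = match("chr", names(generic.df)),
  start.col    = match("base", names(generic.df)),
  end.col      = match("base", names(generic.df)),
  coverage.col = match("coverage", names(generic.df)),
  strand.col   = match("strand", names(generic.df)),
  freqC.col    = match("freqC", names(generic.df))
)

# Sanity checks: strand must be '+' or '-', and the frequency range
# must agree with the fraction flag
check.df <- read.table(my.file, header = TRUE, stringsAsFactors = FALSE)
stopifnot(all(check.df[[col.spec$strand.col]] %in% c("+", "-")))
freq.max <- if (col.spec$fraction) 1 else 100
freq <- check.df[[col.spec$freqC.col]]
stopifnot(all(freq >= 0 & freq <= freq.max))

# With methylKit installed, the list would then be passed as the pipeline argument:
# myobj <- read(my.file, pipeline = col.spec, sample.id = "test1", assembly = "hg18")
```

Checking the frequency range up front helps catch a fraction flag that does not match the file, since values in [0-1] and [0-100] are otherwise easy to confuse.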

Examples

# this is a list of example files that ship with the package
# for your own analysis, you only need to provide the paths to your files
# you will not need the "system.file(..." part
file.list=list( system.file("extdata", "test1.myCpG.txt", package = "methylKit"),
                system.file("extdata", "test2.myCpG.txt", package = "methylKit"),
                system.file("extdata", "control1.myCpG.txt", package = "methylKit"),
                system.file("extdata", "control2.myCpG.txt", package = "methylKit") )

# read the files to a methylRawList object: myobj
myobj=read( file.list,
            sample.id=list("test1","test2","ctrl1","ctrl2"),assembly="hg18",treatment=c(1,1,0,0))

# read one file as methylRaw object
myobj=read( file.list[[1]],
            sample.id="test1",assembly="hg18")

# read a generic text file containing CpG methylation values
# let's first look at the content of the file
generic.file=system.file("extdata", "generic1.CpG.txt", package = "methylKit")
read.table(generic.file,header=TRUE)

# And this is how you can read that generic file as a methylKit object
myobj=read( generic.file,
            pipeline=list(fraction=FALSE, chr.col=1, start.col=2, end.col=2,
                          coverage.col=4, strand.col=3, freqC.col=5),
            sample.id="test1",assembly="hg18")

fortunatobianconi/methylkit documentation built on May 16, 2019, 1:51 p.m.