createBinData: Create a BinData object by merging lists of ChIP and input...


Description

This function creates a BinData object by merging ChIP and input bin-level counts with external M/GC/N text files.

Usage

createBinData(dat.chip, dat.input, mfile, gcfile, nfile, m.suffix = NULL,
  gc.suffix = NULL, n.suffix = NULL, chrlist = NULL,
  dataType = "unique")

Arguments

dat.chip

Either a list of the ChIP bin-level data for each chromosome, or a character string giving the name of a file that contains the ChIP bin-level data. If a file name is provided, the file must contain at least two columns, with the chromosome information in the first column and the bin-level counts in the last column.

dat.input

Either a list of the input bin-level data for each chromosome, or a character string giving the name of a file that contains the input bin-level counts. The structure is the same as for "dat.chip".

mfile

A character value. If "m.suffix=NULL", this is the file name of the genome-wide M file. Otherwise, this is the common prefix (including relative path) for all chromosome-level M files.

gcfile

A character value. If "gc.suffix=NULL", this is the file name of the genome-wide GC file. Otherwise, this is the common prefix (including relative path) for all chromosome-level GC files.

nfile

A character value. If "n.suffix=NULL", this is the file name of the genome-wide N file. Otherwise, this is the common prefix (including relative path) for all chromosome-level N files.

m.suffix

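A character value. If not NULL, this is the suffix of the chromosome-level M files. Each chromosome-level file must be named "chrX_m.suffix", appended to the prefix given in "mfile". In the Examples, mfile = "map_" and m.suffix = ".txt" correspond to files such as "map_chr1.txt".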

gc.suffix

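A character value. If not NULL, this is the suffix of the chromosome-level GC files. Each chromosome-level file must be named "chrX_gc.suffix", appended to the prefix given in "gcfile". In the Examples, gcfile = "gc_" and gc.suffix = ".txt" correspond to files such as "gc_chr1.txt".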

n.suffix

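A character value. If not NULL, this is the suffix of the chromosome-level N files. Each chromosome-level file must be named "chrX_n.suffix", appended to the prefix given in "nfile". In the Examples, nfile = "n_" and n.suffix = ".txt" correspond to files such as "n_chr1.txt".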

chrlist

A list of the chromosomes to import. If NULL, all chromosomes given by "names(dat.chip)" are imported.

dataType

A character value of either "unique" or "multi".
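
The naming of chromosome-level files described for the suffix arguments can be sketched as below. This is only an illustration based on the file names used in the Examples (prefix, then chromosome name, then suffix); "chrom_file" is a hypothetical helper that is not part of the package, and "dat_chip" here is a toy stand-in whose elements need not match the real bin-level structure.

```r
# Hypothetical helper (not part of CSSP): build a chromosome-level file name
# as prefix + chromosome name + suffix, mirroring the Examples.
chrom_file <- function(prefix, chr, suffix) paste0(prefix, chr, suffix)

# Toy stand-in for a per-chromosome list; only the names matter here.
dat_chip <- list(chr1 = c(3, 0, 5), chr2 = c(1, 2, 0))

# With chrlist = NULL, the chromosomes are taken from the list names.
chrs <- names(dat_chip)

m_files <- chrom_file("map_", chrs, ".txt")
print(m_files)
# [1] "map_chr1.txt" "map_chr2.txt"
```

Checking that each of these files exists (for example with file.exists) before calling createBinData may help catch naming mismatches early.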

Value

A BinData-class object.

Note

When the relevant suffix argument (m.suffix, gc.suffix or n.suffix) is NULL, the corresponding genome-wide file must have three columns: the chromosome names in the first column, the genome coordinates in the second, and the corresponding scores in the third. In contrast, when the suffix argument is not NULL, each chromosome-level M/GC/N file should contain only two columns: the genome coordinates in the first column and the scores in the second.
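
The two layouts can be illustrated with a minimal sketch. The scores, the column names, and the choice to write into a temporary directory are assumptions made for this illustration only; the file names follow those used in the Examples.

```r
# Toy scores for two chromosomes (values are made up for illustration).
pos <- (0:3) * 100
genome_wide <- data.frame(
  chr   = rep(c("chr1", "chr2"), each = length(pos)),
  pos   = rep(pos, times = 2),
  score = c(0.95, 0.97, 0.91, 0.99, 0.92, 0.96, 0.98, 0.93)
)

tmp_dir <- tempdir()

# Genome-wide layout: chromosome, coordinate, score (three columns).
write.table(genome_wide, file = file.path(tmp_dir, "allmap.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)

# Chromosome-level layout: coordinate, score (two columns), one file each.
per_chr <- split(genome_wide[, c("pos", "score")], genome_wide$chr)
for (chr in names(per_chr)) {
  write.table(per_chr[[chr]],
              file = file.path(tmp_dir, paste0("map_", chr, ".txt")),
              sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
}
```

Both sets of files carry the same information; the genome-wide file simply adds the chromosome name as a leading column.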

Author(s)

Chandler Zuo zuo@stat.wisc.edu

Examples

data(tagdat_chip)
data(tagdat_input)
dat_chip <- tag2bin(tagdat_chip,binS=100,fragL=100)
dat_input <- tag2bin(tagdat_input,binS=100,fragL=100)

# Simulate M/GC/N scores for five chromosomes with random numbers of bins
numBins <- as.integer(runif(5,190,220))
mapdat <- gcdat <- ndat <- vector("list", 5)
allmapdat <- allgcdat <- allndat <- NULL
for(i in 1:5){
  mapdat[[i]] <- data.frame(
                            pos=(0:(numBins[i]-1))*100,
                            M=runif(numBins[i],0.9,1)
                            )
  gcdat[[i]] <- data.frame(
                           pos=(0:(numBins[i]-1))*100,
                           GC=runif(numBins[i],0.5,1)
                           )
  ndat[[i]] <- data.frame(
                          pos=(0:(numBins[i]-1))*100,
                          N=rbinom(numBins[i],1,0.01)
                          )
  allmapdat <- rbind(allmapdat,
                     cbind(paste("chr",i,sep=""),mapdat[[i]]))
  allgcdat <- rbind(allgcdat,
                    cbind(paste("chr",i,sep=""),gcdat[[i]]))
  allndat <- rbind(allndat,
                   cbind(paste("chr",i,sep=""),ndat[[i]]))
  
  write.table( mapdat[[i]], file = paste("map_chr",i,".txt",sep=""),
              sep = "\t", row.names = FALSE, col.names = FALSE)
  write.table( gcdat[[i]], file = paste("gc_chr",i,".txt",sep=""),
              sep = "\t", row.names = FALSE, col.names = FALSE)
  write.table( ndat[[i]], file = paste("n_chr",i,".txt",sep=""),
              sep = "\t", row.names = FALSE, col.names = FALSE)
}
write.table( allmapdat, file = "allmap.txt" , sep = "\t", row.names = FALSE,
            col.names = FALSE )
write.table( allgcdat,file = "allgc.txt" , sep = "\t", row.names = FALSE,
            col.names = FALSE )
write.table( allndat,file = "alln.txt", sep = "\t", row.names = FALSE,
            col.names = FALSE )

# M, GC and N scores all read from chromosome-level files
bindata1 <- createBinData( dat_chip, dat_input, mfile = "map_",
                          gcfile = "gc_", nfile = "n_", m.suffix = ".txt",
                          gc.suffix = ".txt", n.suffix = ".txt",
                          chrlist = NULL, dataType = "unique" )
# M scores from the genome-wide file; GC and N from chromosome-level files
bindata2 <- createBinData( dat_chip, dat_input, mfile = "allmap.txt",
                          gcfile = "gc_", nfile = "n_", m.suffix = NULL,
                          gc.suffix = ".txt", n.suffix = ".txt",
                          chrlist = NULL, dataType = "unique" )
# GC scores from the genome-wide file; M and N from chromosome-level files
bindata3 <- createBinData( dat_chip, dat_input, mfile = "map_",
                          gcfile = "allgc.txt", nfile = "n_", m.suffix = ".txt",
                          gc.suffix = NULL, n.suffix = ".txt",
                          chrlist = NULL, dataType = "unique" )
# N scores from the genome-wide file; M and GC from chromosome-level files
bindata4 <- createBinData( dat_chip, dat_input, mfile = "map_",
                          gcfile = "gc_", nfile = "alln.txt", m.suffix = ".txt",
                          gc.suffix = ".txt", n.suffix = NULL,
                          chrlist = NULL, dataType = "unique" )

# Clean up the text files written above
for(i in 1:5){
  for(j in c("map_","gc_","n_")){
    file.remove(paste(j,"chr",i,".txt",sep=""))
  }
}
file.remove("allmap.txt")
file.remove("alln.txt")
file.remove("allgc.txt")

chandlerzuo/cssp documentation built on May 13, 2019, 3:23 p.m.