splitGAlignmentsByCut: split bams into nucleosome free, mononucleosome, dinucleosome...

Description Usage Arguments Value Author(s) References Examples

Description

use random forest to split the reads into nucleosome free, mononucleosome, dinucleosome and trinucleosome. The features used in random forest including fragment length, GC content, and UCSC phastCons conservation scores.

Usage

1
2
3
4
5
6
splitGAlignmentsByCut(obj, txs, genome, conservation, breaks = c(0, 100, 180,
  247, 315, 473, 558, 615, Inf), labels = c("NucleosomeFree", "inter1",
  "mononucleosome", "inter2", "dinucleosome", "inter3", "trinucleosome",
  "others"), labelsOfNucleosomeFree = "NucleosomeFree",
  labelsOfMononucleosome = "mononucleosome", trainningSetPercentage = 0.15,
  cutoff = 0.8, halfSizeOfNucleosome = 80L)

Arguments

obj

an object of GAlignments

txs

GRanges of transcripts

genome

an object of BSgenome

conservation

an object of GScores.

breaks

a numeric vector for fragment size of nucleosome freee, mononucleosome, dinucleosome and trinucleosome. The breaks pre-defined here is following the description of Greenleaf's paper (see reference).

labels

a character vector for labels of the levels of the resulting category.

labelsOfNucleosomeFree, labelsOfMononucleosome

character(1). The label for nucleosome free and mononucleosome.

trainningSetPercentage

numeric(1) between 0 and 1. Percentage of trainning set from top coverage.

cutoff

numeric(1) between 0 and 1. cutoff value for prediction.

halfSizeOfNucleosome

numeric(1) or integer(1). Thre read length will be adjusted to half of the nucleosome size to enhance the signal-to-noise ratio.

Value

a list of GAlignments

Author(s)

Jianhong Ou

References

Buenrostro, J.D., Giresi, P.G., Zaba, L.C., Chang, H.Y. and Greenleaf, W.J., 2013. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nature methods, 10(12), pp.1213-1218.

Chen, K., Xi, Y., Pan, X., Li, Z., Kaestner, K., Tyler, J., Dent, S., He, X. and Li, W., 2013. DANPOS: dynamic analysis of nucleosome position and occupancy by sequencing. Genome research, 23(2), pp.341-351.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
library(GenomicRanges)
bamfile <- system.file("extdata", "GL1.bam", 
                       package="ATACseqQC", mustWork=TRUE)
tags <- c("AS", "XN", "XM", "XO", "XG", "NM", "MD", "YS", "YT")
gal1 <- readBamFile(bamFile=bamfile, tag=tags, 
                    which=GRanges("chr1", IRanges(1, 1e6)), 
                    asMates=FALSE)
names(gal1) <- mcols(gal1)$qname
library(BSgenome.Hsapiens.UCSC.hg19)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txs <- transcripts(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(phastCons100way.UCSC.hg19)
splitGAlignmentsByCut(gal1, txs=txs, genome=Hsapiens, 
                      conservation=phastCons100way.UCSC.hg19)

jaime11/Transcription-Factor-Footprinting documentation built on May 31, 2019, 8:48 p.m.