iSeq2: Bayesian hierarchical modeling of ChIP-seq data through...

Description Usage Arguments Value Author(s) References See Also Examples

View source: R/iSeq.R

Description

iSeq2 implements the method that models the bin-based tag counts using Poisson-Gamma distribution and the hidden states of the bins using a hidden high-order Ising model.

Usage

1
2
iSeq2(Y,gap=300,burnin=500,sampling=2000,winsize=2,ctcut=0.95,
      a0=1,b0=1,a1=5,b1=1,k=3,verbose=FALSE)

Arguments

Y

Y should be a data frame containing the first 4 columns of the data frame returned by function 'mergetag()'. The columns 1-4 of Y are chromosome IDs, start positions of the bins, end positions of the bins, tag counts in the bins. For one-sample analysis, the tag counts can be the number of forward and reverse tags falling in the bins. For two-sample analysis, tag counts are the adjusted counts of ChIP samples, which are obtained by subtracting the control tag counts from corresponding ChIP tag counts for each bin. If the user provides his/her own Y, Y must be firstly sorted by the chromosome ID, then by the start position, and then by the end position.

gap

gap is the average length of the sequenced DNA fragments. If the distance between two nearest bins is greater than 'gap', a bin with 0 tag count is inserted into the two neighboring bins for modeling.

burnin

The number of MCMC burn-in iterations.

sampling

The number of MCMC sampling iterations. The posterior probability of enriched and non-enriched state is calculated based on the samples generated in the sampling period.

winsize

The parameter to control the order of interactions between genomic regions. For example, winsize = 2, means that genomic region i interacts with regions i-2,i-1,i+1 and i+2. A balance between high sensitivity and low FDR could be achieved by setting winsize = 2.

ctcut

A value used to set the initial state for each genomic bin. If tag count of a bin is greater than quantile(Y[,4],probs=ctcut), its state will be set to 1, otherwise -1. For typical ChIP-seq data, because the major regions are non-enriched, a good value for ctcut could be in the interval (0.9, 0.99).

a0

The scale hyper-parameter of the Gamma prior, alpha0.

b0

The rate hyper-parameter of the Gamma prior, beta0.

a1

The scale hyper-parameter of the Gamma prior, alpha1.

b1

The rate hyper-parameter of the Gamma prior, beta1.

k

The parameter used to control the strength of interaction between neighboring bins, which must be a positive value (k>0). The larger the value of k, the stronger iterations between neighboring bins. The value for k may not be too small (e.g. < 1.0). Otherwise, the Ising system may not be able to reach a super-paramagnetic state.

verbose

A logical variable. If TRUE, the number of completed MCMC iterations is reported.

Value

A list with the following elements.

pp

The posterior probabilities of the bins in the enriched state.

lambda0

The posterior samples of the model parameter lambda0

lambda1

The posterior samples of the model parameter lambda1.

Author(s)

Qianxing Mo [email protected]

References

Qianxing Mo, Faming Liang. (2010). Bayesian modeling of ChIP-chip data through a high-order Ising model. Biometrics 66(4), 1284-94.

Qianxing Mo. (2012). A fully Bayesian hidden Ising model for ChIP-seq data analysis. Biostatistics 13(1), 113-28.

See Also

iSeq1, peakreg,mergetag,plotreg

Examples

1
2
3
4
5
6
7
data(nrsf)
chip = rbind(nrsf$chipFC1592,nrsf$chipFC1862,nrsf$chipFC2002)
mock = rbind(nrsf$mockFC1592,nrsf$mockFC1862,nrsf$mockFC2002)
tagct = mergetag(chip=chip,control=mock,maxlen=80,minlen=10,ntagcut=10)
tagct22 = tagct[tagct[,1]=="chr22",]
res2 = iSeq2(Y=tagct22[,1:4],gap=200, burnin=100,sampling=500,winsize=2,ctcut=0.95,
  a0=1,b0=1,a1=5,b1=1,k=1.0,verbose=FALSE)

iSeq documentation built on Nov. 1, 2018, 3:45 a.m.