Fit a Hidden Markov Model to multiple ChIP-seq samples

Share:

Description

Fit a HMM to multiple ChIP-seq samples to determine the combinatorial state of genomic regions. Input is a list of uniHMMs generated by callPeaksUnivariate.

Usage

1
2
3
4
callPeaksMultivariate(hmms, use.states, max.states = NULL, per.chrom = TRUE,
  chromosomes = NULL, eps = 0.01, post.cutoff = NULL,
  keep.posteriors = TRUE, num.threads = 1, max.time = NULL,
  max.iter = NULL, keep.densities = FALSE, verbosity = 1)

Arguments

hmms

A list of uniHMMs generated by callPeaksUnivariate, e.g. list(hmm1,hmm2,...) or a vector of files that contain such objects, e.g. c("file1","file2",...).

use.states

A data.frame with combinatorial states which are used in the multivariate HMM, generated by function stateBrewer. If both use.states and max.states are NULL, the maximum possible number of combinatorial states will be used.

max.states

Maximum number of combinatorial states to use in the multivariate HMM. The states are ordered by occurrence as determined from the combination of univariate state calls.

per.chrom

If per.chrom=TRUE chromosomes will be treated separately. This tremendously speeds up the calculation but results might be noisier as compared to per.chrom=FALSE, where all chromosomes are concatenated for the HMM.

chromosomes

A vector specifying the chromosomes to use from the models in hmms. The default (NULL) uses all available chromosomes.

eps

Convergence threshold for the Baum-Welch algorithm.

post.cutoff

False discovery rate. The default NULL means that the state with maximum posterior probability will be chosen, irrespective of its absolute probability.

keep.posteriors

If set to TRUE, posteriors will be available in the output. This is useful to change the post.cutoff later, but increases the necessary disk space to store the result immense.

num.threads

Number of threads to use. Setting this to >1 may give increased performance.

max.time

The maximum running time in seconds for the Baum-Welch algorithm. If this time is reached, the Baum-Welch will terminate after the current iteration finishes. The default NULL is no limit.

max.iter

The maximum number of iterations for the Baum-Welch algorithm. The default NULL is no limit.

keep.densities

If set to TRUE (default=FALSE), densities will be available in the output. This should only be needed debugging.

verbosity

Verbosity level for the fitting procedure. 0 - No output, 1 - Iterations are printed.

Details

Emission distributions from the univariate HMMs are used with a Gaussian copula to generate a multivariate emission distribution for each combinatorial state. This multivariate distribution is then kept fixed and the transition probabilities are fitted with a Baum-Welch. Please refer to our manuscript at http://dx.doi.org/10.1101/038612 for a detailed description of the method.

Value

A multiHMM object.

Author(s)

Aaron Taudt, Maria Colome Tatche

See Also

multiHMM, callPeaksUnivariate, callPeaksReplicates

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
# Get example BAM files for 2 different marks in hypertensive rat
file.path <- system.file("extdata","euratrans", package='chromstaRData')
files <- list.files(file.path, full.names=TRUE, pattern='SHR.*bam$')[c(1:2,6:7)]
# Construct experiment structure
exp <- data.frame(file=files, mark=c("H3K27me3","H3K27me3","H3K4me3","H3K4me3"),
                 condition=rep("SHR",4), replicate=c(1:2,1:2), pairedEndReads=FALSE,
                 controlFiles=NA)
states <- stateBrewer(exp, mode='combinatorial')
# Bin the data
data(rn4_chrominfo)
binned.data <- list()
for (file in files) {
 binned.data[[basename(file)]] <- binReads(file, binsizes=1000, experiment.table=exp,
                                              assembly=rn4_chrominfo, chromosomes='chr12')
}
# Obtain the univariate fits
models <- list()
for (i1 in 1:length(binned.data)) {
 models[[i1]] <- callPeaksUnivariate(binned.data[[i1]], max.time=60, eps=1)
}
# Call multivariate peaks
multimodel <- callPeaksMultivariate(models, use.states=states, eps=1, max.time=60)
# Check some plots
heatmapTransitionProbs(multimodel)
heatmapCountCorrelation(multimodel)

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.