segment: Learn a model and produce a segmentation


Description

Learn a model and produce a segmentation

Usage

segment(counts, regions, nstates = NULL, model = NULL, notrain = FALSE,
  collapseInitP = FALSE, nthreads = 1, split4speed = FALSE,
  maxiter = 200, ...)
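
A minimal sketch of a typical call, using only the arguments listed in the usage above. The mark names, region coordinates, bin size and simulated Poisson counts are made up for illustration, and the call is guarded so that the snippet only runs the segmentation when epicseg and GenomicRanges are installed.

```r
set.seed(1)

# Hypothetical toy input: 3 marks, two regions of 30 and 20 bins, 200 bp per bin.
marks   <- c("H3K4me3", "H3K27ac", "H3K36me3")
binsize <- 200L
nbins   <- c(30L, 20L)

# One row per mark, one column per bin; row names are the (unique) mark names.
counts <- matrix(rpois(length(marks) * sum(nbins), lambda = 5),
                 nrow = length(marks),
                 dimnames = list(marks, NULL))

if (requireNamespace("epicseg", quietly = TRUE) &&
    requireNamespace("GenomicRanges", quietly = TRUE)) {
  # Region widths are multiples of the bin size, so they match ncol(counts).
  regions <- GenomicRanges::GRanges(
    seqnames = c("chr1", "chr2"),
    ranges   = IRanges::IRanges(start = c(1, 1), width = nbins * binsize))

  # Learn a 3-state model and segment the regions.
  fit <- epicseg::segment(counts, regions, nstates = 3, maxiter = 50)
  print(fit$segments)
  print(fit$loglik)

  # Reuse the learned (complete) model without further training.
  refit <- epicseg::segment(counts, regions, model = fit$model, notrain = TRUE)
}
```

Simulated counts like these carry no real chromatin signal, so the resulting states are not meaningful; the sketch only shows how the inputs fit together.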

Arguments

counts

Count matrix or list of count matrices matching the regions argument. Each row of the matrix represents a mark and each column a bin obtained by dividing the genomic regions into non-overlapping bins of equal size. The rows of the matrix must be named after the marks, and these names must be unique.

regions

GRanges object containing the genomic regions of interest. Each of these regions corresponds to a set of bins and each bin to a column of the count matrix. The binsize is automatically derived by comparing the columns of the count matrix with the width of the regions.
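
The relationship between counts and regions can be sketched in base R. This is an illustration of the consistency described above, not the package's internal code: region_widths stands in for the widths of the GRanges object, and the mark names and sizes are made up.

```r
# Hypothetical widths (in bp) of two genomic regions and the bin size used for counting.
region_widths <- c(1000L, 600L)
binsize       <- 200L

# Each region contributes width / binsize bins: 5 + 3 = 8 columns in total.
nbins <- sum(region_widths %/% binsize)

# Count matrix with one named row per mark and one column per bin.
marks  <- c("H3K4me3", "H3K36me3")
counts <- matrix(0L, nrow = length(marks), ncol = nbins,
                 dimnames = list(marks, NULL))

# Row names must be present and unique.
stopifnot(!is.null(rownames(counts)), !anyDuplicated(rownames(counts)))

# The bin size can be recovered from the total width and the number of columns.
derived_binsize <- sum(region_widths) / ncol(counts)

# Every region must be a whole number of bins wide.
stopifnot(all(region_widths %% derived_binsize == 0))
```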

nstates

Number of states to learn.

model

A list with the parameters that describe the HMM. Missing parameters will be learned, and the provided parameters will be used as initial values for the learning algorithm. If notrain is TRUE, the parameter set must be complete and no learning will take place.

notrain

If FALSE, the parameters will be learned; otherwise the parameters provided through the model argument will be used, without learning, to produce a segmentation.

collapseInitP

If a model with multiple initial probability vectors is provided, should those vectors be averaged and reduced to a single initial probability vector? If you are not sure what this means, do not set this option.

nthreads

Number of threads used for learning.

split4speed

Add artificial splits in the input regions to improve the parallelism of the forward-backward algorithm. When the number of input regions is smaller than the number of threads, the results usually change very little and the algorithm runs considerably faster. See ?kfoots for more details.

maxiter

Maximum number of iterations for learning.

...

Advanced options for learning. Type epicseg:::advancedOpts to see which options are allowed, and type ?kfoots to see what the options do.

Value

A list with the following elements:

segments

The segmentation as a GRanges object. The names slot of this object contains a number from 1 to nstates indicating which state each segment belongs to.

model

A list containing all the parameters of the model.

posteriors

A matrix with nstates rows and ncol(counts) columns containing the posterior probability that a given bin was generated by a given state.

states

An integer vector of length ncol(counts) giving the state assigned to each bin (using the posterior decoding algorithm). This vector is used to create the segments element.
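
Posterior decoding assigns each bin the state with the highest posterior probability, and consecutive bins with the same state can then be merged into segments. The base R sketch below illustrates this on a made-up posterior matrix; it is not the package's implementation and, as a simplifying assumption, it treats all bins as belonging to a single region.

```r
# Hypothetical posteriors: 3 states (rows) x 4 bins (columns); each column sums to 1.
posteriors <- matrix(c(0.7, 0.2, 0.1,
                       0.6, 0.3, 0.1,
                       0.1, 0.8, 0.1,
                       0.2, 0.2, 0.6),
                     nrow = 3)

# Posterior decoding: pick the most probable state for every bin.
states <- apply(posteriors, 2, which.max)

# Merge runs of identical states into segments expressed in bin coordinates.
runs      <- rle(states)
last_bin  <- cumsum(runs$lengths)
first_bin <- last_bin - runs$lengths + 1
segs <- data.frame(state = runs$values, first_bin = first_bin, last_bin = last_bin)
print(segs)
```

The Viterbi path can differ from this per-bin assignment, because it maximizes the probability of the whole state sequence rather than of each bin separately.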

viterbi

Same as states, but using the Viterbi algorithm.

loglik

The log-likelihood of the whole dataset.


lamortenera/epicseg documentation built on May 20, 2019, 7:34 p.m.