score.hmm: Score an alignment using a general phylo-HMM

Description Usage Arguments Value Author(s) Examples

View source: R/hmm.R

Description

Produce likelihood of an alignment given a phylo-HMM, posterior probabilities of phylo-HMM states across an alignment, and predict states using Viterbi algorithm

Usage

1
2
score.hmm(msa, mod, hmm, states = NULL, viterbi = TRUE, ref.idx = 1,
  reflect.strand = NULL, features = NULL, quiet = (!is.null(features)))

Arguments

msa

An object of type msa

mod

A list of tree model objects, corresponding to each state in the phylo-HMM

hmm

An object of type hmm describing transitions between states, equilbrium frequencies, initial frequencies, and optionally end frequencies

states

A vector of characters naming the states of interest in the phylo-HMM, or a vector of integers corresponding to states in the transition matrix. The post.probs will give the probability of any of these states, and the viterbi regions reflect regions where the state is predicted to be any of these states. If NULL, the post.probs will be a data frame with probabilities of each state at each site, and the viterbi algorithm will give the predicted state at every site.

viterbi

A logical value indicating whether to predict a path through the phylo-HMM using the Viterbi algorithm.

ref.idx

An integer value. Use the coordinate frame of the given sequence. Default is 1, indicating the first sequence in the alignment. A value of 0 indicates the coordinate frame of the entire alignment.

reflect.strand

Given an hmm describing the forward strand, create a larger HMM that allows for features on both strands by "reflecting" the original HMM about the specified states. States can be described as a vector of integers or characters in the same manner as states argument (above). The new hmm will be used for prediction on both strands. NOTE: if reflect.strand is provided, the first state is treated as a "default" state and is implicitly included in the reflect.strand list! Also, reflection is done assuming a reversible model.

features

If non-NULL, compute the likelihood of each feature under the phylo-HMM.

quiet

If TRUE, suppress printing of progress information.

Value

If features is not NULL, returns a numeric vector with one value per feature, giving the likelihood of the feature under the phylo-HMM.

Otherwise, returns a list with some or all of the following arguments (depending on options):

in.states

An object of type feat which describes regions which fall within the interesting states specified in the states parameter, as determined by the Viterbi algorithm.

post.prob.wig

A data frame giving a coordinate and posterior probibility that each site falls within an interesting state.

likelihood

The likelihood of the data under the estimated model.

Author(s)

Melissa J. Hubisz and Adam Siepel

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
exampleArchive <- system.file("extdata", "examples.zip", package="rphast")
files <- c("ENr334-100k.maf", "rev.mod", "gencode.ENr334-100k.gff")
unzip(exampleArchive, files)
# make "conserved" and "neutral" models and a phylo-HMM that describes
# transitions between them, and predict conserved elements (this
# is the same thing that phastCons does, but can be extended to general
# phylo-HMMs)
align <- read.msa("ENr334-100k.maf")
neutralMod <- read.tm("rev.mod")

# create a conserved model
conservedMod <- neutralMod
conservedMod$tree <- rescale.tree(neutralMod$tree, 0.3)

# create a simple phylo-HMM
state.names <- c("neutral", "conserved")
h <- hmm(matrix(c(0.99, 0.01, 0.01, 0.99), nrow=2,
                dimnames=list(state.names, state.names)),
                eq.freq=c(neutral=0.9, conserved=0.1))
scores <- score.hmm(align, mod=list(neutral=neutralMod,
                             conserved=conservedMod),
                    hmm=h, states="conserved")
# try an alternate approach of comparing likelihoods of genes 
feats <- read.feat("gencode.ENr334-100k.gff")
# plot in a region with some genes
plot.track(list(as.track.feat(scores$in.states, name="hmmScores"),
                as.track.feat(feats[feats$feature=="CDS",], name="genes")),
           xlim=c(41650000, 41680000))
unlink(files)

rphast documentation built on May 1, 2019, 9:26 p.m.