Batch function that builds a Hidden Semi-Markov Model (HSMM) to estimate the most likely state sequence for each of multiple input data series.
biomvRhsmm(x, maxk=NULL, maxbp=NULL, J=3, xPos=NULL, xRange=NULL, usePos='start', emis.type='norm', com.emis=FALSE, xAnno=NULL, soj.type='gamma', q.alpha=0.05, r.var=0.75, useMC=TRUE, cMethod='F-B', maxit=1, maxgap=Inf, tol=1e-06, grp=NULL, cluster.m=NULL, avg.m='median', prior.m='cluster', trim=0, na.rm=TRUE)
x |
input data matrix, or a |
maxk |
maximum length of stay for the sojourn distribution |
maxbp |
maximum length of stay in bp for the sojourn distribution, given positional information specified in |
J |
number of states |
xPos |
a vector of positions for each |
xRange |
a |
usePos |
character value to indicate whether the 'start', 'end' or 'mid' point position should be used to estimate the sojourn distribution |
emis.type |
type of the emission distribution, only the following types are supported: 'norm', 'mvnorm', 'pois', 'nbinom', 'mvt', 't' |
com.emis |
whether to set a common emission prior across different seqnames. If TRUE, the emission will not be updated during individual runs. |
xAnno |
an optional |
soj.type |
type of the sojourn distribution, only the following types are supported: 'nonpara', 'gamma', 'pois', 'nbinom' |
q.alpha |
a quantile factor controlling the estimated prior for the mean of the emission of each state,
r.var |
a ratio factor controlling the estimated prior for the variance / covariance structure of each state. A value larger than 1 tends to allow larger variation in extreme states; a value smaller than 1 decreases the probability of having extreme states |
useMC |
TRUE if |
cMethod |
C algorithm used for the most likely state sequence, 'F-B' or 'Viterbi' |
maxit |
maximum number of iterations of the EM run with the Forward-Backward algorithm |
maxgap |
maximum distance between neighbouring features to consider a split |
tol |
tolerance level of the likelihood change to terminate the EM run |
grp |
vector of group assignments for each sample, with the same length as the number of columns in the data matrix; samples within each group are processed simultaneously if a multivariate emission distribution is available |
cluster.m |
clustering method for prior grouping, possible values are 'ward','single','complete','average','mcquitty','median','centroid' |
avg.m |
method used to calculate the average value for each segment: 'median' or 'mean' (possibly trimmed) |
prior.m |
method used to select the emission prior for each state: 'quantile' uses different quantile levels; 'cluster' uses the clara function from the cluster package |
trim |
the fraction (0 to 0.5) of observations to be trimmed from each end of x before the mean is computed. Values of trim outside that range are taken as the nearest endpoint. |
na.rm |
|
This is the batch function that builds a Hidden Semi-Markov Model (HSMM) to estimate the most likely state sequences for multiple input data series.
The function sequentially processes each region identified by the distinct seqnames in x or in xRange, if available; otherwise all data are assumed to come from the same region.
A second layer of stratification is introduced by the argument grp, which can be used to reflect the experimental design.
The assumption is that profiles from the same group can be considered homogeneous and are therefore processed together if emis.type is compatible (currently only 'mvnorm').
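The two layers of stratification can be pictured with a small base R sketch. It illustrates only the partitioning of rows by region and columns by group, not the package's internal code; the matrix `x`, the `seqnames` vector and the `batches` list are made-up names for this illustration.

```r
# Hypothetical input: 6 features on two chromosomes, 4 samples in two groups.
set.seed(1)
x <- matrix(rnorm(24), nrow = 6, dimnames = list(NULL, paste0("s", 1:4)))
seqnames <- c("chr1", "chr1", "chr1", "chr2", "chr2", "chr2")
grp <- c(1, 1, 2, 2)

# First layer: row indices per region; second layer: column indices per group.
rows_by_region <- split(seq_len(nrow(x)), seqnames)
cols_by_group  <- split(seq_len(ncol(x)), grp)

# Each (region, group) pair forms one block of data that could be modelled
# jointly when a multivariate emission distribution is used.
batches <- list()
for (region in names(rows_by_region)) {
  for (g in names(cols_by_group)) {
    batches[[paste(region, g, sep = ":")]] <-
      x[rows_by_region[[region]], cols_by_group[[g]], drop = FALSE]
  }
}

# Dimensions (features x samples) of every block.
print(sapply(batches, dim))
```

With a univariate emission type, each column of such a block would presumably be handled on its own rather than jointly.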
Arguments for the sojourn density are initialized as a flat prior or estimated from other data before calling the workhorse function hsmmRun.
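To make the two kinds of initialization concrete, here is a sketch in base R of a flat sojourn prior and a gamma-shaped one over durations 1 to maxk. The shape and scale values are arbitrary assumptions, and evaluating dgamma at integer durations and renormalizing is just one simple way to discretize; it is not necessarily what hsmmRun does internally.

```r
maxk <- 10          # maximum sojourn length, in number of features
d <- seq_len(maxk)  # candidate durations 1..maxk

# Flat prior: every duration up to maxk is equally likely.
soj_flat <- rep(1 / maxk, maxk)

# Gamma-shaped prior: evaluate the density at each duration, then renormalize
# so that the truncated, discretized distribution sums to 1.
shape <- 2  # hypothetical parameter
scale <- 2  # hypothetical parameter
soj_gamma <- dgamma(d, shape = shape, scale = scale)
soj_gamma <- soj_gamma / sum(soj_gamma)

print(round(rbind(flat = soj_flat, gamma = soj_gamma), 3))
```

The gamma-shaped prior concentrates mass around a typical segment length, whereas the flat prior expresses no preference among durations up to maxk.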
The results of each batch run are then combined, and a biomvRCNS-class object is returned.
See the vignette for more details and examples.
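The description of trim given for the arguments matches the trim argument of base R's mean(). As a small illustration of how the avg.m choices can differ, the sketch below averages one made-up segment containing an outlier; the vector `seg` is hypothetical and not taken from the package.

```r
# Hypothetical signal values for one segment, with a single outlier.
seg <- c(0.1, 0.2, 0.15, 0.25, 0.2, 5)

seg_mean    <- mean(seg)              # plain mean, pulled up by the outlier
seg_trimmed <- mean(seg, trim = 0.2)  # drops floor(6 * 0.2) = 1 value from each end
seg_median  <- median(seg)            # middle of the sorted values

print(c(mean = seg_mean, trimmed = seg_trimmed, median = seg_median))
```

The median and the trimmed mean are both robust to the outlier, which is one reason to prefer them when segments may contain a few aberrant features.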
A biomvRCNS-class object:
|
Object of class |
|
Object of class |
|
Object of class |
Yang Du
Guedon, Y. (2003). Estimating hidden semi-Markov chains from discrete sequences. Journal of Computational and Graphical Statistics, 12(3), 604-639.
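library(biomvRCNS)
## load the example data set 'coriell'
data(coriell)
## build a GRanges object: column 2 gives the chromosome, column 3 the position, column 1 the feature names
xgr<-GRanges(seqnames=paste('chr', coriell[,2], sep=''), IRanges(start=coriell[,3], width=1, names=coriell[,1]))
## attach the two data columns as metadata and sort by position
values(xgr)<-DataFrame(coriell[,4:5], row.names=NULL)
xgr<-sort(xgr)
## fit a 3-state HSMM with gamma sojourn and normal emission, one group per sample
reshsmm<-biomvRhsmm(x=xgr, maxbp=4E4, J=3, soj.type='gamma', emis.type='norm', grp=c(1,2))
## access model parameters
reshsmm@param$soj.par
reshsmm@param$emis.par
## states assigned and associated probabilities
mcols(reshsmm@x)[,-(1:2)]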