Description

The function finds a sensible set of initial NB HMM parameters by fitting a negative binomial (NB) mixture model of K components to the read count data.
Usage

nbh_init(count, K, NBM_NIT_MAX = 250, NBM_TOL = 0.001)
Arguments

count: A vector of integers, conceptually representing the read counts within bins of a chromosome.

K: Number of hidden states.

NBM_NIT_MAX: Maximum number of EM iterations for the negative binomial mixture model (NBM) initialization step (default: 250; see nbm_em).

NBM_TOL: Threshold on the fractional increase in likelihood (given the current NBM parameters) relative to the likelihood from the previous iteration. EM for the NBM stops when the improvement falls below this threshold (default: 0.001).
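The NBM_TOL stopping rule can be sketched as a relative log-likelihood improvement check. The values below are hypothetical, and the exact bookkeeping inside the package may differ; this only illustrates the criterion as described:

```r
NBM_TOL <- 0.001
logl_old <- -5000   # hypothetical log-likelihood from the previous EM iteration
logl_new <- -4999   # hypothetical log-likelihood from the current iteration

# Fractional improvement relative to the previous iteration's likelihood
improvement <- (logl_new - logl_old) / abs(logl_old)
improvement < NBM_TOL  # TRUE: improvement is below threshold, so EM would stop
```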
Details

Because the EM algorithm for HMMs tends to fall into local optima under poor initialization, an NB mixture model with K components (K-NBM) is first fit to the data to obtain a reasonable estimate of the HMM parameters. Given the read count vector, the function applies the lower-level function nbm_em (NB mixture model) to find the alpha, beta, and mixing proportions of the K NB mixture components. Alpha and beta are the parameters of the NB components, initialized as the last K quantiles of the nonzero read counts and 1, respectively. The mixing proportions or component weights (wght) of the NB distributions are initialized as uniform and, after EM optimization, are used to form a symmetric transition probability matrix such that the probability of state 1 transitioning to state 2 equals the probability of state 2 transitioning to state 1. This matrix is used as the initial transition probability matrix for HMM training (see nbh_em).
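The symmetrization step can be sketched as follows. The averaging construction here is only one plausible way to realize the stated symmetry property, not necessarily the package's exact formula, and the weights are made up:

```r
# Hypothetical mixture weights after NBM EM (made-up values)
wght <- c(0.8, 0.2)
K <- length(wght)

# Replicate the weights across rows, then average with the transpose so
# that TRANS0[i, j] == TRANS0[j, i]. This is an illustration only; a
# usable transition matrix would also need its rows renormalized to 1.
W <- matrix(wght, nrow = K, ncol = K, byrow = TRUE)
TRANS0 <- (W + t(W)) / 2
TRANS0
#      [,1] [,2]
# [1,]  0.8  0.5
# [2,]  0.5  0.2
```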
Value

A list containing:

alpha: Alpha parameter of the K NB components optimized using nbm_em.

beta: Beta parameter of the K NB components optimized using nbm_em.

TRANS: Transition probability matrix, initialized as a symmetric matrix formed from the mixing proportions of the K NB components optimized using nbm_em.
Author(s)

Yue Li
References

Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), 257-286. doi:10.1109/5.18626

Christopher Bishop. Pattern Recognition and Machine Learning, pp. 605-631. Information Science and Statistics. Springer, 2006.

Cappé, O. (2001). H2M: A set of MATLAB/Octave functions for the EM estimation of mixtures and hidden Markov models. http://perso.telecom-paristech.fr/cappe/h2m/
See Also

nbm_em, nbh, nbh.GRanges, nbh_em
Examples

# Simulate data
Total_train <- 1000
Total_test <- 200
TRANS_s <- matrix(c(0.9, 0.1, 0.5, 0.5), nrow=2, byrow=TRUE)
alpha_s <- c(2, 2)
beta_s <- c(1, 0.25)
train <- nbh_gen(TRANS_s, alpha_s, beta_s, Total_train)
test <- nbh_gen(TRANS_s, alpha_s, beta_s, Total_test)
count <- train$count
label <- train$label

# NBH initialization
nbhInit <- nbh_init(count, ncol(TRANS_s))
TRANS0 <- nbhInit$TRANS
alpha0 <- nbhInit$alpha
beta0 <- nbhInit$beta
# NBH EM
nbh <- nbh_em(count, TRANS0, alpha0, beta0)
# Training accuracy of the MAP state assignments (posterior decoding)
map.accuracy <- length(which(max.col(nbh$postprob) == label))/Total_train

# Viterbi decoding on the training and test data
vit <- nbh_vit(count, nbh$TRANS, nbh$alpha, nbh$beta)
vit.accuracy <- length(which(vit$class == label))/Total_train

vit_test <- nbh_vit(test$count, nbh$TRANS, nbh$alpha, nbh$beta)
vit_test.accuracy <- length(which(vit_test$class == test$label))/Total_test
nbh_wt_KMLinit <- list(mapAccuracy_train=map.accuracy, vitAccuracy_train=vit.accuracy,
vitLogl_train=vit$logl, vitAccuracy_test=vit_test.accuracy,
vitLogl_test=vit_test$logl)