baumWelch: Inferring the parameters of a dag Hidden Markov Model via the Baum-Welch algorithm

View source: R/baumWelch.R

baumWelch — R Documentation

Inferring the parameters of a dag Hidden Markov Model via the Baum-Welch algorithm

Description

For an initial Hidden Markov Model (HMM) with assumed initial parameters and a given set of observations at all nodes of the dag, the Baum-Welch algorithm infers optimal parameters for the HMM. Since the Baum-Welch algorithm is a variant of the Expectation-Maximisation (EM) algorithm, it converges to a local optimum, which may not be the global one. Note that if both training and validation data are supplied, the function prints AUC and AUPR values after every iteration. The validation data must contain more than one instance of each of the possible states.

Usage

baumWelch(
  hmm,
  observation,
  kn_states = NULL,
  kn_verify = NULL,
  maxIterations = 50,
  delta = 1e-05,
  pseudoCount = 1e-100
)

Arguments

hmm

Object of class List, as returned by initHMM

observation

Dataframe containing the discretized character values of only the covariates at each node. The column names of the dataframe should match the covariate names. Missing values should be denoted by NA.

kn_states

(Optional) An (L x 2) dataframe, where L is the number of training nodes whose state values are known. The first column should contain the node numbers and the second column the corresponding known states of those nodes.

kn_verify

(Optional) An (L x 2) dataframe, where L is the number of validation nodes whose state values are known. The first column should contain the node numbers and the second column the corresponding known states of those nodes.

maxIterations

(Optional) The maximum number of iterations of the Baum-Welch algorithm. Default is 50.

delta

(Optional) An additional termination condition: the algorithm stops before reaching maxIterations if the transition and emission matrices converge, i.e. if the difference between the transition and emission parameters in consecutive iterations is smaller than delta. Default is 1e-5.

pseudoCount

(Optional) The amount of pseudo-counts added in the estimation step of the Baum-Welch algorithm. Default is 1e-100. (Do not set it to zero, to avoid numerical errors.)
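The roles of delta and pseudoCount can be sketched as follows. This is an illustrative sketch only, not dagHMM's internal code; the helper names are hypothetical:

```r
# Illustrative sketch (not dagHMM's internal code) of how `pseudoCount` and
# `delta` are typically used inside a Baum-Welch iteration.

# Pseudo-counts: add a tiny constant to the raw expected counts before
# normalising, so that no transition/emission probability becomes exactly
# zero (which would break later log/division steps).
normalise_counts = function(counts, pseudoCount = 1e-100) {
  counts = counts + pseudoCount
  counts / rowSums(counts)
}

# Convergence check for `delta`: stop when the largest change in the
# transition and emission parameters between two consecutive iterations
# falls below `delta`.
has_converged = function(old_trans, new_trans, old_emis, new_emis, delta = 1e-05) {
  max(abs(old_trans - new_trans)) < delta &&
    max(abs(old_emis - new_emis)) < delta
}
```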

Value

A list of three elements: first, the inferred HMM, whose representation is equivalent to that of initHMM; second, a list of statistics of the algorithm; and third, the final state probability distribution at all nodes.
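A minimal sketch of unpacking the return value. The element names are not documented on this page, so positional access is used and the comments restate the description above:

```r
# `learntHMM` is the list returned by baumWelch(); see the Examples section.
fitted_hmm  = learntHMM[[1]]  # inferred HMM, same representation as initHMM
fit_stats   = learntHMM[[2]]  # statistics of the algorithm's iterations
state_probs = learntHMM[[3]]  # final state probability distribution at all nodes
```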

See Also

baumWelchRecursion

Examples


library(bnlearn)

tmat = matrix(c(0, 0, 1, 0, 0,
                0, 0, 1, 0, 0,
                0, 0, 0, 1, 1,
                0, 0, 0, 0, 0,
                0, 0, 0, 0, 0),
              5, 5, byrow = TRUE)  # adjacency matrix for an "X"-shaped dag (5 nodes)
states = c("P", "N")  # "P" represents cases (positive), "N" represents controls (negative)
bnet = model2network("[A][C|A:B][D|A:C][B|A]")  # A is the target variable;
                                                # B, C and D are covariates
obsvA = data.frame(B = c("L", "H", "H", "L", "L"),
                   C = c("H", "H", "L", "L", "H"),
                   D = c("L", "L", "L", "H", "H"))
hmmA = initHMM(States = states, dagmat = tmat, net = bnet, observation = obsvA)
kn_st = data.frame(node = c(2), state = c("P"), stringsAsFactors = FALSE)
                   # the state at node 2 is known to be "P"
kn_vr = data.frame(node = c(3, 4, 5), state = c("P", "N", "P"), stringsAsFactors = FALSE)
                   # the states at nodes 3, 4 and 5 are "P", "N" and "P" respectively
learntHMM = baumWelch(hmm = hmmA, observation = obsvA, kn_states = kn_st, kn_verify = kn_vr)

dagHMM documentation built on Jan. 11, 2023, 1:13 a.m.
