fasthmmfit: Fast gradient descent / stochastic gradient descent algorithm...
In ziphsmm: Zero-Inflated Poisson Hidden (Semi-)Markov Models

Description Usage Arguments Value References Examples

Fast gradient descent / stochastic gradient descent algorithm to learn the parameters in a specialized zero-inflated hidden Markov model, where zero-inflation only happens in State 1. And if there were covariates, they could only be the same ones for the state-dependent log Poisson means and the logit structural zero proportion.

fasthmmfit(y, x = NULL, ntimes = NULL, M, prior_init, tpm_init, emit_init,
  zero_init, yceil = NULL, stochastic = FALSE, nmin = 1000,
  nupdate = 100, power = 0.7, rate = c(1, 0.05), method = "Nelder-Mead",
  hessian = FALSE, ...)

`y`	observed time series values
`x`	matrix of covariates for the log poisson means and logit zero proportion. Default to NULL.
`ntimes`	a vector specifying the lengths of individual, i.e. independent, time series. If not specified, the responses are assumed to form a single time series, i.e. ntimes=length(y).
`M`	number of latent states
`prior_init`	a vector of initial values for prior probability for each state
`tpm_init`	a matrix of initial values for transition probability matrix
`emit_init`	a vector of initial values for the means for each poisson distribution
`zero_init`	a scalar initial value for the structural zero proportion
`yceil`	a scalar defining the ceiling of y, above which the values will be truncated. Default to NULL.
`stochastic`	Logical. Should the stochastic gradient descent methods be used.
`nmin`	a scalar for the minimum number of observations before the first iteration of stochastic gradient descent. Default to 1000.
`nupdate`	a scalar specifying the total number of updates for stochastic gradient descent. Default to 100.
`power`	a scalar representing the power of the learning rate, which should lie between (0.5,1]. Default to 0.7
`rate`	a vector of learning rate in stochastic gradient descent for the logit parameters and log parameters. Default to c(1,0.05).
`method`	method to be used for direct numeric optimization. See details in the help page for optim() function. Default to Nelder-Mead.
`hessian`	Logical. Should a numerically differentiated Hessian matrix be returned? Note that the hessian is for the working parameters, which are the generalized logit of prior probabilities (except for state 1), the generalized logit of the transition probability matrix(except 1st column), the logit of non-zero zero proportions, and the log of each state-dependent poisson means
`...`	Further arguments passed on to the optimization methods

the maximum likelihood estimates of the zero-inflated hidden Markov model

Walter Zucchini, Iain L. MacDonald, Roland Langrock. Hidden Markov Models for Time Series: An Introduction Using R, Second Edition. Chapman & Hall/CRC

#1. no covariates
set.seed(135)
prior_init <- c(0.5,0.2,0.3)
emit_init <- c(10,40,70)
zero_init <- c(0.5,0,0)
omega <- matrix(c(0.5,0.3,0.2,0.4,0.3,0.3,0.2,0.4,0.4),3,3,byrow=TRUE)
result <- hmmsim(n=10000,M=3,prior=prior_init, tpm_parm=omega,
                emit_parm=emit_init,zeroprop=zero_init)
y <- result$series

time <- proc.time()
fit1 <-  fasthmmfit(y,x=NULL,ntimes=NULL,M=3,prior_init,omega,
              emit_init,0.5, hessian=FALSE,
              method="BFGS", control=list(trace=1))
proc.time() - time


#2. with covariates
priorparm <- 0
tpmparm <- c(-2,2)
zeroindex <- c(1,0)
zeroparm <- c(0,-1,1)
emitparm <- c(2,0.5,-0.5,3,0.3,-0.2)
workparm <- c(priorparm,tpmparm,zeroparm,emitparm)

designx <- matrix(rnorm(20000),nrow=10000,ncol=2)
x <- cbind(1,designx) #has to make the additional 1st column of 1 for intercept
result <- hmmsim2(workparm,2,10000,zeroindex,emit_x=designx,zeroinfl_x=designx)
y <- result$series

time <- proc.time()
fit2 <-  fasthmmfit(y=y,x=x,ntimes=NULL,M=2,prior_init=c(0.5,0.5),
              tpm_init=matrix(c(0.9,0.1,0.1,0.9),2,2),
              zero_init=0.4,emit_init=c(7,21), hessian=FALSE,
              method="BFGS", control=list(trace=1))
proc.time() - time
fit2

#3. stochastic gradient descent without covariates
#no covariates
prior_init <- c(0.5,0.2,0.3)
emit_init <- c(10,40,70)
zero_init <- c(0.5,0,0)
omega <- matrix(c(0.5,0.3,0.2,0.4,0.3,0.3,0.2,0.4,0.4),3,3,byrow=TRUE)
result <- hmmsim(n=50000,M=3,prior=prior_init, tpm_parm=omega,
                emit_parm=emit_init,zeroprop=zero_init)
y <- result$series

initparm2 <- c(-1,-0.5,  -0.3,-0.3,-0.4,-0.4,0.5,0.5,  0,2,3,4)
time <- proc.time()
fitst <- fasthmmfit(y=y,x=NULL,ntimes=NULL,M=3,prior_init=c(0.4,0.3,0.3),
              tpm_init=matrix(c(0.6,0.3,0.1,0.3,0.4,0.3,0.1,0.3,0.6),3,3,byrow=TRUE),
              zero_init=0.3,emit_init=c(8,35,67),stochastic=TRUE,
              nmin=1000,nupdate=1000,power=0.6,rate=c(1,0.05))
proc.time() - time
str(fitst)

#with covariates
priorparm <- 0
tpmparm <- c(-2,2)
zeroindex <- c(1,0)
zeroparm <- c(0,-1,1)
emitparm <- c(2,0.5,-0.5,3,0.3,-0.2)
workparm <- c(priorparm,tpmparm,zeroparm,emitparm)

designx <- matrix(rnorm(100000),nrow=50000,ncol=2)
x <- cbind(1,designx) #has to make the additional 1st column of 1 for intercept
result <- hmmsim2(workparm,2,50000,zeroindex,emit_x=designx,zeroinfl_x=designx)
y <- result$series

initparm <- c(0, -1.8,1.8, 0,-0.8,0.8, 1.8,0.6,-0.6,3.1,0.4,-0.3)

time <- proc.time()
fitst <- fasthmmfit(y=y,x=x,ntimes=NULL,M=2,prior_init=c(0.4,0.6),
              tpm_init=matrix(c(0.8,0.2,0.2,0.8),2,2,byrow=TRUE),
              zero_init=0.3,emit_init=c(10,25),stochastic=TRUE,
              nmin=1000,nupdate=1000,power=0.6,rate=c(1,0.05))
proc.time() - time
str(fitst)

initial  value 38851.926930 
iter  10 value 38846.082850
iter  20 value 38844.855440
final  value 38844.852063 
converged
   user  system elapsed 
  0.431   0.016   0.452 
initial  value 36822.892248 
iter  10 value 30297.367028
iter  20 value 27089.098538
final  value 27089.017588 
converged
   user  system elapsed 
  0.869   0.007   0.888 
$prior
             [,1]
[1,] 0.9996722148
[2,] 0.0003277852

$tpm
          [,1]      [,2]
[1,] 0.8840677 0.1159323
[2,] 0.1121305 0.8878695

$zeroparm
           [,1]
[1,] -0.2393444
[2,]  0.1661207
[3,] -1.0501639
[4,]  1.0398675

$emitparm
         [,1]       [,2]      [,3]       [,4]
[1,] 1.969267  0.0233564 0.5069530 -0.4991102
[2,] 3.019989 -0.0245339 0.3009507 -0.2069257

$working_parm
 [1] -8.0228242 -2.0315277  2.0691618 -0.2393444  0.1661207 -1.0501639
 [7]  1.0398675  1.9692665  0.0233564  0.5069530 -0.4991102  3.0199885
[13] -0.0245339  0.3009507 -0.2069257

$negloglik
[1] 27089.02

   user  system elapsed 
  0.048   0.000   0.049 
List of 5
 $ prior       : num [1:3, 1] 0.397 0.302 0.301
 $ tpm         : num [1:3, 1:3] 0.527 0.374 0.148 0.296 0.314 ...
 $ zeroprop    : num 0.492
 $ emit        : num [1:3, 1] 9.9 40 70
 $ working_parm: num [1:1000, 1:12] -0.288 -0.279 -0.282 -0.284 -0.287 ...
   user  system elapsed 
  0.084   0.000   0.084 
List of 5
 $ prior       : num [1:2, 1] 0.422 0.578
 $ tpm         : num [1:2, 1:2] 0.862 0.144 0.138 0.856
 $ zeroprop    : num [1:4, 1] -0.453 0.394 -0.848 0.871
 $ emitparm    : num [1:2, 1:4] 2.2089 3.1114 -0.0937 -0.1075 0.4362 ...
 $ working_parm: num [1:1000, 1:15] 0.406 0.411 0.405 0.4 0.399 ...