Fit 'depmix' or 'mix' models

Share:

Description

fit optimizes parameters of depmix or mix models, optionally subject to general linear (in)equality constraints.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
	## S4 method for signature 'mix'
fit(object, fixed=NULL, equal=NULL, 
		conrows=NULL, conrows.upper=NULL, conrows.lower=NULL, 
		method=NULL, verbose=TRUE,
		emcontrol=em.control(),
		solnpcntrl=list(rho = 1, outer.iter = 400, inner.iter = 800, 
		delta = 1e-7, tol = 1e-8),
		donlpcntrl=donlp2Control(),
		...)
	
	## S4 method for signature 'mix.fitted'
summary(object,which="all")
	
	## S4 method for signature 'depmix.fitted'
summary(object,which="all")

Arguments

object

An object of class (dep-)mix.

fixed

Vector of mode logical indicating which parameters should be fixed.

equal

Vector indicating equality constraints; see Details.

conrows

Rows of a general linear constraint matrix; see Details.

conrows.upper, conrows.lower

Upper and lower bounds for the linear constraints; see Details.

method

The optimization method; mostly determined by constraints.

verbose

Should optimization information be displayed on screen?

emcontrol

Named list with control parameters for the EM algorithm (see em.control).

solnpcntrl

Control parameters passed to the 'rsolnp' optimizer; see solnp for explanation and defaults used there.

donlpcntrl

Control parameters passed to 'donlp' optimizer; see ?donlp2Control for explanation and defaults used there; this can be used to tweak optimization but note that extra output is not returned.

which

Should summaries be provided for "all" submodels? Options are "prior", "response", and for fitted depmix models also "transition".

...

Further arguments passed on to the optimization methods.

Details

Models are fitted by the EM algorithm if there are no constraints on the parameters. Aspects of the EM algorithm can be controlled through the emcontrol argument; see details in em.control. Otherwise the general optimizers solnp, the default (from package Rsolnp) or donlp2 (from package Rdonlp2) are used which handle general linear (in-)equality constraints. These optimizers are selected by setting method='rsolnp' or method='donlp' respectively.

Three types of constraints can be specified on the parameters: fixed, equality, and general linear (in-)equality constraints. Constraint vectors should be of length npar(object); note that this hence includes redundant parameters such as the base category parameter in multinomial logistic models which is always fixed at zero. See help on getpars and setpars about the ordering of parameters.

The equal argument is used to specify equality constraints: parameters that get the same integer number in this vector are estimated to be equal. Any integers can be used in this way except 0 and 1, which indicate fixed and free parameters, respectively.

Using solnp (or donlp2), a Newton-Raphson scheme is employed to estimate parameters subject to linear constraints by imposing:

bl <= A*x <= bu,

where x is the parameter vector, bl is a vector of lower bounds, bu is a vector of upper bounds, and A is the constraint matrix.

The conrows argument is used to specify rows of A directly, and the conrows.lower and conrows.upper arguments to specify the bounds on the constraints. conrows must be a matrix of npar(object) columns and one row for each constraint (a vector in the case of a single constraint). Examples of these three ways of constraining parameters are provided below.

Note that when specifying constraints that these should respect the fixed constraints inherent in e.g. the multinomial logit models for the initial and transition probabilities. For example, the baseline category coefficient in a multinomial logit model is fixed on zero.

llratio performs a log-likelihood ratio test on two fit'ted models; the first object should have the largest degrees of freedom (find out by using freepars).

Value

fit returns an object of class depmix.fitted which contains the original depmix object, and further has slots:

message:

Convergence information.

conMat:

The constraint matrix A, see Details.

posterior:

The posterior state sequence (computed with the viterbi algorithm), and the posterior probabilities (delta probabilities in Rabiner, 1989, notation).

The print method shows the message along with the likelihood and AIC and BIC; the summary method prints the parameter estimates.

Posterior densities and the viterbi state sequence can be accessed through posterior.

As fitted models are depmixS4 models, they can be used as starting values for new fits, for example with constraints added. Note that when refitting already fitted models, the constraints, if any, are not added automatically, they have to be added again.

Author(s)

Ingmar Visser & Maarten Speekenbrink

References

Some of the below models for the speed data are reported in:

Ingmar Visser, Maartje E. J. Raijmakers and Han L. J. van der Maas (2009). Hidden Markov Models for Invdividual Time Series. In: Jaan Valsiner, Peter C. M. Molenaar, M. C. D. P. Lyra, and N. Chaudhary (editors). Dynamic Process Methodology in the Social and Developmental Sciences, chapter 13, pages 269–289. New York: Springer.

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
data(speed)

# 2-state model on rt and corr from speed data set 
# with Pacc as covariate on the transition matrix
# ntimes is used to specify the lengths of 3 separate series
mod1 <- depmix(list(rt~1,corr~1),data=speed,transition=~Pacc,nstates=2,
	family=list(gaussian(),multinomial("identity")),ntimes=c(168,134,137))
# fit the model
set.seed(3)
fmod1 <- fit(mod1)
fmod1 # to see the logLik and optimization information
# to see the parameters
summary(fmod1)

# to see the posterior state sequence and associated delta probabilties
pst <- posterior(fmod1)

# testing viterbi states for new data
df <- data.frame(corr=c(1,0,1),rt=c(6.4,5.5,5.3),Pacc=c(0.6,0.1,0.1))
# define model with new data like above
modNew <- depmix(list(rt~1,corr~1),data=df,transition=~Pacc,nstates=2,
	family=list(gaussian(),multinomial("identity")))
# get parameters from estimated model
modNew <- setpars(modNew,getpars(fmod1))
# check the state sequence and probabilities
viterbi(modNew)

# same model, now with missing data
## Not run: 
speed[2,1] <- NA
speed[3,2] <- NA

# 2-state model on rt and corr from speed data set 
# with Pacc as covariate on the transition matrix
# ntimes is used to specify the lengths of 3 separate series
mod1ms <- depmix(list(rt~1,corr~1),data=speed,transition=~Pacc,nstates=2,
	family=list(gaussian(),multinomial("identity")),ntimes=c(168,134,137))
# fit the model
set.seed(3)
fmod1ms <- fit(mod1ms)

## End(Not run)

# instead of the normal likelihood, we can also maximise the "classification" likelihood
# this uses the maximum a posteriori state sequence to assign observations to states
# and to compute initial and transition probabilities. 

fmod1b <- fit(mod1,emcontrol=em.control(classification="hard"))
fmod1b # to see the logLik and optimization information

# FIX SOME PARAMETERS

# get the starting values of this model to the optimized 
# values of the previously fitted model to speed optimization

pars <- c(unlist(getpars(fmod1)))

# constrain the initial state probs to be 0 and 1 
# also constrain the guessing probs to be 0.5 and 0.5 
# (ie the probabilities of corr in state 1)
# change the ones that we want to constrain
pars[1]=0
pars[2]=1 # this means the process will always start in state 2
pars[13]=0.5
pars[14]=0.5 # the corr parameters in state 1 are now both 0, corresponding the 0.5 prob
mod2 <- setpars(mod1,pars)

# fix the parameters by setting: 
free <- c(0,0,rep(c(0,1),4),1,1,0,0,1,1,1,1)
# fit the model
fmod2 <- fit(mod2,fixed=!free)

# likelihood ratio insignificant, hence fmod2 better than fmod1
llratio(fmod1,fmod2)


# ADDING SOME GENERAL LINEAR CONSTRAINTS

# set the starting values of this model to the optimized 
# values of the previously fitted model to speed optimization

## Not run: 

pars <- c(unlist(getpars(fmod2)))
pars[4] <- pars[8] <- -4
pars[6] <- pars[10] <- 10
mod3 <- setpars(mod2,pars)

# start with fixed and free parameters
conpat <- c(0,0,rep(c(0,1),4),1,1,0,0,1,1,1,1)
# constrain the beta's on the transition parameters to be equal
conpat[4] <- conpat[8] <- 2
conpat[6] <- conpat[10] <- 3

fmod3 <- fit(mod3,equal=conpat)

llratio(fmod2,fmod3)

# above constraints can also be specified using the conrows argument as follows
conr <- matrix(0,2,18)
# parameters 4 and 8 have to be equal, otherwise stated, their diffence should be zero,
# and similarly for parameters 6 & 10
conr[1,4] <- 1
conr[1,8] <- -1
conr[2,6] <- 1
conr[2,10] <- -1

# note here that we use the fitted model fmod2 as that has appropriate 
# starting values
fmod3b <- fit(mod3,conrows=conr,fixed=!free) # using free defined above


## End(Not run)

data(balance)
# four binary items on the balance scale task
mod4 <- mix(list(d1~1,d2~1,d3~1,d4~1), data=balance, nstates=2,
	family=list(multinomial("identity"),multinomial("identity"),
	multinomial("identity"),multinomial("identity")))

set.seed(1)
fmod4 <- fit(mod4)

## Not run: 

# add age as covariate on class membership by using the prior argument
mod5 <- mix(list(d1~1,d2~1,d3~1,d4~1), data=balance, nstates=2,
	family=list(multinomial("identity"),multinomial("identity"),
	multinomial("identity"),multinomial("identity")),
	prior=~age, initdata=balance)

set.seed(1)
fmod5 <- fit(mod5)

# check the likelihood ratio; adding age significantly improves the goodness-of-fit
llratio(fmod5,fmod4)


## End(Not run)