Extracting the Likelihood Function and Changing Optimizer
In ctsmTMB: Continuous Time Stochastic Modelling using Template Model Builder

library(ctsmTMB)

In this document we show how to use the likelihood method to obtain function handlers for the objective function, and gradient, (and hessian if using a Kalman filter), for instance to use another optimization algorithm than stats::nlminb.

Simulate from the Ornstein-Uhlenbeck process

We use the common Ornstein-Uhlenbeck process to showcase the use of likelihood.

$$ \mathrm{d}X_{t} = \theta (\mu - X_{t}) \, \mathrm{d}t \, + \sigma_{X} \, \mathrm{d}B_{t} $$

$$ Y_{t_{k}} = X_{t_{k}} + e_{t_{k}}, \qquad e_{t_{k}} \sim \mathcal{N}\left(0,\sigma_{Y}^{2}\right)
$$ We first create data by simulating the process

# Simulate data using Euler Maruyama
set.seed(10)
theta=10; mu=1; sigma_x=1; sigma_y=1e-1
# 
dt.sim = 1e-3
t.sim = seq(0,1,by=dt.sim)
dw = rnorm(length(t.sim)-1,sd=sqrt(dt.sim))
x = 3
for(i in 1:(length(t.sim)-1)) {
  x[i+1] = x[i] + theta*(mu-x[i])*dt.sim + sigma_x*dw[i]
}

# Extract observations and add noise
dt.obs = 1e-2
t.obs = seq(0,1,by=dt.obs)
y = x[t.sim %in% t.obs] + sigma_y * rnorm(length(t.obs))

# Create data
.data = data.frame(
  t = t.obs,
  y = y
)

Construct model object

We now construct the ctsmTMB model object

# Create model object
obj = ctsmTMB$new()

# Add system equations
obj$addSystem(
  dx ~ theta * (mu-x) * dt + sigma_x*dw
)

# Add observation equations
obj$addObs(
  y ~ x
)

# Set observation equation variances
obj$setVariance(
  y ~ sigma_y^2
)

# Specify algebraic relations
obj$setAlgebraics(
  theta ~ exp(logtheta),
  sigma_x ~ exp(logsigma_x),
  sigma_y ~ exp(logsigma_y)
)

# Specify parameter initial values and lower/upper bounds in estimation
obj$setParameter(
  logtheta   = log(c(initial = 5,    lower = 0,    upper = 20)),
  mu         = c(    initial = 0,    lower = -10,  upper = 10),
  logsigma_x = log(c(initial = 1e-1, lower = 1e-5, upper = 5)),
  logsigma_y = log(c(initial = 1e-1, lower = 1e-5, upper = 5))
)

# Set initial state mean and covariance
obj$setInitialState(list(x[1], 1e-1*diag(1)))

Estimation

We are in principle ready to call the estimate method to run the optimization scheme using the built-in optimization which uses stats::nlminb i.e.

fit = obj$estimate(.data)

Inside the package we optimise the objective function with respect to the fixed parameters using the construction function handlers from TMB::MakeADFun and parsing them to stats::nlminb i.e.

nll = TMB::MakeADFun(...)
opt = stats::nlminb(start=nll$par, objective=nll$fn, grad=nll$gr, hessian=nll$he)

Extract function handlers

The likelihood method allows you to retrieve the nll object that holds the negative log-likelihood, and its derivatives. The method takes arguments similar to those of estimate.

nll = obj$likelihood(.data)

The initial parameters (supplied by the user) are stored here

nll$par

The objective function can be evaluated by

nll$fn(nll$par)

The gradient can be evaluated by

nll$gr(nll$par)

The hessian can be evaluated by

nll$he(nll$par)

We can now use these to optimize the function using e.g. stats::optim instead.

Extract parameter lower/upper bounds

You can extract the parameter bounds specified when calling setParameter() method by using the getParameters method (note that nll$par and pars$initial are identical).

pars = obj$getParameters()
print(pars)

Optimize manually using `stats::optim`

We supply the initial parameter values, objective function handler and gradient handler, and parameter bounds to optim.

opt = stats::optim(par=nll$par, 
                   fn=nll$fn, 
                   gr=nll$gr, 
                   method="L-BFGS-B", 
                   lower=pars$lower, 
                   upper=pars$upper)

Compare results between the two optimizers

Lets compare the results from using stats::optim with the extracted function handler versus the internal optimisation that uses stats::nlminb stored in fit:

# Estimated parameters
data.frame(external=opt$par, internal=fit$par.fixed)

# Neg. Log-Likelihood
data.frame(external=opt$value, internal=fit$nll)

# Gradient components
data.frame(external=t(nll$gr(opt$par)), internal=t(nll$gr(fit$par.fixed)))