nhm.control: Ancillary arguments for controlling nhm fits

Description

Sets various logical and numeric parameters controlling a non-homogeneous Markov model fit. It is usually used within a call to nhm.

Usage

nhm.control(tmax = NULL, coarsen = FALSE, coarsen.vars = NULL, coarsen.lv = NULL,
            checks = FALSE, rtol = 1e-6, atol = 1e-6, fishscore = NULL,
            linesearch = FALSE, damped = FALSE, damppar = 0, obsinfo = TRUE,
            splits = NULL, safe = FALSE, ncores = 1, parallel_hess = TRUE,
            print.level = 2, maxLikcontrol = NULL, nlminb_control = list(),
            constrained = FALSE, lower_lim = -Inf, upper_lim = Inf,
            nlminb_scale = 1)

Arguments

tmax

Optional parameter to set the maximum time to which the Kolmogorov Forward equations should be integrated. Defaults to 1+max(time) if left unspecified.

coarsen

If TRUE the covariate values will be subjected to coarsening using K-means clustering, so there are fewer unique values. This is useful for large datasets with continuous covariates.

coarsen.vars

Vector of indices of the covariates that require coarsening. Must be supplied if coarsen=TRUE.

coarsen.lv

Number of unique covariate values to which the covariates should be coarsened.

checks

If TRUE, some basic checks will be performed to ensure the accuracy of the supplied intens function. This is mainly useful when a user-defined type="bespoke" intensity function is used, in which case the default is TRUE; otherwise the default is FALSE.

rtol

Relative error tolerance to be passed to lsoda. Default is 1e-6.

atol

Absolute error tolerance to be passed to lsoda. Default is 1e-6.

fishscore

If TRUE then the Fisher scoring algorithm will be used provided the model has no censoring, exact death times or misclassification. This is generally faster, but less robust than the BHHH algorithm.

linesearch

If TRUE and fishscore=TRUE then a line search will be performed to find the best step length in the Fisher scoring algorithm.

damped

If TRUE, the Fisher scoring algorithm will be damped (e.g. a Levenberg-type algorithm). Useful if some parameters are close to being unidentifiable.

damppar

Numerical damping parameter to be applied if damped=TRUE.

obsinfo

If TRUE the observed Fisher information will be computed in addition to the expected information when the Fisher scoring algorithm is used. For optimization with maxLik the observed Fisher information will be used as the Hessian rather than the squared gradient vectors.

splits

Optional vector of intermediate split times for solving the ODEs. Only needed if P(0,t) becomes singular for some t causing the optimization to stop. Should be a set of consecutive values less than tmax.

safe

If TRUE, the ODEs will be solved separately from each unique start time, rather than obtaining the transition probabilities by matrix inversion.

ncores

Number of cores to use. A value of 1 means no parallelization; with 2 or more, mclapply is used when solving the ODEs for different covariate patterns.

parallel_hess

If TRUE, parallelization using ncores is applied at the level of overall function evaluations when finding the final Hessian by finite differences.

print.level

For maxLik optimization; level of detail to print. Integer from 0 to 3. Defaults to 2.

maxLikcontrol

For maxLik optimization; optional list of control parameters to be passed to maxLik.

nlminb_control

For nlminb optimization; optional list of control parameters to be passed to nlminb.

constrained

If TRUE then box-constrained optimization using nlminb will be used rather than BHHH or Fisher scoring.

lower_lim

Lower limits for box-constrained optimization. Should be either a scalar or a numeric vector of length equal to the number of unknown parameters.

upper_lim

Upper limits for box-constrained optimization. Should be either a scalar or a numeric vector of length equal to the number of unknown parameters.

nlminb_scale

Numeric value to be used as the scale argument in nlminb.
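The constrained, lower_lim, upper_lim, nlminb_scale and nlminb_control arguments all relate to nlminb. As a rough illustration of what box-constrained optimization with nlminb looks like on its own, the sketch below minimises a toy negative log-likelihood. The data and the objective are hypothetical and have nothing to do with the likelihood that nhm constructs; only the nlminb arguments mirror the roles described above.

```r
# Toy objective (hypothetical): negative log-likelihood of an exponential
# sample, parameterised by the log-rate.
set.seed(1)
x <- rexp(200, rate = 2)
negloglik <- function(theta) -sum(dexp(x, rate = exp(theta), log = TRUE))

# Box-constrained minimisation. lower, upper, scale and control play the
# roles described for lower_lim, upper_lim, nlminb_scale and nlminb_control.
fit <- nlminb(start = 0, objective = negloglik,
              lower = -3, upper = 3,
              scale = 1,
              control = list(iter.max = 200))

fit$par          # estimated log-rate, guaranteed to lie in [-3, 3]
fit$convergence  # 0 indicates successful convergence
```

A scalar bound, as used here, applies to every parameter; a vector bound would give one limit per parameter.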

Details

tmax, rtol and atol refer directly to parameters of the lsoda function in deSolve and relate to how the Kolmogorov Forward Equations are numerically solved.
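To make the role of these tolerances concrete, here is a minimal sketch of integrating Kolmogorov Forward Equations with lsoda. The two-state model and its intensity function are hypothetical, and the code is not taken from nhm; it only shows where rtol, atol and an upper integration time enter a call to deSolve::lsoda. The block is skipped if deSolve is not installed.

```r
if (requireNamespace("deSolve", quietly = TRUE)) {
  # Hypothetical two-state model: 1 -> 2 with intensity q(t) = 0.1 * exp(0.05 * t)
  Q <- function(t) {
    q <- 0.1 * exp(0.05 * t)
    matrix(c(-q, q,
              0, 0), nrow = 2, byrow = TRUE)
  }

  # Forward equations dP(0,t)/dt = P(0,t) Q(t); P is stored column-wise as a vector
  forward <- function(t, p, parms) {
    P <- matrix(p, nrow = 2)
    list(as.vector(P %*% Q(t)))
  }

  tmax <- 10  # integrate from 0 up to this time
  out <- deSolve::lsoda(y = as.vector(diag(2)), times = c(0, tmax),
                        func = forward, parms = NULL,
                        rtol = 1e-6, atol = 1e-6)

  # Second row of the output holds the solution at tmax (first column is time)
  P_0_tmax <- matrix(out[2, -1], nrow = 2)
  P_0_tmax
}
```

Tightening rtol and atol gives a more accurate solution at the cost of more solver steps.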

coarsen, coarsen.vars and coarsen.lv are useful in situations where it is computationally infeasible (or unattractive) to compute the exact solution for all covariate patterns. These options implement an approximate solution in which the covariates are coarsened using K-means clustering, as proposed in Titman (2011).
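The idea of coarsening can be sketched with base R's kmeans. This is an illustration of the general technique under assumed data, not the internal code used by nhm: a hypothetical continuous covariate is clustered and each value is replaced by its cluster centre, so that far fewer distinct covariate patterns remain.

```r
set.seed(42)
age <- runif(500, min = 40, max = 90)   # hypothetical continuous covariate

k <- 10                                 # plays the role of coarsen.lv
cl <- kmeans(age, centers = k, nstart = 5)

# Replace each value by the centre of the cluster it was assigned to
age_coarse <- as.numeric(cl$centers[cl$cluster, 1])

length(unique(age))         # number of distinct values before coarsening
length(unique(age_coarse))  # at most k distinct values afterwards
max(abs(age - age_coarse))  # largest change introduced by coarsening
```

Fewer distinct values mean fewer systems of differential equations to solve, at the price of an approximation error that shrinks as k grows.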

linesearch, damped, damppar are specific to the Fisher scoring algorithm.

Setting obsinfo=TRUE will tend to give more accurate standard error estimates and gives more opportunity to check for non-convergence of the maximum likelihood procedure.

The option splits modifies the way in which the transition probabilities are computed. By default, nhm solves a single system of differential equations starting from 0 to obtain P(0,t) and then inverts the Chapman-Kolmogorov equation P(0,t) = P(0,t_0) P(t_0,t) to find P(t_0,t) for a given t_0 > 0. In some cases P(0,t_0) will be singular or effectively singular. If a split is specified at s, then nhm will find P(t_0,t) for t_0 > t* by solving the system of equations for P(t*,t), where t* is the smallest interval start time greater than or equal to s within the data. If nhm fails due to the lack of split times, the error message will advise on the interval in which the split should be introduced.
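The inversion step and the reason it can fail can be illustrated with small matrices. The transition probability matrices below are made up for the purpose of the example; the sketch only demonstrates the Chapman-Kolmogorov identity and how a near-singular P(0,t_0) shows up in its reciprocal condition number.

```r
# Hypothetical 3-state transition probability matrices (rows sum to 1)
P_0_t0 <- matrix(c(0.80, 0.15, 0.05,
                   0.00, 0.70, 0.30,
                   0.00, 0.00, 1.00), nrow = 3, byrow = TRUE)
P_t0_t <- matrix(c(0.90, 0.08, 0.02,
                   0.00, 0.85, 0.15,
                   0.00, 0.00, 1.00), nrow = 3, byrow = TRUE)

# Chapman-Kolmogorov: P(0,t) = P(0,t_0) P(t_0,t)
P_0_t <- P_0_t0 %*% P_t0_t

# Inversion: recover P(t_0,t) by solving P(0,t_0) X = P(0,t)
P_rec <- solve(P_0_t0, P_0_t)
all.equal(P_rec, P_t0_t)   # TRUE up to numerical tolerance

# A reciprocal condition number near zero flags an effectively singular matrix.
# Here almost all probability mass has already reached the absorbing state 3.
P_bad <- matrix(c(1e-12, 0,     1 - 1e-12,
                  0,     1e-12, 1 - 1e-12,
                  0,     0,     1), nrow = 3, byrow = TRUE)
rcond(P_0_t0)   # well conditioned
rcond(P_bad)    # close to zero, so inversion is unreliable
```

Restarting the differential equations at a split time avoids having to invert a matrix like P_bad.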

Note that the need for splits can also arise if the initial parameters specified are inappropriate, or for models where the likelihood is quite flat in some directions. Hence it will usually be better either to find more appropriate initial parameter estimates (for instance by fitting the analogous homogeneous model in msm) or to use constrained=TRUE and set lower and upper bounds for the parameter values, than to set many split values. The option safe=TRUE can also be chosen. This avoids any matrix inversion when finding the transition probabilities P(t_0,t), but comes at the cost of a large increase in computation time.

ncores allows parallel processing to be used, through the parallel package, to simultaneously solve the systems of differential equations for each covariate pattern. If ncores > 1 then ncores defines the mc.cores value in mclapply. Note that the data needs to include multiple covariate patterns for this to successfully increase computation speed.

parallel_hess specifies whether the parallelization should also apply to the computation of the final Hessian to compute the observed Fisher information (used if obsinfo=TRUE and either constrained=TRUE or fishscore=TRUE). Generally, this should be more efficient since each overall function evaluation should take approximately the same time. However, for large datasets and large numbers of cores it may cause memory issues. Setting parallel_hess=FALSE when ncores>1 means that the parallelization will instead apply within each function evaluation at the ODE solver stage.
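As a rough picture of parallelizing over covariate patterns, the sketch below uses mclapply with mc.cores on a stand-in task. The function solve_pattern and the vector of patterns are hypothetical placeholders for an expensive ODE solve; this is not the code path used inside nhm.

```r
library(parallel)

# Hypothetical stand-in for solving the ODE system for one covariate pattern
solve_pattern <- function(z) {
  Sys.sleep(0.05)   # pretend this is an expensive ODE solve
  exp(-0.1 * z)
}

patterns <- c(0, 1, 2, 3)   # distinct covariate patterns (illustrative)

# mclapply relies on forking, which is unavailable on Windows,
# so fall back to a single core there
ncores <- if (.Platform$OS.type == "windows") 1L else 2L

res <- mclapply(patterns, solve_pattern, mc.cores = ncores)
unlist(res)
```

With only one covariate pattern there is nothing to distribute, which is why multiple patterns are needed to see a speed-up.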

Value

A list containing the values of each of the above constants.
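A minimal sketch of building a control list is given below, using only argument names from the Usage section. The particular values are arbitrary choices for illustration, and the block is skipped if the nhm package is not installed. The resulting list is what would then be supplied in a call to nhm.

```r
if (requireNamespace("nhm", quietly = TRUE)) {
  # Tighter ODE tolerances than the defaults; all other settings left as is
  ctrl <- nhm::nhm.control(rtol = 1e-8, atol = 1e-8, obsinfo = TRUE)

  is.list(ctrl)   # the return value is a list
  str(ctrl)       # inspect the stored settings
}
```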

Author(s)

Andrew Titman a.titman@lancaster.ac.uk

References

Titman AC (2011). Flexible Nonhomogeneous Markov Models for Panel Observed Data. Biometrics, 67, 780-787.

See Also

nhm


nhm documentation built on Sept. 1, 2025, 1:08 a.m.
