qle: Simulated quasi-likelihood parameter estimation

Description Usage Arguments Details Value Author(s) See Also Examples

View source: R/qsOpt.R

Description

This is the main function of the simulated quasi-likelihood estimation (QLE) approach.

Usage

1
2
3
4
5
qle(qsd, sim, ..., nsim, x0 = NULL, obs = NULL, Sigma = NULL,
  global.opts = list(), local.opts = list(), method = c("qscoring",
  "bobyqa", "direct"), qscore.opts = list(), control = list(),
  errType = "kv", multistart = FALSE, pl = 0, cl = NULL, iseed = NULL,
  plot = FALSE)

Arguments

qsd

object of class QLmodel

sim

simulation function, see details

...

further arguments passed to the simulation function 'sim'

nsim

optional, number of simulation replications at each new sample point, 'qsd$nsim' (default)

x0

optional, numeric vector of starting parameters

obs

optional, numeric vector of observed statistics, overwrites 'qsd$obs'

Sigma

optional, constant variance matrix estimate of statistics (see details)

global.opts

options for global search phase

local.opts

options for local search phase

method

vector of names of local search methods

qscore.opts

list of control arguments passed to qscoring

control

list of control arguments passed to any of the routines defined in 'method'

errType

type of prediction variances, choose one of "kv,cv,max" (see details)

multistart

logical, FALSE (default), whether to search for local minimia or roots from multiple starting points at global phase

pl

print level, use pl>0 to print intermediate results

cl

cluster object, NULL (default), of class "MPIcluster", "SOCKcluster", "cluster"

iseed

integer seed, NULL (default) for default seeding of the random number generator (RNG) stream for each worker in the cluster

plot

if TRUE, plot newly sampled points (for 2D-parameter estimation problems only)

Details

The function sequentially estimates the unknown model parameter. Basically, the user supplies a simulation function 'sim' which must return a vector of summary statistics (as the outcome of model simulations) and expects a vector of parameters as its first argument. Further arguments can be passed by the '...' argument. The object 'qsd' aggregates the type of variance matrix approximation, the data frame of observed and simulated data, the initial sample points and the covariance models of the involved statistics (see QLmodel). In addition, it defines the criterion function by 'qsd$criterion', which is either used to monitor the sampling process or minimized itself. The user also has the option to choose among different types of prediction variances: either "kv" (kriging variances), "cv" (CV variances) or the maximum of both, by "max", are available.

Criterion functions

The QD criterion function follows the quasi-likelihood estimation principle (see vignette) and seeks a solution to the quasi-score equation. Besides, the Mahalanobis distance (MD) as an alternative (simulation-based) criterion function has a more direct interpretation. It can be seen as a (weighted or generalized) least squares criterion depending on the employed type of variance matrix approximation. For this reason, we support several types of variance matrix approximations. In particular, given 'Sigma' and setting 'qsd$var.type' equal to "const" treats 'Sigma' as a constant estimate throughout the whole estimation procedure. Secondly, if 'Sigma' is supplied and used as an average variance approximation (see covarTx), it is considered an initial variance matrix approximation and recomputed each time an approximate (local) minimizer of the criterion function is found. This is commonly known as an iterative update strategy of variance matrices in the context of GMM estimation. Opposed to this, setting 'qsd$var.type' equal to "kriging" corresponds to continuously updating the variance matrix each time a new criterion function value is required at any point of the parameter space. In this way the algorithm can also be seen as a simulated version of a least squares method or even as a special case of a simulated method of moments (see, e.g. [3]). Note that some input combinations concerning the variance approximation types are not applicable since the criterion "qle", which uses the QD criterion function, does not support a constant variance matrix at all.

Monte Carlo (MC) hypothesis testing

The algorithm sequentially evaluates promising local minimizers of the criterion function during the local phase in order to assess the plausibility of being an approximate root of the corresponding quasi-score vector. We use essentially the same MC test procedure as in qleTest. First, having found a local minimum of the test statistic, i.e. the criterion function, given the data, new observations are simulated w.r.t. to the local minimizer and the algorithm re-estimates the approximate roots for each observation independently. If the current minimizer is accepted as an approximate root at the significance level 'local.opts$alpha', then the algorithm stays in its local phase and continues sampling around the current minimizer accoring to its asymptotic variance (measured by the inverse of the predicted quasi-information) and uses the additional simulations to improve the current kriging approximations. Otherwise we switch to the global phase and do not consider the current minimizer as an approximate root.

This procedure also allows for a stopping condition derived from the reults of the MC test. We can compare the estimated mean squared error (MSE) with the predicted error of the approximate root by its relative difference and terminate in case this value drops below a user-defined bound 'perr_tol' (either as a scalar value or numeric vector of length equal to the dimension of the unknown parameter). A value close to zero suggests a good match of both error measures. The testing procedure is disabled by default. Use 'local.opts$test=TRUE' for testing approximate roots. A value of the criterion function smaller than 'local.opts$ftol_abs' indicates that the corresponding minimizer could be an approximate root. Otherwise the last evaluation point is used as a starting point for next local searches which mimics a random multistart type minimization over the next iterations of the algorithm. This behaviour is also implemented for results of the above MC test when the local minimizer is not accepted as an approximate root. Note that this approach has the potential to escape regions where the criterion function value is quite low but, however, is not considered trustworthy in relation to the upper bound 'local.opts$ftol_abs' or the results of the MC test procedure.

If one of the other termination criteria is met in conjunction with a neglectable value of the criterion function, we say that the algorithm successfully terminated and converged to a local minimizer of the criterion function which could be an approximate root of the quasi-score vector. We then can perform a goodness-of-fit test in order to assess its plausibility (see qleTest) and quantify the empirical and predicted estimation error. If we wish to improve the final estimate the algorithm allows for a simple warm start strategy though not yet as an fully automated procedure. The algorithm can be easily restarted based on the final result of the preceeding run. We only need to extract the object 'OPT$qsd' as an input argument to function qle again.

Sampling new points

Our QLE approach dynamically switches from a local to a global search phase and vise versa for sampling new promising candidates for evaluation, that is, performing new simulations of the statistical model. Depending on the current value of the criterion function three different sampling criteria are used to select next evaluation points which aim on potentially improving the quasi-score or criterion function approximation. If a local minimizer of the criterion function has been accepted as an approximate root, then a local search tries to improve its accuracy. The next evaluation point is either selected according to a weighted minimum-distance criterion (see [2] and vignette), for the choice 'nextSample' equal to "score", or by maximizing the weighted variance of the quasi-score vector in case 'nextSample' is equal to "var". In all other cases, for example, if identifiable roots of the QS could not be found or the (numerical) convergence of the local solvers failed, the global phase of the algorithm is invoked and selects new potential candidates accross the whole search space based on a weighted selection criterion. This assigns large weights to candidates with low criterion function values and vise versa. During both phases the cycling between local and global candidates is controlled by the weights 'global.opts$weights' and 'locals.opts$weights', respectively. Besides this, the smaller the weights the more the candidates tend to be globally selected and vice versa during the global phase. Within the local phase, weights approaching one result in selecting candidates close to the current minimizer of the criterion function. Weights approximately zero maximize the minimum distance between candidates and previously sampled points and thus densely sample the search space almost everywhere if the algorithm is allowed to run infinitely. The choice of weights is somewhat ad hoc but may reflect the users preference on guiding the whole estimation more biased towards either a local or global search. In addition the local weights can be dynamically adjusted if 'useWeights' is FALSE depending on the current progress of estimation. In this case the first weight given by 'locals.opts$weights' is initially used for this kind of adjustment.

Some notes: For a 2D parameter estimation problem the function can visualize the sampling and selection process, which requires an active 2D graphical device in advance. The function can also be run in an cluster environment using the 'parallel' package. Make sure to export all functions to the cluster environment 'cl' beforehand, loading required packages on each cluster node, which are used in the simulation function (see clusterExport and clusterApply). If no cluster object is supplied, a local cluster is set up based on forking (under Linux) or as a socket connection for other OSs. One can also set an integer seed value 'iseed' to initialize each worker, see clusterSetRNGStream, for reproducible results of estimation in case a local cluster is used, i.e. cl=NULL and option mc.cores>1. If using a prespecified cluster object cl, then the user is responsible for seeding whereas the seed can be stored in the return value as well, see attribute 'optInfo$iseed'.

The following controls 'local.opts' for the local search phase are available:

The following controls 'global.opts' for the global search phase are available:

Value

List of the following objects:

par

final parameter estimate

value

value of criterion function

ctls

a data frame with values of stopping conditions

qsd

final QLmodel object, including all sample points and covariance models

cvm

CV fitted covariance models

why

names of stopping conditions matched

final

final local minimization results of the criterion function, see searchMinimizer

score

quasi-score vector or gradient of the Mahalanobis distance

convergence

logical, whether the iterates converged, see details

Attributes:

tracklist

an object (list) of class QDtrack containing the local minimization results, evaluated sample points and the status of the corresponding iteration

optInfo

a list of arguments related to the estimation procedure:

Author(s)

M. Baaske

See Also

mahalDist, quasiDeviance, qleTest

Examples

1
2
3
4
5
6
7
data(normal)
 
# main estimation with new evaluations
# (simulations of the statistical model)
OPT <- qle(qsd,qsd$simfn,nsim=10,
		    global.opts=list("maxeval"=1),
 		local.opts=list("test"=FALSE))

mbaaske/qle documentation built on May 27, 2019, midnight