knitr::opts_chunk$set(cache = 2, cache.lazy = FALSE, tidy = FALSE, warning = FALSE)
library(knitcitations)
set.seed(round(runif(1, 1, 1e4)))

Introduction

The DiscrimOD package is designed for generating optimal designs for model discrimination. For the case of two competing models, we consider the $T$- and $KL$-optimal criteria, which are based on evaluating the distance between the two models. For the case of more than two competing models, we adopt the max-min approach that maximizes the worst efficiency over all pairs of competing models in the model class.

We hybridize two types of optimization algorithms to tackle the maximin problem in these distance-measure-based design criteria. In this supplement, we provide complete instructions on how to use our DiscrimOD package. We also revisit several examples from the literature and show that our algorithms work successfully.

The remaining parts are organized as follows. In this section, we briefly review the $T$- and $KL$-optimal criteria and introduce our proposed algorithms. Section 2 is the complete user guide for the DiscrimOD package. Section 3 consists of 3 examples from the literature; we show how to use DiscrimOD to find the optimal discrimination designs under various setups for the competing models. Section 4 presents a real application in toxicology and finds the max-min $KL$-optimal design when the models have non-normal errors. Section 5 extends an example in @atkinson1975design and compares the newly proposed algorithm with the NestedPSO in @chen2015minimax. We show that the new algorithm is more efficient in tackling optimal discrimination design search problems.

Discrimination Designs

Throughout this article, the probability distribution function of the response variable, $y$, is denoted by $f(y \mid x, \theta, \sigma^2)$ with mean function $\eta(x, \theta)$, where $x$ is the independent variable in a pre-specified experimental region $\mathcal{X}$, $\theta$ is an unknown parameter vector and $\sigma^2$ is a nuisance parameter, usually the variance of the assumed probability distribution. Suppose there are several candidate models with different underlying probability distributions, $f_1(y\mid x,\theta_1,\sigma^2), \ldots, f_K(y\mid x,\theta_K,\sigma^2)$, where $\theta_i\in\Theta_i\subseteq R^{m_i}$ for some positive integers $m_i$. The goal is to find the optimal design for discriminating among these competing models.

We consider approximate designs. An approximate design is a probability measure on trials over the experimental region $\mathcal{X}$, which has the form $$ \xi = \left\{s_1, s_2, \ldots, s_n; p_1, p_2, \ldots, p_n\right\} $${#eq:approxDesign}

where $s_1, s_2, \ldots, s_n$ are $n$ distinct support points in $\mathcal{X}$ and $p_i$ is the corresponding weight at the $i$th support point, satisfying $\sum_ip_i=1$. In practice, suppose the total budget is $N$ runs. Then, based on the design $\xi$, roughly $p_iN$ observations are taken at the $i$th support point of $\xi$. For this type of experimental design, we can usually appraise the optimality of a design by the equivalence theorem.
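For instance, in this vignette the resulting designs are reported as matrices whose first column holds the support points and whose second column holds the weights. A minimal sketch of this representation, with hypothetical numbers, is shown below.

# A sketch of an approximate design in R: support points and weights, with
# the weights summing to one (this matches the $BESTDESIGN format used later)
xi <- cbind(support = c(-1, 0, 1), weight = c(0.25, 0.5, 0.25))
sum(xi[, "weight"])  # equals 1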

Discriminating Between Two Models {#twomodel}

There are several types of distance-measure-based discrimination design criteria for different underlying distribution assumptions. The first is the $T$-optimal criterion proposed by @atkinson1975design, which is based on the homoscedastic Gaussian assumption. Afterwards, many variants of this criterion were proposed to tackle other model assumptions. For example, @ucinski2004heteroscedastic extended it to the heteroscedastic Gaussian assumption. @carlos1993optimum considered the non-normal case and proposed a criterion for discriminating between two logistic regression models. The most general criterion so far was introduced by @lopez2007optimal, who used the $KL$-divergence to measure the difference between two models under wider assumptions on the underlying distribution. They also showed that their criterion is equivalent to many published discrimination criteria, such as the $T$-optimal criterion and those in @ucinski2004heteroscedastic and @carlos1993optimum. The criterion in @lopez2007optimal is called the $KL$-optimal criterion, and in this section we briefly introduce their approach.

Let $f_1(y\mid x,\theta_1,\sigma^2)$ and $f_2(y\mid x,\theta_2,\sigma^2)$ be the two competing models. Assume the true model, $f_t(y\mid x, \sigma^2)$, is $f_1(y\mid x,\theta_1,\sigma^2)$ with specified parameter $\theta_1$. To measure the difference between two competing models, the Kullback--Leibler ($KL$) divergence ([@eq:kldiv]) is used, i.e.,

$$ \mathcal{I}(f_t, f_2, x, \theta_2) = \int f_t(y\mid x,\sigma^2)\log{\left\{\frac{f_t(y\mid x,\sigma^2)}{f_2(y\mid x,\theta_2,\sigma^2)}\right\}}~dy, ~\forall~x\in\mathcal{X}. $${#eq:kldiv}

The $KL$-optimal criterion is the minimal value of the integrated $KL$ divergence over $\theta_2\in\Theta_2$, i.e.,

$$ I_{2,t}(\xi) = \underset{\theta_2\in\Theta_2}{\min}\left\{ \int_\mathcal{X} \mathcal{I}(f_t, f_2, x, \theta_2)~\xi(dx) \right\}, $${#eq:klCri}

and the design $\xi^*_{KL}$ is $KL$-optimal if it maximizes $I_{2,t}(\xi)$.

According to @lopez2007optimal, the equivalence theorem states that the design $\xi^*_{KL}$ is the $KL$-optimal design if and only if the directional derivative, $\psi_{KL}(x,\xi^*_{KL})$, satisfies

$$ \psi_{KL}\left(x,\xi^*_{KL}\right) = \mathcal{I}(f_t, f_2, x, \hat{\theta}_2(\xi^*_{KL})) - I_{2,t}(\xi^*_{KL}) \leq 0 $${#eq:klDisp}

for all $x \in \mathcal{X}$, where $\hat{\theta}_2(\xi)$ minimizes the integrated $KL$ divergence in ([@eq:klCri]) at $\xi = \xi^*_{KL}$. The equality in ([@eq:klDisp]) holds at each support point of $\xi^*_{KL}$. It is easy to show that the $T$-optimal criterion is a special case of the $KL$-optimal criterion when the homoscedasticity and normal error assumptions hold.

Discriminating Between More Than Two Models {#moremodel}

When the number of competing models is more than 2, say $f_1, f_2, \ldots, f_K$, $K>2$, @tommasi2007optimal and @tommasi2016max studied the corresponding discrimination design problem based on relative design efficiencies. Without loss of generality, we fix the first model as the true model, $f_t = f_1$, and summarize their approach in the following. Consider a two-step procedure. First, we identify the $KL$-optimal designs, $\xi^*_{KL,i}$, $i = 2, \ldots, K$, for discriminating between the $i$th rival model and the true model $f_t$ separately.
Given a design $\xi$, define the $KL$-efficiency of $\xi$ with respect to $\xi^*_{KL,i}$ as

$$ \mbox{Eff}_i(\xi) = \frac{I_{i,t}(\xi)}{I_{i,t}(\xi^*_{KL,i})}~, ~i = 2, \ldots, K, $${#eq:klEff}

where $I_{i,t}(\xi)$ is defined as in ([@eq:klCri]) with $f_2$ replaced by $f_i$. The idea of finding an optimal discrimination design for more than 2 competing models is to maximize all the $KL$-efficiencies as much as possible. Thus, finding the optimal discrimination design can be cast as a multi-objective optimization problem.

To accomplish this goal, @tommasi2007optimal considered a pre-specified weight vector, $\alpha=(\alpha_2,\ldots,\alpha_K)$ satisfying $0\leq\alpha_i\leq1$ and $\sum_{i=2}^K \alpha_i = 1$, and proposed the generalized $KL$-optimal design that maximizes the weighted sum of the $KL$-efficiencies based on $\alpha$, i.e. $\sum_i \alpha_i \mbox{Eff}_i(\xi)$. The $i$th component of $\alpha$ represents the importance of the $i$th rival model to the user. On the other hand, due to the lack of prior knowledge of $\alpha$, @tommasi2016max considered the worst case of the $KL$-efficiencies. Their idea is to maximize the minimal $KL$-efficiency among $\mbox{Eff}_i(\xi)$, $i = 2,\ldots,K$, that is, to maximize

$$ I_m(\xi) = \underset{2\leq i\leq K}{\min}\mbox{Eff}_i(\xi). $${#eq:mmklCri}

The design that maximizes ([@eq:mmklCri]) is called the max-min $KL$-optimal design.

Consider the equivalence theorem for the max-min $KL$-optimal design, $\xi^*_{mmKL}$. Define a subset, $\mathcal{C}\left(\xi^*_{mmKL}\right)$, consisting of the indices of the rival models closest to the true model, such that

$$ \mbox{Eff}_i(\xi^*_{mmKL}) < \mbox{Eff}_j(\xi^*_{mmKL}), ~i\in\mathcal{C}\left(\xi^*_{mmKL}\right), ~j\notin\mathcal{C}\left(\xi^*_{mmKL}\right). $${#eq:mmModelSet}

@tommasi2016max showed that there exists a weight vector in $[0,1]^{K-1}$, $\tilde\alpha = (\tilde\alpha_2, \ldots, \tilde\alpha_K)$ satisfying

$$ \sum_{i=2}^K \tilde\alpha_i = 1~\mbox{ and }~ \tilde\alpha_i = 0~\mbox{ if }~i\notin\mathcal{C}\left(\xi^*_{mmKL}\right) $${#eq:mmwt}

such that $\xi^*_{mmKL}$ is the max-min $KL$-optimal design if and only if it is also the generalized $KL$-optimal design under $\tilde\alpha$. Hence, adopting the equivalence theorem for the generalized $KL$-optimal design, we have

$$ \psi_{mmKL}\left(x,\xi^*_{mmKL}\right) = \sum_{i=2}^K \tilde\alpha_i~\frac{\mathcal{I}\left(f_t,f_i,x,\hat{\theta}_i\left(\xi^*_{mmKL}\right)\right) } {I_{i,t}\left(\xi^*_{KL,i}\right)} - I_m(\xi^*_{mmKL}) \leq 0 $${#eq:mmklDisp}

for all $x \in \mathcal{X}$, where

$$ \hat{\theta}_i\left(\xi\right) = \underset{\theta_i\in\Theta_i}{\arg\min} \int_\mathcal{X} \mathcal{I}(f_t,f_i,x,\theta_i)~\xi(dx), $$

and the equality holds at all support points of $\xi^*_{mmKL}$.

Algorithms {#alg}

PSO-QN Algorithm

For the case of two competing models, we propose to combine particle swarm optimization [PSO, @eberhart1995new] and the L-BFGS algorithm to tackle the maximin problem in ([@eq:klCri]). The PSO algorithm searches for the optimal design over the design space. Evaluating the criterion value of a design is itself a minimization problem and, because of its differentiability, the L-BFGS algorithm can be used for it. We call the proposed algorithm the PSO-QN algorithm.
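To make the role of the inner loop concrete, below is a minimal sketch of the inner minimization in ([@eq:klCri]) for the $T$-optimal (squared-difference) case. Here eta_t and eta_r are hypothetical mean functions, the design is given by support points s and weights p, and base R's optim() with method "L-BFGS-B" stands in for the package's internal L-BFGS routine.

# A sketch of the inner minimization in ([@eq:klCri]) for a fixed design
# xi = (s, p) under the T-optimal (squared-difference) case; 'eta_t'/'eta_r'
# are hypothetical mean functions and optim() replaces the internal L-BFGS
inner_criterion <- function(s, p, eta_t, para_t, eta_r, lower, upper) {
  obj <- function(theta) sum(p * (eta_t(s, para_t) - eta_r(s, theta))^2)
  optim(par = (lower + upper)/2, fn = obj, method = "L-BFGS-B",
        lower = lower, upper = upper)$value
}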

The PSO algorithm starts by initializing a flock of particles. Each particle is a design, $\xi$, represented as a vector $(s_1, \ldots, s_n, p_1, \ldots, p_{n-1})^\top$. Let $\xi_i^{(t)}$ be the $i$th particle at the $t$th iteration. Define $\xi_{i^*}^{(t-1)}$ to be the design with the maximal design criterion value discovered by the $i$th particle before the $t$th iteration, and $\xi_g^{(t-1)}$ to be the best design found by the whole swarm before the $t$th iteration. Let $V_i^{(t)}$ be the velocity of the $i$th particle at the $t$th iteration. PSO updates the particle positions by $$ V_i^{(t)} = \omega^{(t)} V_i^{(t-1)} + c_1 R_1 \otimes \left[\xi_{i^*}^{(t-1)} - \xi_i^{(t-1)}\right] + c_2 R_2 \otimes \left[\xi_g^{(t-1)} - \xi_i^{(t-1)}\right],
$${#eq:psov} $$ \xi_i^{(t)} = \xi_i^{(t-1)} + V_i^{(t)}~\mbox{ for }~i=1,\ldots,N. $${#eq:psou} The notation $\otimes$ indicates the componentwise product. The PSO parameters in ([@eq:psov]), such as $c_1$, $c_2$ and $\omega^{(t)}$, are described in Appendix A.

The following table shows the updating procedure of the PSO algorithm. Our proposed PSO-QN algorithm follows the same procedure, except that the calculation of the design criterion values in Step (2.3) is accomplished by the L-BFGS algorithm.


1: Define the design criterion function $\Phi(\xi, \theta)$, e.g. ([@eq:klCri]), and initialize particles
  - (1.1) Choose initial particles (designs) $\xi_i^{(0)}$ and velocities $V_i^{(0)}$, $i = 1, \ldots, N$.
  - (1.2) Calculate design criterion values $\Phi(\xi_i^{(0)}, \theta)$ for each $i$.
  - (1.3) Determine the local and global best designs, $\xi_{i^*}^{(0)}$ and $\xi_g^{(0)}$.
2: At the $t$th iteration, do
  - (2.1) Calculate particles' velocities $V_i^{(t)}$ by ([@eq:psov]).
  - (2.2) Update particles $\xi_i^{(t)}$ by ([@eq:psou]).
  - (2.3) Calculate design criterion values $\Phi(\xi_i^{(t)}, \theta)$.
  - (2.4) Update the local and global best designs $\xi_{i^*}^{(t)}$ and $\xi_g^{(t)}$.
3: Output the final global best design $\xi_g$ and $\Phi(\xi_g, \theta)$.


: The PSO algorithm for optimal design search problems. {#tbl:pso}
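For illustration only, the updates in Steps (2.1) and (2.2) could be coded in R as the minimal sketch below; this is not the package's internal implementation. The particle positions xi, velocities v and personal best positions pbest are stored as rows of matrices, and gbest is the global best vector.

# A sketch of the PSO updates in ([@eq:psov]) and ([@eq:psou]); illustration
# only, not the package's internal implementation
pso_update <- function(xi, v, pbest, gbest, w, c1, c2) {
  R1 <- matrix(runif(length(xi)), nrow(xi))  # random components of R_1
  R2 <- matrix(runif(length(xi)), nrow(xi))  # random components of R_2
  gb <- matrix(gbest, nrow(xi), ncol(xi), byrow = TRUE)
  v_new <- w*v + c1*R1*(pbest - xi) + c2*R2*(gb - xi)  # velocity update
  list(velocity = v_new, position = xi + v_new)        # position update
}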

PSO-S-QN Algorithm

According to the max-min $KL$-optimal criterion, we need to identify the $KL$-optimal designs, $\xi^*_{KL, i}$, for each pair of the $i$th model and the pre-specified true model; the max-min $KL$-optimal design search problem can then be cast as a nested optimization problem with three layers. To deal with the max-min $KL$-optimal criterion, we slightly modify the PSO-QN algorithm by extending it to a three-layer optimization algorithm. Instead of directly using the L-BFGS algorithm to obtain the fitness value, Step (2.3) is modified into the following two steps.

(2.3a) Use the L-BFGS algorithm to compute $\mbox{Eff}_i(\xi)$ in ([@eq:klEff]), minimizing over the parameter $\theta_i$, for each $i$.

(2.3b) Identify the minimal value of $\mbox{Eff}_i(\xi)$ as the fitness value.
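A minimal sketch of the modified step is given below; inner_crit_fns is a hypothetical list of functions that each use L-BFGS to compute $I_{i,t}(\xi)$ for one rival model, and opt_vals holds the pairwise optimal criterion values $I_{i,t}(\xi^*_{KL,i})$.

# A sketch of Steps (2.3a)-(2.3b): compute each KL-efficiency via the inner
# L-BFGS routine, then take the smallest one as the fitness value
maxmin_fitness <- function(xi, inner_crit_fns, opt_vals) {
  effs <- sapply(seq_along(inner_crit_fns),
                 function(i) inner_crit_fns[[i]](xi) / opt_vals[i])  # (2.3a)
  min(effs)                                                          # (2.3b)
}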

Implementation {#pkg}

This section is a comprehensive guideline for using our DiscrimOD package. We start with the installation of the current alpha-test version, which has not been uploaded to CRAN yet. Then we show how to configure the parameters of the PSO-QN and PSO-S-QN algorithms. To use the DiscrimOD package, the most important step is to properly specify the information on the target discrimination design problem; we show how to input the design criterion and the competing models in R. Finally, there are 3 main functions in the DiscrimOD package, which are designed to run the design search algorithm, to check the equivalence theorem and to calculate the design criterion value of a specified design.

Installation {#pkginstall}

The DiscrimOD package is currently in alpha test, i.e. it is open to the development team only. Before installing the DiscrimOD package, one needs to install R (version 3.4.0 or later) and the devtools package. The DiscrimOD package also depends on the Rcpp and RcppArmadillo packages; please make sure to install them as well. Next, use the install_github command in devtools to install the DiscrimOD package from Ping-Yang Chen's GitHub repository. The R codes are shown below.

install.packages(c("devtools", "Rcpp", "RcppArmadillo"))
devtools::install_github("PingYangChen/DiscrimOD")

After installation, we use library to load the DiscrimOD package.

library(DiscrimOD)

Algorithm Parameters {#pkgparam}

The optimal discrimination design search algorithms in the DiscrimOD package involve two types of algorithms, PSO and L-BFGS. The PSO settings are defined through the getPSOInfo() function. For complete information, including the default values of the PSO settings, please refer to Appendix A. The most influential tuning parameters of PSO are the swarm size, nSwarm, and the number of iterations, maxIter, which appear in the code below.

The settings for the L-BFGS algorithm are specified through the getLBFGSInfo() function. The complete information on the L-BFGS settings can be found in Appendix B.

The codes shown below are the algorithm settings used in this vignette. First, for the PSO-QN algorithm on discrimination designs for two models, we set 32 particles and 200 iterations for PSO, and, for each inner-loop computation, we run the L-BFGS algorithm twice. The remaining settings in both algorithms are kept at their default values.

# PSO basic settings
PSO_INFO <- getPSOInfo(nSwarm = 32, maxIter = 200)
# L-BFGS basic settings
LBFGS_INFO <- getLBFGSInfo(LBFGS_RETRY = 2)

Second, for more than 2 competing models, we use the PSO-S-QN algorithm to search for the max-min type design. This type of optimal design problem is harder, so we may use more particles and iterations than for the PSO-QN algorithm. Here, we set 32 particles and 400 iterations and reuse LBFGS_INFO as the L-BFGS settings in the PSO-S-QN algorithm.

PSO_INFO_MAXMIN <- getPSOInfo(nSwarm = 32, maxIter = 400)

In the DiscrimOD package, we can replace the L-BFGS algorithm by PSO for tackling the inner optimization problem in ([@eq:klCri]). By doing so, we find the optimal discrimination design by the NestedPSO algorithm in @chen2015minimax. To set up the NestedPSO algorithm, we specify each parameter in the getPSOInfo() function as a vector of length 2. The first component is the parameter for the outer PSO loop and the second is for the inner PSO loop. For example, the codes below set 32 particles and 200 iterations for the outer PSO loop, and 32 particles and 100 iterations for the inner PSO loop. Most importantly, we need to turn the L-BFGS algorithm off by specifying IF_INNER_LBFGS = FALSE in the getLBFGSInfo() function.

# Set NestedPSO options. The length of setting indicates the number of loops
NESTEDPSO_INFO <- getPSOInfo(nSwarm = c(32, 32), maxIter = c(200, 100))
# Turn off L-BFGS implementation for the inner optimization loop
LBFGS_NOTRUN <- getLBFGSInfo(IF_INNER_LBFGS = FALSE)

Distance Measure Function {#pkgintro1}

Based on the considered optimal design criterion, we need to define the distance measure function in R. The distance function has two inputs: the first input must be the vector of mean values of the true model and the second input must be the vector of mean values of the rival model. The output is the vector of distance measurements with the same length as the inputs.

# xt is the vector of mean values of the true model at each support point
# xr is the vector of mean values of the rival model at each support point
DISTANCE <- function(xt, xr) 
  "write the R codes for the distance measure between 'xt' and 'xr'"

For example, the distance measure in the $T$-optimal design criterion is the squared difference, $[\eta_t(x) - \eta_r(x, \theta_r)]^2$, and hence we define sq_diff <- function(xt, xr) (xt - xr)^2 for it. For the heteroscedastic Gaussian case [@ucinski2004heteroscedastic], the distance measure is \begin{equation} \frac{V_t(x) + [\eta_t(x) - \eta_r(x, \theta_r)]^2}{V_r(x, \theta_r)} - \log\left(\frac{V_t(x)}{V_r(x, \theta_r)}\right) \end{equation} where $V_t(x)$ and $V_r(x, \theta_r)$ are the variances of the normal distributions for the true and rival models, respectively. Supposing the variance is proportional to the squared mean, i.e. $V_j = \eta_j^2$, we define the R function as below.

# heteroscedastic Gaussian case
heter_norm <- function(xt, xr) {
  var_t <- xt^2; var_r <- xr^2
  (var_t + (xt - xr)^2)/var_r - log(var_t/var_r)
}

In Appendix C, we provide the codes for several distance measure functions in @lopez2007optimal.

Information on Competing Models {#pkgintro2}

This section is the key to using the DiscrimOD package.
An optimal discrimination design problem involves the mean functions of the competing models, the nominal values for the true model and the parameter spaces of the rival models. To specify all this information for our package, we need to properly create an R list with the required structure.

The guideline for setting up the R list can be found by calling the emptyModelList() function. The model list consists of several sub-lists and its length must be equal to the number of competing models.

emptyModelList(N_model = 2)

In the first sub-list, we input the information of the true model with two components: model stores the user-defined R function and para is the vector of nominal values.

Starting from the second sub-list, we input the information of the rival models; each sub-list is constructed in the same way. The sub-lists for the rival models have 3 components. The first one, model, is the user-defined R function. The second and third components are paraLower and paraUpper, the lower and upper bounds of the parameter space of the rival model. The length of each bound vector must be equal to the number of parameters in the rival model. A sketch of the resulting structure is shown below.
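For a discrimination problem with one rival model, the list therefore has the following shape. This is a sketch only; true_mean and rival_mean stand for hypothetical user-defined mean functions and the numbers are placeholders.

# A sketch of the model list for two competing models; 'true_mean' and
# 'rival_mean' are hypothetical user-defined mean functions
MODEL_INFO <- list(
  list(model = true_mean, para = c(1, 2)),  # true model with nominal values
  list(model = rival_mean,                  # rival model with its parameter space
       paraLower = c(-5, -5), paraUpper = c(5, 5))
)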

Later on, we will demonstrate the setup with various real examples. Please see those examples in Sections 3 to 5.

Main Functions {#pkgintro3}

When the model list and the distance function are well-prepared, we can call our main function DiscrimOD to search for the optimal discrimination design.
There are a few more vital inputs.
The number of support points, nSupp, should be larger than or equal to 2. The vectors of lower and upper bounds of the design space go into dsLower and dsUpper, respectively; their lengths must be the same and equal to the dimension of the design space.

The type of design criterion is specified through crit_type. Currently, the DiscrimOD package can search for $T$- and $KL$-optimal designs (crit_type = "pair_fixed_true") when there are two competing models, and, max-min $T$- and $KL$-optimal designs (crit_type = "maxmin_fixed_true") when there are more than two competing models.

For finding the max-min $T$- and $KL$-optimal designs, we need to know those $T$- and $KL$-optimal designs for pairwise discrimination in advance. Then, input their optimal criterion values as a vector in MaxMinStdVals.

# Search for the optimal discrimination design
DiscrimOD(MODEL_INFO, DISTANCE, nSupp, dsLower, dsUpper, 
          crit_type = "pair_fixed_true", MaxMinStdVals = NULL,
          PSO_INFO = NULL, LBFGS_INFO = NULL, seed = NULL, verbose = TRUE)

# Check the equivalence theorem for a resulting design
equivalence(ngrid = 100, DESIGN = NULL, PSO_RESULT = NULL, MODEL_INFO, DISTANCE, 
            dsLower, dsUpper, 
            crit_type = "pair_fixed_true", MaxMinStdVals = NULL, 
            PSO_INFO = NULL, LBFGS_INFO = NULL, ALPHA_PSO_INFO = NULL)

# Calculate the design criterion value of a specified design
designCriterion(DESIGN1, MODEL_INFO, DISTANCE, dsLower, dsUpper, 
                crit_type = "pair_fixed_true", MaxMinStdVals = NULL,
                PSO_INFO = NULL, LBFGS_INFO = NULL)

Selected Examples in the Literature {#examples}

In this section, to show that our DiscrimOD package can identify $T$-, $KL$- and max-min optimal designs, we revisit some examples in the literature. We select 4 papers. @atkinson1975design and @atkinson1975optimal worked on $T$-optimal designs for linear models. @lopez2007optimal was the first to propose the $KL$-optimal criterion for constructing discrimination designs for non-normal competing models; they found the $KL$-optimal designs for two nonlinear competing models under different error assumptions. @tommasi2016max proposed the max-min $KL$-optimal criterion for discriminating more than two competing models and illustrated their approach with an example of four logistic regression models.

Atkinson and Fedorov's Example {#af}

In @atkinson1975design and @atkinson1975optimal, there are three linear and homoscedastic models with mean responses given by \begin{align} \eta_1(x,\theta_1) & = \theta_{10} + \theta_{11}e^x + \theta_{12}e^{-x}, \\ \eta_2(x,\theta_2) & = \theta_{20} + \theta_{21}x + \theta_{22}x^2, \\ \eta_3(x,\theta_3) & = \theta_{30} + \theta_{31}\sin{\left(\frac{\pi x}{2}\right)} + \theta_{32}\cos{\left(\frac{\pi x}{2}\right)} + \theta_{33}\sin{\left(\pi x\right)}. \end{align} In addition to the normal error assumption, they also set the first model, $\eta_1(x,\theta_1)$, as the true model with parameters $\theta_1=(\theta_{10}, \theta_{11}, \theta_{12}) = (4.5, -1.5, -2)$. The design space is $x\in[-1,1]$.

We illustrate how to use the DiscrimOD package to identify the $T$-optimal design and show that our result is similar to those in @atkinson1975design. For discriminating all three models, @atkinson1975optimal used a sequential procedure to find a design that balances the $T$-optimal criterion values between the two cases of pairwise discrimination. However, their approach does not guarantee that the result is $T$-optimal. Here, we find a new optimal design based on the max-min $T$-optimal criterion, which differs from the result in @atkinson1975optimal.

To use the DiscrimOD package to find the optimal design for discriminating these 3 models, we first write the R function objects, af1 to af3, for $\eta_1$ to $\eta_3$, respectively.

af1 <- function(x, p) p[1] + p[2]*exp(x) + p[3]*exp(-x)
af2 <- function(x, p) p[1] + p[2]*x + p[3]*x^2
af3 <- function(x, p) p[1] + p[2]*sin(0.5*pi*x) + p[3]*cos(0.5*pi*x) + p[4]*sin(pi*x)

Before we search for the max-min $T$-optimal design, we need to identify the $T$-optimal designs for pairwise discrimination between the true model $\eta_1$ and the rival models $\eta_2$ and $\eta_3$, respectively. Thus, we define two model lists, one for each discrimination case. It is worth noting that the parameter space of the rival model parameters must be finite. One can set wide ranges for each parameter or use prior knowledge on where the minimal distance would occur. The codes are shown below.

# Set the nominal values for the true model
AF_para_af1 <- c(4.5, -1.5, -2.0)
# The first pair: \eta_1 vs \eta_2
af_info_12 <- list(
  # The first list should be the true model and the specified nominal values
  list(model = af1, para = AF_para_af1),
  # Then the rival models are listed accordingly, with their parameter spaces
  list(model = af2, paraLower = rep(-10, 3), paraUpper = rep(10, 3))
)
# The second pair: \eta_1 vs \eta_3
af_info_13 <- list(
  list(model = af1, para = AF_para_af1),
  list(model = af3, paraLower = rep(-10, 4), paraUpper = rep(10, 4))
)

For $T$-optimal designs, we specify the distance measure, squared difference between two means, in R.

# xt is the mean values of the true model
# xr is the mean values of the rival model
sq_diff <- function(xt, xr) (xt - xr)^2

Next, we call the DiscrimOD function to run the PSO-QN algorithm and identify the $T$-optimal designs for both cases of pairwise discrimination. We use the first case, $\eta_1$ versus $\eta_2$, to illustrate the details. In addition to the model list, distance function and the algorithm settings, we need to input the type of discrimination criterion for pairwise discrimination, crit_type = "pair_fixed_true", the number of support points (here, nSupp = 4) and the lower and upper bounds of the design space, dsLower = -1 and dsUpper = 1, respectively.

# Run PSO-QN Algorithm for discriminating \eta_1 and \eta_2
af_res_12 <- DiscrimOD(MODEL_INFO = af_info_12, DISTANCE = sq_diff,
                       nSupp = 4, dsLower = -1, dsUpper = 1,
                       crit_type = "pair_fixed_true",
                       PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO, 
                       seed = NULL, verbose = FALSE)

The output of DiscrimOD consists of the best design and its criterion value. It also reports the search path of the global best particle and computing time.

names(af_res_12)

The best design found by the PSO-QN algorithm is stored in the tag $BESTDESIGN.

round(af_res_12$BESTDESIGN, 3) 

The resulting design is a matrix with nSupp rows and 2 columns. The first column consists of the 4 support points and the second column holds the corresponding weights. In this case, the best design is $$\left\{-1.000, -0.669, 0.144, 0.957;~ 0.253, 0.428, 0.247, 0.072\right\}.$$

The design criterion value of the best design can be found in the $BESTVAL tag. In this case, the value is 0.001087 after rounding.

af_res_12$BESTVAL 

The computing time in seconds is in the $CPUTIME tag.

af_res_12$CPUTIME 

To verify whether the resulting design is optimal, we use the equivalence theorem. In the DiscrimOD package, we implement the equivalence function, to which we input the numerical result object, af_res_12, obtained by the DiscrimOD function. Based on the numerical result, the equivalence function computes the values of the directional derivative function on a grid of length ngrid. One can plot the curve of the directional derivative function with the R utility function plot; the x-axis is the grid vector of the design space, found in the $Grid_1 tag, and the y-axis is the values of the directional derivative function in the $DirDeriv tag. We can also pin the support points on the curve with the points function. For better visualization, we add a horizontal line at $y=0$: the resulting design is $T$-optimal if the values of the directional derivative function are smaller than or equal to zero, with equality (at least approximately) at the support points.

af_eqv_12 <- equivalence(ngrid = 100, PSO_RESULT = af_res_12, 
                         MODEL_INFO = af_info_12, DISTANCE = sq_diff, 
                         dsLower = -1, dsUpper = 1,
                         crit_type = "pair_fixed_true",
                         PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO)
# Draw the directional derivative curve
plot(af_eqv_12$Grid_1, af_eqv_12$DirDeriv, type = "l", col = "blue", 
     main = "af_res_12", xlab = "x", ylab = "Directional Derivative"); abline(h = 0)
points(af_res_12$BESTDESIGN[,1], rep(0, nrow(af_res_12$BESTDESIGN)), pch = 16)

For the second case of pairwise discrimination, $\eta_1$ versus $\eta_3$, the implementation of the PSO-QN algorithm is similar.
We input the model list for this case, af_info_13, which has been created earlier. In this case, we are searching for the 5-point design.

# Run PSO-QN Algorithm for discriminating \eta_1 and \eta_3
af_res_13 <- DiscrimOD(MODEL_INFO = af_info_13, DISTANCE = sq_diff, 
                       nSupp = 5, dsLower = -1.0, dsUpper = 1.0, 
                       crit_type = "pair_fixed_true",
                       PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO, 
                       seed = NULL, verbose = FALSE)

The design criterion value is 0.005715 after rounding and the resulting design is

round(af_res_13$BESTDESIGN, 3)

By using the equivalence function and plotting the directional derivative function, the equivalence theorem shows that the resulting design for the second case is $T$-optimal.

af_eqv_13 <- equivalence(ngrid = 100, PSO_RESULT = af_res_13, 
                         MODEL_INFO = af_info_13, DISTANCE = sq_diff, 
                         dsLower = -1, dsUpper = 1, 
                         crit_type = "pair_fixed_true",
                         PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO)
# Draw the directional derivative curve
plot(af_eqv_13$Grid_1, af_eqv_13$DirDeriv, type = "l", col = "blue", 
     main = "af_res_13", xlab = "x", ylab = "Directional Derivative"); abline(h = 0)
points(af_res_13$BESTDESIGN[,1], rep(0, nrow(af_res_13$BESTDESIGN)), pch = 16)

Now, we concentrate on finding the max-min $T$-optimal design for the competing models $\eta_1$, $\eta_2$ and $\eta_3$. Since we have already set the information lists for these three models, we can directly reuse the previously defined lists to form a new list for the max-min approach.

An important input for the max-min approach is the previously identified criterion values. One must be careful with the input order of the criterion value vector: since we order the rival models in the list as $\eta_2$ and then $\eta_3$, we need the same order for the pairwise discrimination results, that is, af_res_12$BESTVAL and then af_res_13$BESTVAL.

# Create model list for max-min approach
af_info_maxmin <- list(af_info_12[[1]], af_info_12[[2]], af_info_13[[2]])
# Define the vector of optimal criterion values for efficiency computations
af_vals_pair <- c(af_res_12$BESTVAL, af_res_13$BESTVAL)

To run the PSO-S-QN algorithm to search for the max-min optimal discrimination design, we input crit_type = "maxmin_fixed_true" and the model list that consists of all three models and necessary parameter information.
We also need to specify the vector of $T$-optimal criterion values in MaxMinStdVals for the algorithm to compute the $T$-efficiencies.

af_res_maxmin <- DiscrimOD(MODEL_INFO = af_info_maxmin, DISTANCE = sq_diff, 
                           nSupp = 5, dsLower = -1, dsUpper = 1, 
                           crit_type = "maxmin_fixed_true", 
                           MaxMinStdVals = af_vals_pair,
                           PSO_INFO = PSO_INFO_MAXMIN, LBFGS_INFO = LBFGS_INFO, 
                           seed = NULL, verbose = FALSE)

The resulting max-min design is $$\left\{-1.000, -0.704, -0.018, 0.570, 1.000;~ 0.227, 0.382, 0.216, 0.114, 0.061\right\}.$$

round(af_res_maxmin$BESTDESIGN, 3)

The criterion value is the minimum among the two $T$-efficiencies.
Here, the resulting minimal $T$-efficiency is 0.8062 after rounding.

af_res_maxmin$BESTVAL

To check whether the resulting design is max-min optimal or not, we use the equivalence theorem. In the equivalence function, we input the numerical result object by PSO_RESULT = af_res_maxmin, and the vector of $T$-optimal criterion values, MaxMinStdVals = af_vals_pair.

af_eqv_maxmin <- equivalence(ngrid = 100, PSO_RESULT = af_res_maxmin, 
                             MODEL_INFO = af_info_maxmin, DISTANCE = sq_diff, 
                             dsLower = -1, dsUpper = 1, 
                             crit_type = "maxmin_fixed_true", 
                             MaxMinStdVals = af_vals_pair, 
                             PSO_INFO = PSO_INFO_MAXMIN, LBFGS_INFO = LBFGS_INFO)
# Draw the directional derivative curve
plot(af_eqv_maxmin$Grid_1, af_eqv_maxmin$DirDeriv, type = "l", col = "blue", 
     main = "af_res_maxmin", xlab = "x", ylab = "Directional Derivative"); abline(h = 0)
points(af_res_maxmin$BESTDESIGN[,1], rep(0, nrow(af_res_maxmin$BESTDESIGN)), pch = 16)

For the max-min optimal design, the equivalence function first finds the $\alpha$ vector in ([@eq:mmwt]) by an ordinary PSO algorithm and outputs it in the $alpha tag. In this case, $\alpha=(0.683, 0.317)$.

# The weight of efficiency values
af_eqv_maxmin$alpha

Then it computes the directional derivative function values so that we can draw the equivalence plot with the plot and points functions. The plot shows, via the equivalence theorem, that the resulting design is max-min $T$-optimal.

Two Types of Michaelis-Menten Models {#mm}

In this section we show how to find the $KL$-optimal designs under different error assumptions with our DiscrimOD package, using the examples in @lopez2007optimal. Two pharmacokinetic models, the Michaelis-Menten (MM) model and the modified Michaelis-Menten (MMM) model, are used to describe the velocity of a chemical reaction; their mean functions are \begin{align} \mbox{MM:}\; & \eta_1(x, \theta_1) = \frac{V_1x}{K_1 + x}, ~\theta_1 = (V_1, K_1), \\ \mbox{MMM:}\; & \eta_2(x, \theta_2) = \frac{V_2x}{K_2 + x} + F_2x, ~\theta_2 = (V_2, K_2, F_2). \end{align} The variable $x$ is the substrate concentration in the experimental range $\mathcal{X}=[0.1, 5]$. For $j=1,2$, the parameter $V_j > 0$ is the reaction rate at the maximal concentration level, and $K_j > 0$ is the Michaelis-Menten constant, which represents the concentration at which half of the maximal velocity is reached. The MMM model generalizes the MM model by adding a linear term with coefficient $F_2$.

Consider two error assumptions: log-normally distributed and gamma distributed errors. We briefly introduce the formula of the $KL$ divergence under each assumption. Let the continuous response variable, $Y_j$, have mean $\eta_j(x, \theta_j)$ and unknown variance $\nu_j^2(x, \theta_j)$, and fix $\eta_2(x, \theta_2)$ to be the true model. First, suppose the response variable, $Y_j$, follows the log-normal distribution.
The $KL$ divergence between two log-normal models is $$ \mathcal{I}_{LN}(f_2, f_1, x, \theta_1) = \log{\left\{\frac{\sigma_1^2}{\sigma_2^2}\right\}} - \frac{\sigma_1^2 - \sigma_2^2 - \left(\mu_1 - \mu_2\right)^2}{2\sigma_1^2}, $${#eq:kllognorm} where $\mu_j$ and $\sigma_j^2$ are the mean and variance of $\log{(Y_j)}$, respectively, that are \begin{align} \mu_j & = \mu_j(x, \theta_j) = \log{\left\{\eta_j(x,\theta_j) \big/ \sqrt{1+\nu_j^2(x, \theta_j)\eta_j(x, \theta_j)^{-2}}\right\}}, \\ \sigma_j^2 & = \sigma_j^2(x, \theta_j) = \log{\left\{1+\nu_j^2(x, \theta_j)\eta_j(x, \theta_j)^{-2}\right\}}. \end{align}

Second, if $Y_j$ is gamma distributed, the $KL$ divergence between two gamma models is $$ \mathcal{I}_{G}(f_2, f_1, x, \theta_1) = \log{\left(\frac{\beta_2^{\delta_2} / \delta_2}{\beta_1^{\delta_1} / \delta_1}\right)} + (\delta_2 - \delta_1)\left[\frac{\Gamma'(\delta_2)}{\Gamma(\delta_2)}-\log{(\beta_2)}\right] - \frac{\delta_2(\beta_2 - \beta_1)}{\beta_2} $${#eq:klgamma} where $\Gamma(\cdot)$ denotes the Gamma function, and, $\delta_j$ and $\beta_j$ are shape and scale parameters of the gamma distribution, respectively, which have the forms $$ \delta_j = \delta_j(x,\theta_j) = \eta^2_j(x, \theta_j)/\nu^2_j(x,\theta_j) ~ \mbox{ and } ~ \beta_j = \beta_j(x,\theta_j) = \eta_j(x, \theta_j)/\nu^2_j(x,\theta_j). $$

For log-normal and gamma distributed data, a common assumption on the nuisance parameters $\nu_1^2$ and $\nu_2^2$ is that the responses have a constant coefficient of variation [@mccullagh1989generalized]; that is, $\nu_j(x, \theta_j) / \eta_j(x, \theta_j)=C$, where $C$ is a constant.
Thus, for two log-normal responses, $Y_1$ and $Y_2$, sharing a common and constant coefficient of variation, we have $\sigma_1^2 = \sigma_2^2 = \sigma^2$. For two gamma responses, a common and constant coefficient of variation implies that $\delta_1 = \delta_2 = \delta$ and $\beta_j = \eta_j(x, \theta_j)^{-1}$. For more details on the derivation, please refer to @lopez2007optimal. Here, we assume $\sigma^2 = 1$ and $\delta = 1$.

In @lopez2007optimal, the MMM model, $\eta_2(x, \theta_2)$, is treated as the true model with nominal values $\theta_2 = (V_2, K_2, F_2) = (1, 1, 1)$. To use our DiscrimOD package to find the $KL$-optimal designs, we first define the list object for the competing models.

# Two types of Michaelis-Menten models
enzyme2 <- function(x, p) p[1]*x/(p[2] + x) + p[3]*x  # Modified Michaelis-Menten
enzyme1 <- function(x, p) p[1]*x/(p[2] + x)           # Michaelis-Menten
# Specify the nominal values for the true model, 'enzyme2'
para_mmm_2 <- c(1, 1, 1)
# Create model list
model_mmm <- list(
  list(model = enzyme2, para = para_mmm_2),
  list(model = enzyme1, paraLower = c(1e-4, 1e-4), paraUpper = c(20, 20))
)

Then we define the $KL$ divergence under the log-normal and gamma assumptions.

# Log-normal model
log_norm_B <- function(xt, xr) {
  sigsq <- 1 # nuisance parameter
  var_t <- (exp(sigsq) - 1.0)*(xt^2) # variance (true model)
  var_r <- (exp(sigsq) - 1.0)*(xr^2) # variance (rival model)
  mu_t <- log(xt) - 0.5*log(1.0 + (var_t/(xt^2))) # mean (true model)
  mu_r <- log(xr) - 0.5*log(1.0 + (var_r/(xr^2))) # mean (rival model)
  ((mu_r - mu_t)^2)/(2*sigsq) # KL-divergence
}
# Gamma model
gamma_diff <- function(xt, xr) log(xr/xt) + (xt - xr)/xr # KL-divergence

Similar to the previous section, we use the DiscrimOD function to identify the $KL$-optimal design. For the log-normal case, we input the distance function DISTANCE = log_norm_B. The other settings have been introduced in Section 3.1.

# Run for finding KL-optimal design under lognormal assumption
mmm_res_lognB <- DiscrimOD(MODEL_INFO = model_mmm, DISTANCE = log_norm_B, 
                           nSupp = 3, dsLower = 0.1, dsUpper = 5, 
                           crit_type = "pair_fixed_true", 
                           PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO, 
                           seed = NULL, verbose = FALSE)

The algorithm shows that the resulting $KL$-optimal design under the log-normal assumption is

round(mmm_res_lognB$BESTDESIGN, 3) 

For the gamma case, we simply change the input for the distance function by DISTANCE = gamma_diff.

# Run for finding KL-optimal design under gamma assumption
mmm_res_gamma <- DiscrimOD(MODEL_INFO = model_mmm, DISTANCE = gamma_diff, 
                           nSupp = 3, dsLower = 0.1, dsUpper = 5, 
                           crit_type = "pair_fixed_true", 
                           PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO, 
                           seed = NULL, verbose = FALSE)

The resulting $KL$-optimal design under the gamma assumption is

round(mmm_res_gamma$BESTDESIGN, 3) 

We use the equivalence function to compute the directional derivative function values for both cases. The plots of the function curves, shown below, indicate that the two resulting designs are $KL$-optimal.

# Check equivalence theorem for KL-optimal design under lognormal error assumption
mmm_eqv_lognB <- equivalence(ngrid = 100, PSO_RESULT = mmm_res_lognB, 
                             MODEL_INFO = model_mmm, DISTANCE = log_norm_B, 
                             dsLower = 0.1, dsUpper = 5,  
                             crit_type = "pair_fixed_true", 
                             PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO)
# Check equivalence theorem for KL-optimal design under gamma error assumption
mmm_eqv_gamma <- equivalence(ngrid = 100, PSO_RESULT = mmm_res_gamma, 
                             MODEL_INFO = model_mmm, DISTANCE = gamma_diff, 
                             dsLower = 0.1, dsUpper = 5,  
                             crit_type = "pair_fixed_true", 
                             PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO)
# Draw the directional derivative curve
lay <- layout(matrix(1:2, 1, 2))
plot(mmm_eqv_lognB$Grid_1, mmm_eqv_lognB$DirDeriv, type = "l", col = "blue", 
     main = "mmm_res_lognB", xlab = "x", ylab = "Directional Derivative"); abline(h = 0)
points(mmm_res_lognB$BESTDESIGN[,1], rep(0, nrow(mmm_res_lognB$BESTDESIGN)), pch = 16)
plot(mmm_eqv_gamma$Grid_1, mmm_eqv_gamma$DirDeriv, type = "l", col = "blue", 
     main = "mmm_res_gamma", xlab = "x", ylab = "Directional Derivative"); abline(h = 0)
points(mmm_res_gamma$BESTDESIGN[,1], rep(0, nrow(mmm_res_gamma$BESTDESIGN)), pch = 16)

Four Logistic Regression Models {#logit}

We show that our PSO-S-QN algorithm can find the max-min $KL$-optimal design for four competing logistic regression models. In @tommasi2016max, there are four logistic models with different regression structures, \begin{align} \eta_1(x,\theta_1) & = \theta_{10}x, & & \theta_1=\theta_{10},\\ \eta_2(x,\theta_2) & = \theta_{20}+\theta_{21}x, & & \theta_2=(\theta_{20},\theta_{21}), \\ \eta_3(x,\theta_3) & = \theta_{30}x+\theta_{31}x^2, & & \theta_3=(\theta_{30},\theta_{31}),\\ \eta_4(x,\theta_4) & = \theta_{40}+\theta_{41}x+\theta_{42}x^2, & & \theta_4=(\theta_{40},\theta_{41},\theta_{42}). \end{align} The true model is chosen to be $\eta_t(x) = \eta_4(x, \theta_4)$ with $\theta_4 = (1,1,1)$. According to @carlos1993optimum and @lopez2007optimal, the $KL$ divergence for the $i$th rival logistic regression model, $i=1,2,3$, is $$ \mathcal{I}_{Logit}(\eta_4, \eta_i, x, \theta_i) = \pi_4\log{\left(\frac{\pi_4}{\pi_i}\right)} + (1-\pi_4)\log{\left(\frac{1 - \pi_4}{1 - \pi_i}\right)}, $${#eq:kllogit} where $$\pi_j = \pi_j(x, \theta_j) = \frac{\exp\{\eta_j(x, \theta_j)\}}{1 + \exp\{\eta_j(x, \theta_j)\}}, ~j = 1,2,3,4.$$

To find the max-min $KL$-optimal design, we first use the PSO-QN algorithm to find the $KL$-optimal designs for discriminating between the true model $\eta_t = \eta_4$ and each $\eta_i$, $i = 1, 2, 3$.

# Tommasi et al. (2016)
linlogi4 <- function(x, p) p[1] + p[2]*x + p[3]*(x^2)  # Logistic 4
linlogi3 <- function(x, p) x*(p[1] + p[2]*x)           # Logistic 3
linlogi2 <- function(x, p) p[1] + p[2]*x               # Logistic 2
linlogi1 <- function(x, p) p[1]*x                      # Logistic 1
# Specify the nominal value for the true model, 'linlogi4'.
para_logit_4 <- c(1, 1, 1)
# Create model list
model_logit <- list(
  list(model = linlogi4, para = para_logit_4),
  list(model = linlogi3, paraLower = c(-10, -10), paraUpper = c(10, 10)),
  list(model = linlogi2, paraLower = c(-10, -10), paraUpper = c(10, 10)),
  list(model = linlogi1, paraLower = -10, paraUpper = 10)
)
logit_diff <- function(xt, xr) {
  exp_t <- exp(xt)
  exp_r <- exp(xr)
  mu_t <- exp_t/(1 + exp_t)
  mu_r <- exp_r/(1 + exp_r)
  # Return KL-divergence
  mu_t*(log(mu_t) - log(mu_r)) + (1 - mu_t)*(log(1 - mu_t) - log(1 - mu_r))
}
# Cases of pairwise discrimination
two_model_logit <- list(
  list(model_logit[[1]], model_logit[[2]]),
  list(model_logit[[1]], model_logit[[3]]),
  list(model_logit[[1]], model_logit[[4]])
)
# Number of support points 
two_logit_nSupp <- c(3, 3, 3)
# Run for each case
logit_res_pair <- vector("list", 3)
for (CaseID in 1:3) {
  res <- DiscrimOD(MODEL_INFO = two_model_logit[[CaseID]], DISTANCE = logit_diff, 
                   nSupp = two_logit_nSupp[CaseID], dsLower = 0, dsUpper = 1, 
                   crit_type = "pair_fixed_true", 
                   PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO,
                   seed = NULL, verbose = FALSE) 

  eqv <- equivalence(ngrid = 100, PSO_RESULT = res, 
                     MODEL_INFO = two_model_logit[[CaseID]], DISTANCE = logit_diff,
                     dsLower = 0, dsUpper = 1, 
                     crit_type = "pair_fixed_true", 
                     PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO)

  logit_res_pair[[CaseID]] <- list(res = res, eqv = eqv)
}

The numerical results for the pairwise $KL$-optimal designs, not shown here, are similar to those in @tommasi2016max. We then use the PSO-S-QN algorithm to find the max-min $KL$-optimal design, denoted by $\xi^*_{mmKL}$, with the result shown below.

# Get criterion values from pairwise discrimination results
logit_vals_pair <- sapply(1:length(logit_res_pair), function(i) {
  logit_res_pair[[i]]$res$BESTVAL
})
# Run for finding max-min KL-optimal design 
logit_res_mxmn <- DiscrimOD(MODEL_INFO = model_logit, DISTANCE = logit_diff, 
                            nSupp = 3, dsLower = 0, dsUpper = 1, 
                            crit_type = "maxmin_fixed_true", 
                            MaxMinStdVals = logit_vals_pair,
                            PSO_INFO = PSO_INFO_MAXMIN, LBFGS_INFO = LBFGS_INFO, 
                            seed = NULL, verbose = FALSE)

We also show the design, $\xi_{TML2016}$, found by @tommasi2016max, and find that both results are similar:

$$\left\{0.000, 0.360, 1.000;~ 0.618, 0.239, 0.142\right\}.$$

round(logit_res_mxmn$BESTDESIGN, 3) 

To check whether the resulting design is max-min optimal or not, we use the equivalence theorem. In the equivalence function, we input the numerical result PSO_RESULT = logit_res_mxmn and the vector of $KL$-optimal criterion values, MaxMinStdVals = logit_vals_pair.

logit_eqv_mxmn <- equivalence(ngrid = 100, PSO_RESULT = logit_res_mxmn, 
                              MODEL_INFO = model_logit, DISTANCE = logit_diff, 
                              dsLower = 0, dsUpper = 1,  
                              crit_type = "maxmin_fixed_true", 
                              MaxMinStdVals = logit_vals_pair, 
                              PSO_INFO = PSO_INFO_MAXMIN, LBFGS_INFO = LBFGS_INFO)
# Draw the directional derivative curve
plot(logit_eqv_mxmn$Grid_1, logit_eqv_mxmn$DirDeriv, type = "l", col = "blue", 
     main = "logit_res_mxmn", xlab = "x", ylab = "Directional Derivative"); abline(h = 0)
points(logit_res_mxmn$BESTDESIGN[,1], rep(0, nrow(logit_res_mxmn$BESTDESIGN)), pch = 16)

For the max-min optimal design, the equivalence function first finds the $\alpha$ vector in ([@eq:mmwt]) by an ordinary PSO algorithm and outputs it in the $alpha tag. In this case, $\alpha=(0.591, 0.409, 0.000)$.

# The weight of efficiency values
logit_eqv_mxmn$alpha

We draw the curve of the directional derivative function with the plot function. The plot shows, via the equivalence theorem, that the resulting design is max-min $KL$-optimal.

Five Dose-Response Models in Toxicology {#tox}

In this section, we show how to use the PSO-QN and PSO-S-QN algorithms to find the optimal discrimination designs in a real application. @piersma2000developmental proposed a class of 5 dose-response models that are widely used in modelling continuous endpoints in toxicology. The mean responses of these models are given below.

\begin{align} \eta_1(x,\theta_1) & = a, && \quad\theta_1 = a > 0, \\ \eta_2(x,\theta_2) & = a e^{-x/b}, && \quad\theta_2 = (a,b)^\top, a > 0, b > 0, \\ \eta_3(x,\theta_3) & = a e^{-(x/b)^d}, && \quad\theta_3 = (a,b,d)^\top, a > 0, b > 0, d\geq1, \\ \eta_4(x,\theta_4) & = a \left(c - (c-1)e^{-x/b}\right), && \quad\theta_4 = (a,b,c)^\top, a > 0, b > 0, c\in[0,1], \\ \eta_5(x,\theta_5) & = a \left(c - (c-1)e^{-(x/b)^d}\right), && \quad\theta_5 = (a,b,c,d)^\top, a > 0, b > 0, c\in[0,1], d\geq1. \end{align}

In particular, @piersma2000developmental were interested in how exposure to butyl benzyl phthalate (BBP) in maternal animals during gestation affects fetal weights. The experimental region is $x\in[0, 1250]$. @slob2002dose used the dataset in @piersma2000developmental to illustrate the model selection procedure for this model class. They concluded that $\eta_5$ describes the dataset most accurately and reported the estimated parameters $\hat\theta_5=(\hat{a},\hat{b},\hat{c},\hat{d}) = (4.282$, $835.571$, $0.739$, $3.515)$. We assume the largest model, $\eta_5$, is the true model with nominal values $\hat\theta_5$. In this section, all errors are assumed to be log-normally distributed and the goal is to search for the optimal discrimination designs for all 5 models, $\eta_1$ to $\eta_5$.

Here, we provide a coding style different from that in Section 3 for organizing the model list to achieve the same goal. We first create the list consisting of all 5 models, which will be used in finding the max-min optimal discrimination design. The first sub-list is the true model $\eta_5$ with nominal values $\hat\theta_5$. Starting from the second sub-list, we input the information for models $\eta_4$, $\eta_3$, $\eta_2$ and $\eta_1$, in that order. Based on trial and error, for each rival model, we can roughly identify a parameter space that includes the location of the minimal distance between the two models.

tox5_par <- c(4.282, 835.571, 0.739, 3.515)
tox_info <- list(
  # tox_5
  list(model = function(x, p) p[1]*(p[3] - (p[3] - 1)*exp(-(x/p[2])^p[4])),
       para = tox5_par),
  # tox_4
  list(model = function(x, p) p[1]*(p[3] - (p[3] - 1)*exp(-(x/p[2]))),
       paraLower = c(0, 0, 0),
       paraUpper = c(20, 5000, 1)),
  # tox_3
  list(model = function(x, p) p[1]*exp(-(x/p[2])^p[3]),
       paraLower = c(0, 0, 1),
       paraUpper = c(20, 5000, 15)),
  # tox_2
  list(model = function(x, p) p[1]*exp(-(x/p[2])),
       paraLower = c(0, 0),
       paraUpper = c(20, 5000)),
  # tox_1
  list(model = function(x, p) rep(p[1], length(x)),
       paraLower = c(0),
       paraUpper = c(20))
)

According to @lopez2007optimal, the $KL$-divergence between two log-normal models is

\begin{equation}\label{eq:kllognorm} \mathcal{I}_{LN}(f_2, f_1, x, \theta_1) = \log{\left\{\frac{\sigma_1^2}{\sigma_2^2}\right\}} - \frac{\sigma_1^2 - \sigma_2^2 - \left(\mu_1 - \mu_2\right)^2}{2\sigma_1^2}, \end{equation}

where $\mu_j$ and $\sigma_j^2$ are the mean and variance of $\log{(Y_j)}$, respectively, that are

\begin{align} \mu_j & = \mu_j(x, \theta_j) = \log{\left\{\eta_j(x,\theta_j) \big/ \sqrt{1+\nu_j^2(x, \theta_j)\eta_j(x, \theta_j)^{-2}}\right\}}, \\ \sigma_j^2 & = \sigma_j^2(x, \theta_j) = \log{\left\{1+\nu_j^2(x, \theta_j)\eta_j(x, \theta_j)^{-2}\right\}}. \end{align}

The parameters $\nu_1^2$ and $\nu_2^2$ are considered nuisance parameters, and @lopez2007optimal applied the constant coefficient of variation assumption [@mccullagh1989generalized]. That is, $\nu_j(x, \theta_j) / \eta_j(x, \theta_j)$ is constant, which implies that $\sigma_1^2 = \sigma_2^2 = \sigma^2$; here we assume $\sigma^2 = 0.01$. The R code for this distance measure is shown below.

log_norm_B <- function(xt, xr) {
  sigsq <- 0.01 # nuisance parameter
  var_t <- (exp(sigsq) - 1.0)*(xt^2) # variance (true model)
  var_r <- (exp(sigsq) - 1.0)*(xr^2) # variance (rival model)
  mu_t <- log(xt) - 0.5*log(1.0 + (var_t/(xt^2))) # mean (true model)
  mu_r <- log(xr) - 0.5*log(1.0 + (var_r/(xr^2))) # mean (rival model)
  ((mu_r - mu_t)^2)/(2*sigsq) # KL-divergence
}

Before searching for the max-min discrimination design, we need to identify the $KL$-optimal designs for the 4 cases of pairwise discrimination. Thus, to use the DiscrimOD package, we first re-organize the full model list tox_info into 4 smaller model lists, each containing $\eta_5$ and one rival model. The numbers of support points for the 4 cases are 3, 4, 3 and 2. Then we call the DiscrimOD function to find the $KL$-optimal designs and use the equivalence function to verify the optimality of the resulting designs.

# Cases of pairwise discrimination
two_tox_info <- list(
  list(tox_info[[1]], tox_info[[2]]), # tox_5 vs tox_4
  list(tox_info[[1]], tox_info[[3]]), # tox_5 vs tox_3
  list(tox_info[[1]], tox_info[[4]]), # tox_5 vs tox_2
  list(tox_info[[1]], tox_info[[5]])  # tox_5 vs tox_1 
)
# Number of support points 
two_tox_nSupp <- c(3, 4, 3, 2)
# Run for each case
tox_res_pair <- vector("list", 4)
for (CaseID in 1:4) {
  res <- DiscrimOD(MODEL_INFO = two_tox_info[[CaseID]], DISTANCE = log_norm_B, 
                   nSupp = two_tox_nSupp[CaseID], dsLower = 0, dsUpper = 1250, 
                   crit_type = "pair_fixed_true", 
                   PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO,
                   seed = NULL, verbose = FALSE) 

  eqv <- equivalence(ngrid = 100, PSO_RESULT = res, 
                     MODEL_INFO = two_tox_info[[CaseID]], DISTANCE = log_norm_B,
                     dsLower = 0, dsUpper = 1250, 
                     crit_type = "pair_fixed_true", 
                     PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO)

  tox_res_pair[[CaseID]] <- list(res = res, eqv = eqv)
}

The resulting design for each case can be found in tox_res_pair[[i]]$res$BESTDESIGN; we omit them here and directly draw the directional derivative plots for each case. In some curves, a few points might appear to lie above zero, but the maximal value of the directional derivative function is still smaller than $10^{-4}$; this suggests that the PSO-QN algorithm finds the $KL$-optimal designs for all cases. One can check this numerically as shown below.
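A quick numerical check, using the $DirDeriv values stored in each eqv object:

# Maximal directional-derivative value in each pairwise case; all values
# should be below the 1e-4 tolerance mentioned above
sapply(tox_res_pair, function(r) max(r$eqv$DirDeriv))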

lay <- layout(matrix(1:4, 2, 2, byrow = TRUE))
for (i in 1:4) {
  tmp <- tox_res_pair[[i]] 
  plot(tmp$eqv$Grid_1, tmp$eqv$DirDeriv, type = "l", col = "blue", 
       main = paste0("Tox 5 vs. Tox ", 5-i),
       xlab = "x", ylab = "Directional Derivative"); abline(h = 0)
  points(tmp$res$BESTDESIGN[,1], rep(0, nrow(tmp$res$BESTDESIGN)), pch = 16)
}

Finally, we search for the max-min $KL$-optimal design by the PSO-S-QN algorithm. From the collection of pairwise discrimination results, we collect the criterion values into the vector tox_vals_pair.

# Get criterion values from pairwise discrimination results
tox_vals_pair <- sapply(1:length(tox_res_pair), function(i) tox_res_pair[[i]]$res$BESTVAL)
# Run for finding max-min KL-optimal design 
tox_res_maxmin <- DiscrimOD(MODEL_INFO = tox_info, DISTANCE = log_norm_B, 
                            nSupp = 4, dsLower = 0, dsUpper = 1250, 
                            crit_type = "maxmin_fixed_true", 
                            MaxMinStdVals = tox_vals_pair,
                            PSO_INFO = PSO_INFO_MAXMIN, LBFGS_INFO = LBFGS_INFO, 
                            seed = NULL, verbose = FALSE)

The resulting max-min discrimination design is $$\left\{0.000, 450.437, 1042.138, 1250.000;~ 0.224, 0.342, 0.244, 0.189\right\}.$$

round(tox_res_maxmin$BESTDESIGN, 3) 

The minimal $KL$-efficiency among 4 efficiencies is 0.7679 after rounding.

tox_res_maxmin$BESTVAL 

To check whether the resulting design is max-min optimal or not, we use the equivalence theorem. In the equivalence function, we input the numerical result PSO_RESULT = tox_res_maxmin and the vector of $KL$-optimal criterion values, MaxMinStdVals = tox_vals_pair.

tox_eqv_maxmin <- equivalence(ngrid = 100, PSO_RESULT = tox_res_maxmin, 
                              MODEL_INFO = tox_info, DISTANCE = log_norm_B, 
                              dsLower = 0, dsUpper = 1250,  
                              crit_type = "maxmin_fixed_true", 
                              MaxMinStdVals = tox_vals_pair, 
                              PSO_INFO = PSO_INFO_MAXMIN, LBFGS_INFO = LBFGS_INFO)
# Draw the directional derivative curve
plot(tox_eqv_maxmin$Grid_1, tox_eqv_maxmin$DirDeriv, type = "l", col = "blue", 
     main = "tox_res_maxmin", xlab = "x", ylab = "Directional Derivative"); abline(h = 0)
points(tox_res_maxmin$BESTDESIGN[,1], rep(0, nrow(tox_res_maxmin$BESTDESIGN)), pch = 16)
# The weight of efficiency values
tox_eqv_maxmin$alpha

We draw the curve of the directional derivative function with the plot function. The plot shows, via the equivalence theorem, that the resulting design is max-min $KL$-optimal.

PSO-QN versus NestedPSO {#vs}

In this section, we show how to change the inner optimization loop in ([@eq:klCri]) from L-BFGS to PSO, i.e. how to implement the NestedPSO algorithm for finding the optimal discrimination design. We also compare the performance of our PSO-QN algorithm with the NestedPSO algorithm in the example below.

We follow @atkinson1975design and extend their example with more parameters. Consider two nonlinear models with mean responses given by \begin{align} \eta_1(x,\theta_1) & = \theta_{10} + \theta_{11}e^{\theta_{12} x} + \theta_{13}e^{-\theta_{14}x}, \label{eq:afex1} \\ \eta_2(x,\theta_2) & = \theta_{20} + \theta_{21}x + \theta_{22}x^2 + \theta_{23}x^3. \label{eq:afex2} \end{align} We assume normal errors for both models and set the first model, $\eta_1(x,\theta_1)$, as the true model with parameters $\theta_1=(\theta_{10}, \theta_{11}, \theta_{12}, \theta_{13}, \theta_{14}) = (4.5, -1.5, 0.5, -2, 0.5)$. The design space is $x\in[-1,1]$.

To use the DiscrimOD package to find the $T$-optimal design for discriminating these two models, we first write the R functions afex1 and afex2 for $\eta_1$ and $\eta_2$, respectively.

# models
afex1 <- function(x, p) p[1] + p[2]*exp(p[3]*x) + p[4]*exp(-p[5]*x)
afex2 <- function(x, p) p[1] + p[2]*x + p[3]*(x^2) + p[4]*(x^3)
# Nominal values
afex1_para <- c(4.5, -1.5, 0.5, -2.0, 0.5)
# model list
afex_list <- list(
  list(model = afex1, para = afex1_para),
  list(model = afex2, paraLower = rep(-10, 4), paraUpper = rep(10, 4))
)
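Before running the package routines, it may help to see what the inner optimization in the $T$-optimal criterion looks like for this example. Below is a minimal base-R sketch, not the package's implementation: for an arbitrary fixed design (the support points and weights here are made up for illustration), it finds the cubic-model parameters that minimize the weighted squared distance to $\eta_1$ at its nominal values.

# Inner minimization sketch: fit the rival model afex2 to the true model
# afex1 over a fixed design, in the weighted squared-distance sense
xs <- c(-1, -0.5, 0, 0.5, 1)   # hypothetical support points
ws <- rep(0.2, 5)              # hypothetical (uniform) weights
inner_obj <- function(p2) sum(ws * (afex1(xs, afex1_para) - afex2(xs, p2))^2)
fit <- optim(rep(0, 4), inner_obj, method = "L-BFGS-B",
             lower = rep(-10, 4), upper = rep(10, 4))
fit$value   # minimal distance for this particular design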

The settings of DiscrimOD for running the PSO-QN algorithm were introduced in Sections 3 and 4. The code is shown below.

afex_res <- DiscrimOD(afex_list, sq_diff, nSupp = 5, dsLower = -1, dsUpper = 1,
                      crit_type = "pair_fixed_true", MaxMinStdVals = NULL,
                      PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO,
                      seed = NULL, verbose = FALSE)

The best design found by the PSO-QN algorithm is $$\left\{-1.000, -0.709, -0.003, 0.706, 1.000;\; 0.127, 0.251, 0.250, 0.248, 0.124\right\}$$ and its criterion value is $1.331\times10^{-6}$ after rounding.

round(afex_res$BESTDESIGN, 3)
afex_res$BESTVAL

To implement the NestedPSO algorithm, we use the third set of parameter settings introduced in Section 2.2, namely NESTEDPSO_INFO and LBFGS_NOTRUN.

afex_npso <- DiscrimOD(afex_list, sq_diff, nSupp = 5, dsLower = -1, dsUpper = 1,
                       crit_type = "pair_fixed_true", MaxMinStdVals = NULL,
                       PSO_INFO = NESTEDPSO_INFO, LBFGS_INFO = LBFGS_NOTRUN,
                       seed = NULL, verbose = FALSE)

The best design found by the NestedPSO algorithm is shown below; its criterion value is larger than the one obtained by the PSO-QN algorithm.

round(afex_npso$BESTDESIGN, 3)
afex_npso$BESTVAL

Does the larger criterion value mean that NestedPSO performs better? We draw the directional derivative curves for the designs found by both algorithms. Surprisingly, PSO-QN finds the $T$-optimal design whereas NestedPSO does not. The reason is that PSO may not perform well on the inner optimization problem in ([@eq:klCri]); when the inner minimum is not attained, the reported criterion value overestimates the true criterion value of the design.

eqv_psoqn <- equivalence(ngrid = 100, PSO_RESULT = afex_res, 
                         MODEL_INFO = afex_list, DISTANCE = sq_diff,
                         dsLower = -1, dsUpper = 1,
                         crit_type = "pair_fixed_true", 
                         PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO)

eqv_npso <- equivalence(ngrid = 100, PSO_RESULT = afex_npso, 
                        MODEL_INFO = afex_list, DISTANCE = sq_diff,
                        dsLower = -1, dsUpper = 1,
                        crit_type = "pair_fixed_true", 
                        PSO_INFO = PSO_INFO, LBFGS_INFO = LBFGS_INFO)

lay <- layout(matrix(1:2, 1, 2, byrow = TRUE))
plot(eqv_psoqn$Grid_1, eqv_psoqn$DirDeriv, type = "l", col = "blue", 
     main = "PSO-QN for Extend A&F Example",
     xlab = "x", ylab = "Directional Derivative"); abline(h = 0)
points(afex_res$BESTDESIGN[,1], rep(0, nrow(afex_res$BESTDESIGN)), pch = 16)
plot(eqv_npso$Grid_1, eqv_npso$DirDeriv, type = "l", col = "blue", 
     main = "NestedPSO for Extend A&F Example",
     xlab = "x", ylab = "Directional Derivative"); abline(h = 0)
points(afex_npso$BESTDESIGN[,1], rep(0, nrow(afex_npso$BESTDESIGN)), pch = 16)
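One way to quantify the inner-loop failure is to re-solve the inner minimization for the NestedPSO design with a multi-start quasi-Newton search. The sketch below assumes, as the plotting code above suggests, that BESTDESIGN stores the support points in its first column and the weights in its second; if the refined minimum falls below afex_npso$BESTVAL, the NestedPSO reported an overestimated criterion value.

# Re-solve the inner problem for the NestedPSO design with 20 random restarts
xi <- afex_npso$BESTDESIGN
inner_obj <- function(p2) {
  sum(xi[, 2] * (afex1(xi[, 1], afex1_para) - afex2(xi[, 1], p2))^2)
}
refined <- min(replicate(20, optim(runif(4, -10, 10), inner_obj,
                                   method = "L-BFGS-B", lower = rep(-10, 4),
                                   upper = rep(10, 4))$value))
c(reported = afex_npso$BESTVAL, refined = refined)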

Finally, we show the computational time of each algorithm. The PSO-QN algorithm finds the optimal discrimination design in less than one minute, whereas the NestedPSO fails to identify the optimal design even after spending more than 18 minutes.

cat(paste0("CPU time of PSO-QN algorithm: ", afex_res$CPUTIME, " seconds.")) 
cat(paste0("CPU time of NestedPSO algorithm: ", afex_npso$CPUTIME, " seconds."))

Discussion and Future Work {#discuss}

Appendix A: Settings for PSO Algorithm {-}

The PSO parameters in the PSO-QN and PSO-S-QN algorithms are set through the getPSOInfo() function. There are 11 inputs, and among them, the user must specify the swarm size and the maximal number of iterations.

getPSOInfo(nSwarm, maxIter, checkConv = 0, freeRun = 0.25, tol = 1e-06, 
           c1 = 2.05, c2 = 2.05, w0 = 0.95, w1 = 0.2, w_var = 0.8, vk = 4)
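For example, a PSO_INFO object can be created by specifying only the two required inputs; the values below are illustrative, not recommendations.

# Only nSwarm and maxIter are required; the other inputs keep their defaults
PSO_INFO <- getPSOInfo(nSwarm = 32, maxIter = 200)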

The complete details are shown in the list below.


nSwarm An integer specifying the swarm size in the PSO algorithm.

maxIter An integer specifying the maximal number of PSO iterations.

checkConv A logical value that controls whether PSO checks the stopping criterion during the updating procedure. Specify TRUE for PSO to compute the stopping criterion $|f'-f|<\varepsilon$, where $f'$ and $f$ are the objective function values in the previous and current iterations, respectively. The default is FALSE.

freeRun A number in $[0,1]$ that controls the fraction of PSO iterations which are free from examining the stopping criterion. If checkConv = TRUE, the default is 0.25. Otherwise, this value does not affect the algorithm.

tol A small value for the tolerance, $\varepsilon$, in the stopping criterion. If checkConv = TRUE, the default is 1e-6. Otherwise, this value does not affect the algorithm.

c1 The value of the cognitive parameter in the PSO updating procedure. The default is 2.05.

c2 The value of the social parameter in the PSO updating procedure. The default is 2.05.

w0 The value of the starting inertia weight in the PSO updating procedure. The default is 0.95.

w1 The value of the ending inertia weight in the PSO updating procedure. The default is 0.2.

w_var A number in $[0,1]$ that controls the fraction of iterations over which PSO linearly decreases the inertia weight from w0 to w1. The default is 0.8.

vk The value of the velocity clamping parameter. The default is 4.


: The PSO algorithm settings in the DiscrimOD package. {#tbl:psoopt}

Appendix B: Settings for L-BFGS Algorithm {-}

The L-BFGS parameters in the PSO-QN and PSO-S-QN algorithms are set through the getLBFGSInfo() function. There are 12 inputs. We suggest varying the number of trials of the L-BFGS algorithm via LBFGS_RETRY and keeping the other parameters at their defaults.

getLBFGSInfo(IF_INNER_LBFGS = TRUE, LBFGS_RETRY = 1, LBFGS_MAXIT = 0,
             LBFGS_LM = 6, FVAL_EPS = 0, GRAD_EPS = 1e-05,
             LINESEARCH_MAXTRIAL = 20, LINESEARCH_MAX = 1e+20,
             LINESEARCH_MIN = 1e-20, LINESEARCH_ARMIJO = 1e-04,
             LINESEARCH_WOLFE = 0.9, FD_DELTA = 1e-3)
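For example, the two objects used earlier in this document can be created as follows; the LBFGS_RETRY value is illustrative.

# A few multi-start trials for the inner problem (illustrative value)
LBFGS_INFO <- getLBFGSInfo(LBFGS_RETRY = 2)
# Turn off L-BFGS to run the NestedPSO algorithm instead
LBFGS_NOTRUN <- getLBFGSInfo(IF_INNER_LBFGS = FALSE)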

The complete details are shown in the list below.


IF_INNER_LBFGS The logical input TRUE/FALSE that turns on/off the L-BFGS algorithm for the inner optimization problem (minimizing the distance over the parameter space). If IF_INNER_LBFGS = FALSE, the DiscrimOD package runs the NestedPSO algorithm in @chen2015minimax.

LBFGS_RETRY An integer specifying the number of L-BFGS trials for computing the minimal distance, each starting from randomly generated initial values. The default is 1.

LBFGS_MAXIT An integer specifying the maximal number of L-BFGS iterations. The default is 0, in which case L-BFGS runs until it converges.

LBFGS_LM An integer specifying the number of corrections used to approximate the inverse Hessian matrix. The default is 6.

FVAL_EPS The tolerance value, $\varepsilon_f$, of the stopping criterion on the objective function value, $|f'-f|/f<\varepsilon_f$, where $f'$ and $f$ are the objective function values in the previous and current iterations, respectively. The default is 0, in which case L-BFGS ignores this stopping criterion.

GRAD_EPS The tolerance value, $\varepsilon_g$, of the stopping criterion on the gradient, $\|g\|/\|x\|<\varepsilon_g$, where $g$ is the gradient, $x$ is the current position and $\|a\|=\sqrt{a^\top a}$. The default is 1e-5.

LINESEARCH_MAXTRIAL An integer specifying the maximal number of trials of the More-Thuente line search routine. The default is 20.

LINESEARCH_MAX The maximal step size in the line search routine. The default is 1e20.

LINESEARCH_MIN The minimal step size in the line search routine. The default is 1e-20.

LINESEARCH_ARMIJO A parameter that controls the accuracy of the line search routine. The default is 1e-4. This parameter should be greater than zero and smaller than 0.5.

LINESEARCH_WOLFE A coefficient for the Wolfe condition in the line search routine. The default is 0.9. This parameter should be greater than the LINESEARCH_ARMIJO parameter and smaller than 1.0.

FD_DELTA A small value for the step size of the finite difference method used to compute the gradient. The default is 1e-3.


: The L-BFGS algorithm settings in the DiscrimOD package. {#tbl:lbfgsopt}

Appendix C: Codes for Distance Functions in R and C++ {-}

Homoscedastic Normal distribution ($T$-optimal design criterion)

# R Code
l2_diff <- function(xt, xr) (xt - xr)^2
# C++ Code
l2_diff_cpp <- cppFunction('Rcpp::NumericVector l2_diff(SEXP xt, SEXP xr) {
  arma::rowvec eta_t = Rcpp::as<arma::rowvec>(xt);
  arma::rowvec eta_r = Rcpp::as<arma::rowvec>(xr);
  arma::rowvec div = (eta_t - eta_r) % (eta_t - eta_r);
  return Rcpp::wrap(div);
}', depends = "RcppArmadillo")
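A quick sanity check one might run (assuming Rcpp and RcppArmadillo are installed) is to compare the compiled version against the formula on random inputs; the same pattern applies to the other R/C++ pairs below.

# The compiled version should reproduce the squared difference exactly
xt <- runif(5); xr <- runif(5)
all.equal(as.numeric(l2_diff_cpp(xt, xr)), (xt - xr)^2)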

Heteroscedastic Normal distribution

# R Code
heter_norm <- function(xt, xr) {
  var_t <- xt^2
  var_r <- xr^2
  (var_t + (xt - xr)^2)/var_r - log(var_t/var_r)
}
# C++ Code
heter_norm_cpp <- cppFunction('Rcpp::NumericVector heter_norm(SEXP xt, SEXP xr) {
  arma::rowvec eta_t = Rcpp::as<arma::rowvec>(xt);
  arma::rowvec eta_r = Rcpp::as<arma::rowvec>(xr);
  arma::rowvec var_t = eta_t % eta_t;
  arma::rowvec var_r = eta_r % eta_r;
  arma::rowvec div = (var_t + arma::pow(eta_t - eta_r, 2))/var_r - arma::log(var_t/var_r);
  return Rcpp::wrap(div);
}', depends = "RcppArmadillo")

Log-normal distribution (A) in @lopez2007optimal

# R Code
log_norm_A <- function(xt, xr) {
  vSQ <- 1.0
  s_t <- log(1.0 + (vSQ/(xt^2)))
  s_r <- log(1.0 + (vSQ/(xr^2)))
  mu_t <- log(xt) - s_t
  mu_r <- log(xr) - s_r
  0.5*log(s_r/s_t) - (s_r - s_t - (mu_r - mu_t)^2)/(2*s_r)
}
# C++ Code
log_norm_A_cpp <- cppFunction('Rcpp::NumericVector log_norm_A(SEXP xt, SEXP xr) {
  double vSQ = 1.0;
  arma::rowvec eta_t = Rcpp::as<arma::rowvec>(xt);
  arma::rowvec eta_r = Rcpp::as<arma::rowvec>(xr);
  arma::rowvec s_t = arma::log(1.0 + (vSQ/(eta_t % eta_t)));
  arma::rowvec s_r = arma::log(1.0 + (vSQ/(eta_r % eta_r)));
  arma::rowvec mu_t = arma::log(eta_t) - s_t;
  arma::rowvec mu_r = arma::log(eta_r) - s_r;
  arma::rowvec div = 0.5*arma::log(s_r/s_t) - 
    (s_r - s_t - (mu_r - mu_t) % (mu_r - mu_t))/(2*s_r);
  return Rcpp::wrap(div);
}', depends = "RcppArmadillo")

Log-normal distribution (B) in @lopez2007optimal

# R Code
log_norm_B <- function(xt, xr) {
  sigsq <- 1.0
  var_t <- (exp(sigsq) - 1.0)*(xt^2)
  var_r <- (exp(sigsq) - 1.0)*(xr^2)
  mu_t <- log(xt) - 0.5*log(1.0 + (var_t/(xt^2)))
  mu_r <- log(xr) - 0.5*log(1.0 + (var_r/(xr^2)))
  ((mu_r - mu_t)^2)/(2*sigsq)
}
# C++ Code
log_norm_B_cpp <- cppFunction('Rcpp::NumericVector log_norm_B(SEXP xt, SEXP xr) {
  double sigsq = 1.0;
  arma::rowvec eta_t = Rcpp::as<arma::rowvec>(xt);
  arma::rowvec eta_r = Rcpp::as<arma::rowvec>(xr);
  arma::rowvec var_t = (std::exp(sigsq) - 1.0)*(eta_t % eta_t);
  arma::rowvec var_r = (std::exp(sigsq) - 1.0)*(eta_r % eta_r);
  arma::rowvec mu_t = arma::log(eta_t) - 0.5*arma::log(1.0 + (var_t/(eta_t % eta_t)));
  arma::rowvec mu_r = arma::log(eta_r) - 0.5*arma::log(1.0 + (var_r/(eta_r % eta_r)));
  arma::rowvec div = ((mu_r - mu_t) % (mu_r - mu_t))/(2*sigsq);
  return Rcpp::wrap(div);
}', depends = "RcppArmadillo")
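Note that var_t/(xt^2) equals exp(sigsq) - 1 by construction, so mu_t reduces to log(xt) - sigsq/2 and the whole divergence simplifies to $(\log(x_r/x_t))^2/(2\sigma^2)$. A quick check of this identity, using the hypothetical helper log_norm_B_simple:

# With sigsq = 1, log_norm_B reduces to (log(xr/xt))^2 / 2
log_norm_B_simple <- function(xt, xr) (log(xr/xt)^2)/2
xt <- runif(5, 1, 10); xr <- runif(5, 1, 10)
all.equal(log_norm_B(xt, xr), log_norm_B_simple(xt, xr))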

Log-normal distribution (C) in @lopez2007optimal

# R Code
log_norm_C <- function(xt, xr) {
  c <- 1; d <- 1
  var_t <- d*exp(c*xt)
  var_r <- d*exp(c*xr)
  s_t <- log(1.0 + (var_t/(xt^2)))
  s_r <- log(1.0 + (var_r/(xr^2)))
  mu_t <- log(xt) - s_t
  mu_r <- log(xr) - s_r
  0.5*log(s_r/s_t) - (s_r - s_t - (mu_r - mu_t)*(mu_r - mu_t))/(2*s_r)
}
# C++ Code
log_norm_C_cpp <- cppFunction('Rcpp::NumericVector log_norm_C(SEXP xt, SEXP xr) {
  double c = 1.0; double d = 1.0;
  arma::rowvec eta_t = Rcpp::as<arma::rowvec>(xt);
  arma::rowvec eta_r = Rcpp::as<arma::rowvec>(xr);
  arma::rowvec var_t = d * arma::exp(c*eta_t);
  arma::rowvec var_r = d * arma::exp(c*eta_r);
  arma::rowvec s_t = arma::log(1.0 + (var_t/(eta_t % eta_t)));
  arma::rowvec s_r = arma::log(1.0 + (var_r/(eta_r % eta_r)));
  arma::rowvec mu_t = arma::log(eta_t) - s_t;
  arma::rowvec mu_r = arma::log(eta_r) - s_r;
  arma::rowvec div = 0.5*arma::log(s_r/s_t) - 
    (s_r - s_t - (mu_r - mu_t) % (mu_r - mu_t))/(2*s_r);
  return Rcpp::wrap(div);
}', depends = "RcppArmadillo")

Binomial distribution

# R Code
logit_diff <- function(xt, xr) {
  exp_t <- exp(xt) 
  exp_r <- exp(xr)
  mu_t <- exp_t/(1 + exp_t)
  mu_r <- exp_r/(1 + exp_r)
  mu_t*(log(mu_t) - log(mu_r)) + (1 - mu_t)*(log(1.0 - mu_t) - log(1.0 - mu_r))
}
# C++ Code
logit_diff_cpp <- cppFunction('Rcpp::NumericVector logit_diff(SEXP xt, SEXP xr) {
  arma::rowvec eta_t = Rcpp::as<arma::rowvec>(xt);
  arma::rowvec eta_r = Rcpp::as<arma::rowvec>(xr);
  arma::rowvec exp_t = arma::exp(eta_t);
  arma::rowvec exp_r = arma::exp(eta_r);
  arma::rowvec mu_t = exp_t/(1.0 + exp_t); 
  arma::rowvec mu_r = exp_r/(1.0 + exp_r);
  arma::rowvec div = mu_t % (arma::log(mu_t) - arma::log(mu_r)) + 
    (1.0 - mu_t) % (arma::log(1.0 - mu_t) - arma::log(1.0 - mu_r));
  return Rcpp::wrap(div);
}', depends = "RcppArmadillo")

Gamma distribution

# R Code
gamma_diff <- function(xt, xr) log(xr/xt) + (xt - xr)/xr
# C++ Code
gamma_diff_cpp <- cppFunction('Rcpp::NumericVector gamma_diff(SEXP xt, SEXP xr) {
  arma::rowvec eta_t = Rcpp::as<arma::rowvec>(xt);
  arma::rowvec eta_r = Rcpp::as<arma::rowvec>(xr);
  arma::rowvec div = arma::log(eta_r/eta_t) + (eta_t - eta_r)/eta_r;
  return Rcpp::wrap(div);
}', depends = "RcppArmadillo")
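As a sanity check, this divergence vanishes when the two mean responses coincide and is nonnegative otherwise, since $\log u \le u - 1$ with $u = x_t/x_r$:

# Zero at xt == xr, nonnegative elsewhere
x <- runif(5, 1, 10)
all(gamma_diff(x, x) == 0)
all(gamma_diff(runif(5, 1, 10), runif(5, 1, 10)) >= 0)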

References


