Description Usage Arguments Details Value References See Also Examples
The mixsqp
function uses a Sequential Quadratic
Programming (SQP) algorithm to find the maximum likelihood
estimates of mixture proportions in a (finite) mixture model. More
generally, mixsqp
solves the corresponding constrained,
convex optimization problem, which is given below (see
‘Details’). See ‘References’ for more details about
the SQP algorithm.
1 2 3 4 
L 
Matrix specifying the optimization problem to be solved.
In the context of mixturemodel fitting, 
w 
An optional numeric vector, with one entry for each row of

x0 
An optional numeric vector providing an initial estimate
of the solution to the optimization problem. It should contain only
finite, nonmissing, nonnegative values, and all entries of

control 
A list of parameters controlling the behaviour of the optimization algorithm. See ‘Details’. 
mixsqp
solves the following optimization problem.
Let L be a matrix with n rows and m columns
containing only nonnegative entries, and let w = (w_1,
…, w_n) be a vector of nonnegative "weights". mixsqp
computes the value of vector x = (x_1, …, x_m)
minimizing the following objective function,
f(x) = ∑_{j=1}^n w_j \log (∑_{k=1}^m L_{jk} x_k),
subject to the constraint that x lie within the simplex; that is, the entries of x are nonnegative and sum to 1. Implicitly, there is an additional constraint L*x > 0 in order to ensure that the objective has a finite value. In practice, this constraint only needs to be checked for the initial estimate to ensure that it holds for all subsequent iterates.
If all weights are equal, solving this optimization problem corresponds to finding the maximumlikelihood estimate of the mixture proportions x given n independent data points drawn from a mixture model with m components. In this case, L_{jk} is the likelihood for mixture component k and data point j.
The Expectation Maximization (EM) algorithm can be used to solve this optimization problem, but it is intolerably slow in many interesting cases, and mixsqp is much faster.
A special feature of this optimization problem is that the gradient
of the objective does not change with rescaling; for example, if
all the entries of matrix L
are multiplied by 100, the
gradient does not change. A practical benefit of this property is
that the optimization algorithm will behave similarly irrespective
of the scale of L
; for example, the same value for the
convergence tolerance convtol.sqp
will have the same effect
at different scales. The one exception is the eps
control
parameter, and for this reason the effect of eps
is
automatically adjusted to reflect the scale of L
; see the
description of this parameter below for more details.
A related feature is that the solution to the optimization problem
is invariant to rescaling the rows of L
; for example, the
solution will remain the same after all the entries in a row of
L
are multiplied by 10. A simple normalization scheme
divides each row by the largest entry in the row so that all
entries of L
are at most 1: L < L / apply(L,1,max)
Occasionally, it can be helpful to normalize the rows when some of
the entries are unusually large or unusually small. This can help
to avoid numerical overflow or underflow errors.
The SQP algorithm is implemented using the Armadillo C++ linear
algebra library, which can automatically take advantage of
multithreaded matrix computations to speed up mixsqp
for
large L
matrices, but only when R has been configured with a
multithreaded BLAS/LAPACK library (e.g., OpenBLAS).
The control
argument is a list in which any of the
following named components will override the default optimization
algorithm settings (as they are defined by
mixsqp_control_defaults
):
convtol.sqp
A small, nonnegative number
specifying the convergence tolerance for SQP algorithm; convergence
is reached when the maximum dual residual in the KarushKuhnTucker
(KKT) optimality conditions is less than or equal to
convtol.sqp
. Smaller values will result in more stringent
convergence criteria and more accurate solutions, at the expense of
greater computation time. Note that changes to this tolerance
parameter may require respective changes to
convtol.activeset
and/or zero.threshold.searchdir
to
obtain reliable convergence.
convtol.activeset
A small, nonnegative number
specifying the convergence tolerance for the activeset
step. Smaller values will result in higher quality search
directions for the SQP algorithm but possibly a greater
periteration computational cost. Note that changes to this
tolerance parameter can affect how reliably the SQP convergence
criterion is satisfied, as determined by convtol.sqp
.
zero.threshold.solution
A small, nonnegative
number used to determine the "active set"; that is, it determines
which entries of the solution are exactly zero. Any entries that
are less than or equal to zero.threshold.solution
are
considered to be exactly zero. Larger values of
zero.threshold.solution
may lead to speedups for matrices
with many columns, at the (slight) risk of prematurely zeroing some
coordinates.
zero.threshold.searchdir
A small, nonnegative
number used to determine when the search direction in the
activeset step is considered "small enough". Note that changes to
this tolerance parameter can affect how reliably the SQP
convergence criterion is satisfied, as determined by
convtol.sqp
, so choose this parameter carefully.
suffdecr.linesearch
This parameter determines how stringent t he "sufficient decrease" condition is for accepting a step size in the backtracking line search. Larger values will make the condition more stringent. This should be a positive number less than 1.
stepsizereduce
The multiplicative factor for decreasing the step size in the backtracking line search. Smaller values will yield a faster backtracking line search at the expense of a less finegrained search. Should be a positive number less than 1.
minstepsize
The smallest step size accepted by the line search step. Should be a number greater than 0 and at most 1.
eps
A small, nonnegative number that is added to the
terms inside the logarithms to sidestep computing logarithms of
zero. This prevents numerical problems at the cost of introducing a
small inaccuracy in the solution. Increasing this number may lead
to faster convergence but possibly a less accurate solution. Since
an appropriate value of this number will depend on the "scale" of
L
, eps
is automatically scaled separately for each
row of L
; specifically, the ith modified logarithm term
becomes log(L[i,]*x + ei)
, in which ei
is set to
eps * max(L[i,])
.
delta
A small, nonnegative number added to the diagonal of the Hessian to improve numerical stability (and possibly the speed) when computing the search directions in the activeset step.
maxiter.sqp
Maximum number of SQP iterations to run before reporting a convergence failure; that is, the maximum number of quadratic subproblems that will be solved by the activeset method.
maxiter.activeset
Maximum number of activeset iterations taken to solve each of the quadratic subproblems.
verbose
If verbose = TRUE
, the algorithm's
progress and a summary of the optimization settings are printed to
the console. The algorithm's progress is displayed in a table with
one row per SQP (outer loop) iteration, and with the following
columns: "iter", the (outer loop) SQP iteration; "objective", the
value of the objective function (see f(x)) at the current
estimate of the solution, x; "max(rdual)", the maximum "dual
residual" in the KarushKuhnTucker (KKT) conditions, which is used
to monitor convergence (see convtol.sqp
); "nnz", the number
of nonzero coordinates in the current estimate, as determined by
zero.threshold.solution
; "max.diff", the maximum difference
in the estimates between two successive iterations; "nqp", the
number of (inner loop) activeset iterations taken to solve the
quadratic subproblem; "nls", the number of iterations in the
backtracking line search.
A list object with the following elements:
x 
The solution to the convex optimization problem. 
value 
The value of the objective function, f(x), at

status 
A character string describing the status of the algorithm upon termination. 
progress 
A data frame containing more detailed information
about the algorithm's progress. The data frame has one row per SQP
iteration. For an explanation of the columns, see the description
of the 
Y. Kim, P. Carbonetto, M. Stephens and M. Anitescu (2018). A fast algorithm for maximum likelihood estimation of mixture proportions using sequential quadratic programming. arXiv:1806.01412 https://arxiv.org/abs/1806.01412.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19  set.seed(1)
n < 1e5
m < 10
w < rep(1,n)/n
L < simulatemixdata(n,m)$L
out.mixsqp < mixsqp(L,w)
f < mixobjective(L,out.mixsqp$x,w)
print(f,digits = 16)
# We can also compare this result with solution found from an
# interiorpoint approach called via the "KWDual" function from the
# REBayes package. (This requires installation of the MOSEK
# optimization library as well as the REBayes package, so we have
# made this step optional.)
## Not run:
out.kwdual < mixkwdual(L,w)
print(mixobjective(L,out.kwdual$x,w),digits = 16)
## End(Not run)

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.