OptStiefelGBB: Optimization on Stiefel manifold
In TRES: Tensor Regression with Envelope Structure

Description Usage Arguments Details Value References Examples

Curvilinear search algorithm for optimization on Stiefel manifold developed by Wen and Yin (2013).

1	OptStiefelGBB(X, fun, opts = NULL, ...)

`X`	Initial value to start the optimization. A n by k orthonormal matrix such that X^T X = I_k.
`fun`	The function that returns the objective function value and its gradient. The syntax for `fun` is `fun(X, data1, data2)` where `data1, data2` are additional data passed to `...`.
`opts`	A list specifying additional user-defined arguments for the curvilinear search algorithm. Some important ones are listed in the following: `maxiter`: The maximal number of iterations. `xtol`: The convergence tolerance for X, e.g., \|\|X^{(t)} - X^{(t-1)}\|\|_F/√{k}. `gtol`: The convergence tolerance for the gradient of the Lagrangian function, e.g., \|\|G^{(t)} - X^{(t)} (G^{(t)})^T X^{(t)}\|\|_F. `ftol`: The convergence tolerance for objective function F, e.g., \|F^{(t)} - F^{(t-1)}\|/(1+\|F^{(t-1)}\|). Usually, `max{xtol, gtol} > ftol`. The default values are: `maxiter=500; xtol=1e-08; gtol=1e-08; ftol=1e-12.`
`...`	Additional input passed to `fun`.

The calling syntax is OptStiefelGBB(X, fun, opts, data1, data2), where fun(X, data1, data2) returns the objective function value and its gradient.

For example, for n by k matrix X, the optimization problem is

min_{X} -tr(X^T W X), \mbox{ such that } X^T X = I_k.

The objective function and its gradient are

F(X) = -tr(X^T W X), \; G(X) = - 2 W X.

Then we need to provide the function fun(X, W) which returns F(X) and G(X). See Examples for details.

For more details of the termination rules and the tolerances, we refer the interested readers to Section 5.1 of Wen and Yin (2013).

X

The converged solution of the optimization problem.

out

Output information, including estimation error, function value, iteration times etc.

nfe: The total number of line search attempts.
msg: Message: "convergence" | "exceed max iteration".
feasi: The feasibility of solution: ||X^TX - I_k||_F.
nrmG: The convergence criterion based on the projected gradient ||G - X G^T X||_F.
fval: The objective function value F(X) at termination.
iter: The number of iterations.

Wen, Z. and Yin, W., 2013. A feasible method for optimization with orthogonality constraints. Mathematical Programming, 142(1-2), pp.397-434.

n <- 1000
k <- 6

# Randomly generated matrix M
W <- matrix(rnorm(n^2), n, n)
W <- t(W) %*% W

# Randomly generated orthonormal initial matrix
X0 <- matrix(rnorm(n*k), n, k)
X0 <- qr.Q(qr(X0))

# The objective function and its gradient
fun <- function(X, W){
  F <- - sum(diag(t(X) %*% W %*% X))
  G <- - 2*(W %*% X)
  return(list(F = F, G = G))
}

# Options list
opts<-list(record = 0, maxiter = 1000, xtol = 1e-5, gtol = 1e-5, ftol = 1e-8)

# Main part
output <- OptStiefelGBB(X0, fun, opts, W)
X <- output$X
out <- output$out