optimization_L1: Subgradient-based quasi-Newton method for non-differentiable optimization


Description

Subgradient-based quasi-Newton method for non-differentiable optimization.

Usage

optimization_L1(w, X, y, nHidden, verbose = FALSE, lambda, lambda2, optimtol,
                prgmtol, maxIter, decsuff)

Arguments

w

(numeric, n) weights and biases.

X

(numeric, n x p) incidence matrix.

y

(numeric, n) the response data-vector.

nHidden

(positive integer, 1 x h) matrix; h is the number of hidden layers and nHidden[1, h] is the number of neurons in the hth hidden layer.

verbose

logical; if TRUE, prints a detailed iteration history.

lambda

numeric, Lagrange multiplier for the L1-norm penalty on the parameters.

lambda2

numeric, Lagrange multiplier for the L2-norm penalty on the parameters.

optimtol

numeric, a small tolerance used to check convergence of the subgradients.

prgmtol

numeric, a small tolerance used to check convergence of the parameters of the NN.

maxIter

positive integer, maximum number of epochs (iterations) to train; default 100.

decsuff

numeric, a small tolerance used to check for sufficient decrease of the loss function.

Details

The method chooses the minimum-norm subgradient as a steepest-descent direction and takes a Newton-like step in that direction using a Hessian approximation (Nocedal, 1980). An active-set strategy is adopted to set some parameters to exactly zero (Krishnan et al., 2007). At each iteration the parameters are divided into two sets: a working set containing the non-zero variables and an active set containing the variables that are sufficiently close to zero. A Newton step is then taken along the working set, with the step size in the active variables restricted so that the sufficiently-zero constraint is not violated, and a projected steepest-descent step sets some parameters to exactly zero.
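The two core ideas above (a minimum-norm subgradient as the descent direction, and a projection that snaps sign-changing variables to exactly zero) can be sketched on a plain L1-penalized least-squares problem. This is a language-agnostic illustration in Python, not the snnR implementation: the function names, the fixed step size, and the quadratic loss are all assumptions made for the example, and the quasi-Newton (Hessian) scaling of the real method is replaced by plain steepest descent.

```python
import numpy as np

def min_norm_subgradient(w, grad, lam):
    """Minimum-norm subgradient of loss(w) + lam * ||w||_1.

    Where w != 0 the penalty is differentiable; at w == 0 the element of
    the subdifferential with smallest magnitude is the soft-thresholded
    gradient (zero whenever |grad| <= lam)."""
    nonzero = grad + lam * np.sign(w)
    at_zero = np.where(grad > lam, grad - lam,
               np.where(grad < -lam, grad + lam, 0.0))
    return np.where(w != 0, nonzero, at_zero)

def l1_projected_descent(X, y, lam, step=0.1, max_iter=500, tol=1e-8):
    """Projected steepest descent for 0.5 * ||X w - y||^2 + lam * ||w||_1."""
    w = np.zeros(X.shape[1])
    for _ in range(max_iter):
        grad = X.T @ (X @ w - y)
        g = min_norm_subgradient(w, grad, lam)
        if np.max(np.abs(g)) < tol:   # optimtol-style subgradient check
            break
        w_new = w - step * g
        # projection: any variable whose sign flips is set to exactly zero,
        # which is how the active set acquires exact zeros
        w_new[w * w_new < 0] = 0.0
        w = w_new
    return w
```

On a toy problem with `X` the identity, this reduces to soft-thresholding of `y`: components of `y` smaller than `lambda` in magnitude stay exactly zero, mirroring how the active-set step produces sparse weights.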

Value

A vector of weights and biases.

References

Nocedal, J. 1980. Updating quasi-Newton matrices with limited storage. Mathematics of Computation, 35(151), 773-782.

Krishnan, D., Lin, P., and Yip, A. M. 2007. A primal-dual active-set method for non-negativity constrained total variation deblurring problems. IEEE Transactions on Image Processing, 16(11), 2766-2777.


snnR documentation built on May 2, 2019, 8:54 a.m.
