nrbm: Convex and non-convex risk minimization with L2...

Description Usage Arguments Value Functions References Examples

Description

Use the algorithm of Do and Artieres (JMLR 2012) to find the weight vector w minimizing f(w) = 0.5*LAMBDA*l2norm(w) + riskFun(w), where riskFun is either a convex or a non-convex risk function.
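A minimal sketch of the objective being evaluated (not the bundle algorithm itself), assuming l2norm(w) denotes the squared Euclidean norm sum(w^2), the usual choice for bundle methods, and that riskFun returns w with the loss stored in attribute "lvalue":

```r
# Hedged sketch: evaluate f(w) for a given risk function.
# Assumes l2norm(w) means the squared Euclidean norm sum(w^2);
# riskFun(w) returns w with the loss in attribute "lvalue".
objective <- function(w, riskFun, LAMBDA = 1) {
  0.5 * LAMBDA * sum(w^2) + attr(riskFun(w), "lvalue")
}
```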

Usage

nrbm(riskFun, LAMBDA = 1, MAX_ITER = 1000L, EPSILON_TOL = 0.01,
  w0 = 0, maxCP = 50L, convexRisk = is.convex(riskFun),
  LowRankQP.method = "LU", line.search = !convexRisk)

nrbmL1(riskFun, LAMBDA = 1, MAX_ITER = 300L, EPSILON_TOL = 0.01,
  w0 = 0, maxCP = +Inf, line.search = FALSE)

Arguments

riskFun

the risk function to use in the optimization (e.g. hingeLoss, softMarginVectorLoss). The function must evaluate the loss value and its gradient at a given point vector w, and return w with the attributes "lvalue" and "gradient" set.

LAMBDA

controls the regularization strength in the optimization process. This value is used as the coefficient of the regularization term.

MAX_ITER

the maximum number of iterations to perform. The function stops with a warning message if the number of iterations exceeds this value.

EPSILON_TOL

a numeric value between 0 and 1 controlling the stopping criterion: the optimization ends when the ratio between the optimization gap and the objective value falls below this threshold.

w0

the initial weight vector where the optimization starts

maxCP

the maximal number of cutting planes to use, which limits the memory footprint

convexRisk

a length-1 logical telling whether the risk function riskFun is convex. If TRUE, use the CRBM algorithm; if FALSE, use the NRBM algorithm from Do and Artieres (JMLR 2012)

LowRankQP.method

a single character value defining the method used by LowRankQP (either "LU" or "CHOL")

line.search

a logical; when TRUE, use line search to speed up convergence
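As an illustration of the riskFun contract, here is a hypothetical least-squares risk (not shipped with bmrm) that evaluates the loss and its gradient at w and returns w with the "lvalue" and "gradient" attributes set. Expanding the scalar default w0 = 0 to the feature dimension is an assumption about how the initial point should be handled; adjust if your setup differs.

```r
# Hypothetical example (not part of bmrm): squared-error risk
# following the documented contract for riskFun.
squaredLoss <- function(x, y) {
  function(w) {
    # expand a scalar starting point (e.g. the default w0 = 0)
    # to one weight per column of x -- an assumption, see lead-in
    w <- rep(as.vector(w), length.out = ncol(x))
    r <- as.vector(x %*% w) - y                        # residuals
    attr(w, "lvalue") <- 0.5 * sum(r^2)                # loss at w
    attr(w, "gradient") <- as.vector(crossprod(x, r))  # d(loss)/dw
    w
  }
}

# Usage sketch: this risk is convex, so it can be stated explicitly
# w <- nrbm(squaredLoss(X, Y), LAMBDA = 0.1, convexRisk = TRUE)
```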

Value

the optimal weight vector (w)

Functions

nrbmL1: variant of nrbm where the L2 regularizer is replaced by an L1 (lasso) penalty on the weight vector.

References

Do, T.-M.-T. and Artieres, T. (2012). "Regularized Bundle Methods for Convex and Non-Convex Risks." Journal of Machine Learning Research, 13.

Examples

  # -- Create a 2D dataset with the first 2 features of iris, with binary labels
  x <- data.matrix(iris[1:2])

  # -- Add a constant dimension to the dataset to learn the intercept
  x <- cbind(intercept=1000,x)

  # -- train scalar prediction models with hingeLoss and fbetaLoss
  models <- list(
    svm_L1 = nrbmL1(hingeLoss(x,iris$Species=="setosa"),LAMBDA=1),
    svm_L2 = nrbm(hingeLoss(x,iris$Species=="setosa"),LAMBDA=1),
    f1_L1 = nrbmL1(fbetaLoss(x,iris$Species=="setosa"),LAMBDA=1),
    tsvm_L2 = nrbm(hingeLoss(x,
                   ifelse(iris$Species=="versicolor",NA,iris$Species=="setosa")),
                   LAMBDA=1)
  )

  # -- Plot the dataset and the predictions
  plot(x[,-1],pch=ifelse(iris$Species=="setosa",1,2),main="dataset & hyperplanes")
  legend('bottomright',legend=names(models),col=seq_along(models),lty=1,cex=0.75,lwd=3)
  for(i in seq_along(models)) {
    w <- models[[i]]
    if (w[3]!=0) abline(-w[1]*1000/w[3],-w[2]/w[3],col=i,lwd=3)
  }


  # -- fit a least absolute deviation linear model on a synthetic dataset
  # -- containing 196 meaningful features and 4 noisy features. Then
  # -- check if the model has detected the noise
  set.seed(123)
  X <- matrix(rnorm(4000*200), 4000, 200)
  beta <- c(rep(1,ncol(X)-4),0,0,0,0)
  Y <- X%*%beta + rnorm(nrow(X))
  w <- nrbm(ladRegressionLoss(X/100,Y/100),maxCP=50)
  barplot(as.vector(w))

bmrm documentation built on May 2, 2019, 2:49 p.m.