nrbm: Convex and non-convex risk minimization with L2...

Description Usage Arguments Value Functions References Examples

Description

Use the algorithm of Do and Artieres (JMLR 2012) to find the weight vector w minimizing f(w) = 0.5*LAMBDA*l2norm(w) + riskFun(w), where riskFun is either a convex or a non-convex risk function.
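A minimal sketch of the objective being evaluated (not the bundle algorithm itself), assuming l2norm(w) denotes the squared Euclidean norm sum(w^2), the usual choice for bundle methods, and that riskFun returns w with the loss stored in attribute "lvalue":

```r
# Hedged sketch: evaluate f(w) for a given risk function.
# Assumes l2norm(w) means the squared Euclidean norm sum(w^2);
# riskFun(w) returns w with the loss in attribute "lvalue".
objective <- function(w, riskFun, LAMBDA = 1) {
  0.5 * LAMBDA * sum(w^2) + attr(riskFun(w), "lvalue")
}
```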

Usage

nrbm(riskFun, LAMBDA = 1, MAX_ITER = 1000L, EPSILON_TOL = 0.01,
  w0 = 0, maxCP = 50L, convexRisk = is.convex(riskFun),
  LowRankQP.method = "LU", line.search = !convexRisk)

nrbmL1(riskFun, LAMBDA = 1, MAX_ITER = 300L, EPSILON_TOL = 0.01,
  w0 = 0, maxCP = +Inf, line.search = FALSE)

Arguments

riskFun

the risk function to use in the optimization (e.g. hingeLoss, softMarginVectorLoss). The function must evaluate the loss value and its gradient at a given point vector w, and return w with the attributes "lvalue" and "gradient" set.

LAMBDA

controls the regularization strength in the optimization process. This value is used as the coefficient of the regularization term.

MAX_ITER

the maximum number of iterations to perform. The function stops with a warning message if the number of iterations exceeds this value.

EPSILON_TOL

a numeric value between 0 and 1 controlling the stopping criterion: the optimization ends when the ratio between the optimization gap and the objective value falls below this threshold.

w0

the initial weight vector where the optimization starts

maxCP

the maximal number of cutting planes to use, which limits the memory footprint

convexRisk

a length-1 logical telling whether the risk function riskFun is convex. If TRUE, use the CRBM algorithm; if FALSE, use the NRBM algorithm from Do and Artieres (JMLR 2012)

LowRankQP.method

a single character value defining the method used by LowRankQP (either "LU" or "CHOL")

line.search

a logical; when TRUE, use line search to speed up convergence
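As an illustration of the riskFun contract, here is a hypothetical least-squares risk (not shipped with bmrm) that evaluates the loss and its gradient at w and returns w with the "lvalue" and "gradient" attributes set. Expanding the scalar default w0 = 0 to the feature dimension is an assumption about how the initial point should be handled; adjust if your setup differs.

```r
# Hypothetical example (not part of bmrm): squared-error risk
# following the documented contract for riskFun.
squaredLoss <- function(x, y) {
  function(w) {
    # expand a scalar starting point (e.g. the default w0 = 0)
    # to one weight per column of x -- an assumption, see lead-in
    w <- rep(as.vector(w), length.out = ncol(x))
    r <- as.vector(x %*% w) - y                        # residuals
    attr(w, "lvalue") <- 0.5 * sum(r^2)                # loss at w
    attr(w, "gradient") <- as.vector(crossprod(x, r))  # d(loss)/dw
    w
  }
}

# Usage sketch: this risk is convex, so it can be stated explicitly
# w <- nrbm(squaredLoss(X, Y), LAMBDA = 0.1, convexRisk = TRUE)
```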

Value

the optimal weight vector (w)

Functions

nrbmL1: variant of nrbm where the L2 regularizer is replaced by an L1 (lasso) penalty on the weight vector.

References

Do, T.-M.-T. and Artieres, T. (2012). "Regularized Bundle Methods for Convex and Non-Convex Risks." Journal of Machine Learning Research, 13.

Examples

  # -- Create a 2D dataset with the first 2 features of iris, with binary labels
  x <- data.matrix(iris[1:2])

  # -- Add a constant dimension to the dataset to learn the intercept
  x <- cbind(intercept=1000,x)

  # -- train scalar prediction models with hingeLoss and fbetaLoss
  models <- list(
    svm_L1 = nrbmL1(hingeLoss(x,iris$Species=="setosa"),LAMBDA=1),
    svm_L2 = nrbm(hingeLoss(x,iris$Species=="setosa"),LAMBDA=1),
    f1_L1 = nrbmL1(fbetaLoss(x,iris$Species=="setosa"),LAMBDA=1),
    tsvm_L2 = nrbm(hingeLoss(x,
                   ifelse(iris$Species=="versicolor",NA,iris$Species=="setosa")),
                   LAMBDA=1)
  )

  # -- Plot the dataset and the predictions
  plot(x[,-1],pch=ifelse(iris$Species=="setosa",1,2),main="dataset & hyperplanes")
  legend('bottomright',legend=names(models),col=seq_along(models),lty=1,cex=0.75,lwd=3)
  for(i in seq_along(models)) {
    w <- models[[i]]
    if (w[3]!=0) abline(-w[1]*1000/w[3],-w[2]/w[3],col=i,lwd=3)
  }


  # -- fit a least absolute deviation linear model on a synthetic dataset
  # -- containing 196 meaningful features and 4 noisy features. Then
  # -- check if the model has detected the noise
  set.seed(123)
  X <- matrix(rnorm(4000*200), 4000, 200)
  beta <- c(rep(1,ncol(X)-4),0,0,0,0)
  Y <- X%*%beta + rnorm(nrow(X))
  w <- nrbm(ladRegressionLoss(X/100,Y/100),maxCP=50)
  barplot(as.vector(w))

bmrm documentation built on May 2, 2019, 2:49 p.m.