EPSGO: Efficient Parameter Selection via Global Optimization
In mwsill/c060: Extended Inference for Lasso and Elastic-Net Regularized Cox and Generalized Linear Models

Description Usage Arguments Details Value Author(s) References Examples

Finds an optimal solution for the Q.func function.

epsgo(Q.func, bounds,  round.n=5, parms.coding="none",
  fminlower=0, flag.find.one.min =FALSE,
  show=c("none", "final", "all"), N= NULL, maxevals = 500,
  pdf.name=NULL,  pdf.width=12,  pdf.height=12, my.mfrow=c(1,1),
  verbose=TRUE, seed=123,  ...  )

`Q.func`	name of the function to be minimized.
`bounds`	bounds for parameters
`round.n`	number of digits after comma, default: 5
`parms.coding`	parmeters coding: none or log2, default: none.
`fminlower`	minimal value for the function Q.func, default is 0.
`flag.find.one.min`	do you want to find one min value and stop? Default: FALSE
`show`	show plots of DIRECT algorithm: none, final iteration, all iterations. Default: none
`N`	define the number of start points, see details.
`maxevals`	the maximum number of DIRECT function evaluations, default: 500.
`pdf.name`	pdf name
`pdf.width`	default: 12
`pdf.height`	default: 12
`my.mfrow`	default: c(1,1)
`verbose`	verbose? default: TRUE.
`seed`	seed
`...`	additional argument(s)

if the number of start points (N) is not defined by the user, it will be defined dependent on the dimensionality of the parameter space. N=10D+1, where D is the number of parameters, but for high dimensional parameter space with more than 6 dimensions, the initial set is restricted to 65. However for one-dimensional parameter space the N is set to 21 due to stability reasons.

The idea of EPSGO (Efficient Parameter Selection via Global Optimization): Beginning from an intial Latin hypercube sampling containing N starting points we train an Online GP, look for the point with the maximal expected improvement, sample there and update the Gaussian Process(GP). Thereby it is not so important that GP really correctly models the error surface of the SVM in parameter space, but that it can give a us information about potentially interesting points in parameter space where we should sample next. We continue with sampling points until some convergence criterion is met.

DIRECT is a sampling algorithm which requires no knowledge of the objective function gradient. Instead, the algorithm samples points in the domain, and uses the information it has obtained to decide where to search next. The DIRECT algorithm will globally converge to the maximal value of the objective function. The name DIRECT comes from the shortening of the phrase 'DIviding RECTangles', which describes the way the algorithm moves towards the optimum.

The code source was adopted from MATLAB originals, special thanks to Holger Froehlich.

`fmin`	minimal value of Q.func on the interval defined by bounds.
`xmin`	corresponding parameters for the minimum
`iter`	number of iterations
`neval`	number of visited points
`maxevals`	the maximum number of DIRECT function evaluations
`seed`	seed
`bounds`	bounds for parameters
`Q.func`	name of the function to be minimized.
`points.fmin`	the set of points with the same fmin
`Xtrain`	visited points
`Ytrain`	the output of Q.func at visited points Xtrain
`gp.seed`	seed for Gaussian Process
`model.list`	detailed information of the search process

Natalia Becker natalia.becker at dkfz.de

Froehlich, H. and Zell, A. (2005) "Effcient parameter selection for support vector machines in classification and regression via model-based global optimization" In Proc. Int. Joint Conf. Neural Networks, 1431-1438 .

Sill M., Hielscher T., Becker N. and Zucknick M. (2014), c060: Extended Inference with Lasso and Elastic-Net Regularized Cox and Generalized Linear Models, Journal of Statistical Software, Volume 62(5), pages 1–22. http://www.jstatsoft.org/v62/i05/

## Not run: 
set.seed(1010)
n=1000;p=100
nzc=trunc(p/10)
x=matrix(rnorm(n*p),n,p)
beta=rnorm(nzc)
fx= x[,seq(nzc)] %*% beta
eps=rnorm(n)*5
y=drop(fx+eps)
px=exp(fx)
px=px/(1+px)
ly=rbinom(n=length(px),prob=px,size=1)
set.seed(1011)

 
nfolds = 10
set.seed(1234)
foldid <- balancedFolds(class.column.factor=y.classes, cross.outer=nfolds)

# y - binomial
y.classes<-ifelse(y>= median(y),1, 0)
bounds <- t(data.frame(alpha=c(0, 1)))
colnames(bounds)<-c("lower","upper")
 
fit <- epsgo(Q.func="tune.glmnet.interval", 
             bounds=bounds, 
             parms.coding="none", 
             seed = 1234, 
             show="none",
             fminlower = -100,
             x = x, y = y.classes, family = "binomial", 
             foldid = foldid,
             type.min = "lambda.1se",
             type.measure = "mse")
summary(fit)

# y - multinomial: low - low 25%, middle - (25,75)-quantiles, high - larger 75%.
y.classes<-ifelse(y <= quantile(y,0.25),1, ifelse(y >= quantile(y,0.75),3, 2))
bounds <- t(data.frame(alpha=c(0, 1)))
colnames(bounds)<-c("lower","upper")
 
fit <- epsgo(Q.func="tune.glmnet.interval", 
             bounds=bounds, 
             parms.coding="none", 
             seed = 1234, 
             show="none",
             fminlower = -100,
             x = x, y = y.classes, family = "multinomial", 
             foldid = foldid,
             type.min = "lambda.1se",
             type.measure = "mse")
summary(fit)

##poisson
N=500; p=20
nzc=5
x=matrix(rnorm(N*p),N,p)
beta=rnorm(nzc)
f = x[,seq(nzc)]
mu=exp(f)
y.classes=rpois(N,mu)

nfolds = 10
set.seed(1234)
foldid <- balancedFolds(class.column.factor=y.classes, cross.outer=nfolds)


fit <- epsgo(Q.func="tune.glmnet.interval", 
             bounds=bounds, 
             parms.coding="none", 
             seed = 1234, 
             show="none",
             fminlower = -100,
             x = x, y = y.classes, family = "poisson", 
             foldid = foldid,
             type.min = "lambda.1se",
             type.measure = "mse")
summary(fit)

#gaussian
set.seed(1234)
x=matrix(rnorm(100*1000,0,1),100,1000)
y <- x[1:100,1:1000]%*%c(rep(2,5),rep(-2,5),rep(.1,990))

foldid <- rep(1:10,each=10)

fit <- epsgo(Q.func="tune.glmnet.interval", 
             bounds=bounds, 
             parms.coding="none", 
             seed = 1234, 
             show="none",
             fminlower = -100,
             x = x, y = y, family = "gaussian", 
             foldid = foldid,
             type.min = "lambda.1se",
             type.measure = "mse")
summary(fit)  

# y - cox in vingette

## End(Not run)

mwsill/c060 documentation built on Nov. 26, 2019, 12:14 a.m.

mwsill/c060 index

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

mwsill/c060
Extended Inference for Lasso and Elastic-Net Regularized Cox and Generalized Linear Models

EPSGO: Efficient Parameter Selection via Global Optimization
In mwsill/c060: Extended Inference for Lasso and Elastic-Net Regularized Cox and Generalized Linear Models

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Related to EPSGO in mwsill/c060...

R Package Documentation

Browse R Packages

We want your feedback!

mwsill/c060 Extended Inference for Lasso and Elastic-Net Regularized Cox and Generalized Linear Models

EPSGO: Efficient Parameter Selection via Global Optimization In mwsill/c060: Extended Inference for Lasso and Elastic-Net Regularized Cox and Generalized Linear Models

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Related to EPSGO in mwsill/c060...

R Package Documentation

Browse R Packages

We want your feedback!

mwsill/c060
Extended Inference for Lasso and Elastic-Net Regularized Cox and Generalized Linear Models

EPSGO: Efficient Parameter Selection via Global Optimization
In mwsill/c060: Extended Inference for Lasso and Elastic-Net Regularized Cox and Generalized Linear Models