BOSO: BOSO and associated functions


View source: R/BOSO.R

Description

Fits a ridge linear regression on the features selected by the BOSO MILP feature selection model. The 'cplexAPI' package is required to run it.

Usage

BOSO(
  x,
  y,
  xval,
  yval,
  IC = "eBIC",
  IC.blocks = NULL,
  nlambda = 100,
  nlambda.blocks = 10,
  lambda.min.ratio = ifelse(nrow(x) < ncol(x), 0.01, 1e-04),
  lambda = NULL,
  intercept = TRUE,
  standardize = TRUE,
  dfmax = NULL,
  maxVarsBlock = 10,
  costErrorVal = 1,
  costErrorTrain = 0,
  costVars = 0,
  Threads = 0,
  timeLimit = 1e+75,
  verbose = F,
  seed = NULL,
  warmstart = F,
  TH_IC = 0.001,
  indexSelected = NULL
)

Arguments

x

Input matrix, of dimension 'n' x 'p'. This is the data from the training partition. It is recommended to be of class "matrix".

y

Response variable for the training dataset. A matrix of one column or a vector, with 'n' elements.

xval

Input matrix, of dimension 'nval' x 'p'. This is the data from the validation partition. It is recommended to be of class "matrix".

yval

Response variable for the validation dataset. A matrix of one column or a vector, with 'nval' elements.

IC

Information criterion to be used. Default is 'eBIC'.

IC.blocks

Information criterion to be used in the block strategy. Default is the same as IC, except that eBIC uses BIC for the block strategy.

nlambda

The number of lambda values. Default is 100.

nlambda.blocks

The number of lambda values in the block strategy part. Default is 10.

lambda.min.ratio

Smallest value for lambda, as a fraction of lambda.max, the (data derived) entry value.

lambda

A user supplied lambda sequence. Typical usage is to have the program compute its own lambda sequence based on nlambda and lambda.min.ratio. Supplying a value of lambda overrides this. WARNING: use with care.

intercept

Boolean variable indicating whether an intercept should be added. Default is TRUE.

standardize

Boolean variable indicating whether the data should be scaled according to mean(x), mean(y) and sd(x). Default is TRUE.

dfmax

Maximum number of variables to be included in the problem. The intercept is not included in this number. Default is min(p,n).

maxVarsBlock

Maximum number of variables in the block strategy. Default is 10.

costErrorVal

Cost of error of the validation set in the objective function. Default is 1. WARNING: use with care, changing this value changes the formulation presented in the main article.

costErrorTrain

Cost of error of the training set in the objective function. Default is 0. WARNING: use with care, changing this value changes the formulation presented in the main article.

costVars

Cost of new variables in the objective function. Default is 0. WARNING: use with care, changing this value changes the formulation presented in the main article.

Threads

CPLEX parameter, number of cores that CPLEX is allowed to use. Default is 0 (automatic).

timeLimit

CPLEX parameter, time limit per problem provided to CPLEX. Default is 1e75 (infinite time).

verbose

Print progress. Different levels are available: 1) print simple progress; 2) print the result of each block; 3) print each k in the blocks. Default is FALSE.

seed

Seed for the random number generator used in the block strategy. Default is NULL, which keeps the system default.

warmstart

Whether to warm-start CPLEX (TRUE) or use a different problem for each k (FALSE). Default is FALSE.

TH_IC

Stopping threshold: the relative increase (as a fraction of one) in the information criterion that stops the search. Default is 1e-3.

indexSelected

array of pre-selected variables. WARNING: debug feature.
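
The nlambda and lambda.min.ratio arguments above describe a grid of lambda values bounded by lambda.max. As an illustration only, the sketch below builds a log-spaced grid, which is the usual convention in penalized regression packages. How BOSO constructs its grid internally is not stated here, so the helper make_lambda_grid and the value lambda.max = 10 are hypothetical.

```r
# Hypothetical helper, not part of BOSO: build a log-spaced lambda grid
# running from lambda.max down to lambda.max * lambda.min.ratio.
make_lambda_grid <- function(lambda.max, nlambda = 100, lambda.min.ratio = 1e-04) {
  stopifnot(lambda.max > 0, nlambda >= 2,
            lambda.min.ratio > 0, lambda.min.ratio < 1)
  # Equally spaced on the log scale, largest value first
  exp(seq(log(lambda.max), log(lambda.max * lambda.min.ratio),
          length.out = nlambda))
}

x <- matrix(rnorm(50 * 200), nrow = 50)           # n = 50 < p = 200
ratio <- ifelse(nrow(x) < ncol(x), 0.01, 1e-04)   # same rule as the default in Usage
grid <- make_lambda_grid(lambda.max = 10,         # placeholder value (assumption)
                         nlambda = 5,
                         lambda.min.ratio = ratio)
print(grid)
```

A vector built this way could be passed through the lambda argument, which overrides nlambda and lambda.min.ratio.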

Details

Computes the BOSO solution using one block. This function calls cplexAPI to solve the optimization problem.
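
Because the solver is reached through cplexAPI, it can help to check that the package is available before calling BOSO. The snippet below is a minimal sketch using base R's requireNamespace; the variable name has_cplex and the message wording are illustrative.

```r
# Check whether cplexAPI is installed, without attaching it.
has_cplex <- requireNamespace("cplexAPI", quietly = TRUE)

if (!has_cplex) {
  # Only reports the problem; it does not stop the session.
  message("Package 'cplexAPI' is not installed; it is required to run BOSO().")
}
```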

Value

A 'BOSO' object which contains the following information:

betas

estimated betas

x

training x set used in BOSO (input parameter)

y

training y set used in BOSO (input parameter)

xval

validation x set used in BOSO (input parameter)

yval

validation y set used in BOSO (input parameter)

nlambda

nlambda used by 'BOSO' (input parameter)

intercept

if 'BOSO' has used intercept (input parameter)

standardize

if 'BOSO' has used standardization (input parameter)

mx

Mean value of each variable. 0 if the data has not been standardized

sx

Standard deviation value of each variable. 0 if the data has not been standardized

my

Mean value of the output variable. 0 if the data has not been standardized

dfmax

Maximum number of variables set to be used by 'BOSO' (input parameter)

result.final

list with the results of the final problem for each K

errorTrain

error in training set in the final problem

errorVal

error in the validation set in the final problem

lambda.selected

lambda selected in the final problem

p

number of initial variables

n

number of events in the training set

nval

number of events in the validation set

blockStrategy

index of variables which were stored in each iteration by 'BOSO' in the block strategy
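
The mx, sx and my entries above correspond to the quantities named in the standardize argument: mean(x), sd(x) and mean(y). The sketch below shows, in base R on simulated data, how such values can be computed and how coefficients fitted on the standardized scale map back to the original scale. The coefficient vector beta_std is made up for illustration, and the way BOSO applies these quantities internally is an assumption.

```r
set.seed(1)
x <- matrix(rnorm(20 * 3, mean = 5, sd = 2), nrow = 20)
y <- rnorm(20, mean = 10)

mx <- colMeans(x)        # mean of each column of x
sx <- apply(x, 2, sd)    # standard deviation of each column of x
my <- mean(y)            # mean of the response

x_std <- scale(x, center = mx, scale = sx)   # centred and scaled predictors
y_std <- y - my                              # centred response

# Hypothetical coefficients on the standardized scale (illustration only)
beta_std <- c(0.5, 0, -1.2)

# Map back to the original scale of x and y
beta_orig <- beta_std / sx
intercept <- my - sum(mx * beta_orig)

pred_std  <- my + as.vector(x_std %*% beta_std)
pred_orig <- intercept + as.vector(x %*% beta_orig)
```

Both prediction vectors agree, which is the point of storing mx, sx and my alongside the coefficients.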

Author(s)

Luis V. Valcarcel

Examples

  # Basic example of how to run BOSO
  library(BOSO)

  data("sim.xy", package = "BOSO")
  obj <- BOSO(x = sim.xy[['low']]$x,
              y = sim.xy[['low']]$y,
              xval = sim.xy[['low']]$xval,
              yval = sim.xy[['low']]$yval,
              IC = 'eBIC',
              nlambda = 50,
              intercept = FALSE, standardize = FALSE,
              Threads = 1, verbose = 3, seed = 2021)
  coef(obj)  # extract coefficients at a single value of lambda
  predict(obj, newx = sim.xy[['low']]$x[1:20, ])  # make predictions
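
The example above uses the training and validation partitions shipped in sim.xy. With your own data, the four inputs x, y, xval and yval have to be prepared beforehand. The following base R sketch is one way to do it on simulated data. The 50/50 split, the simulated model and the names x_all, y_all and idx_train are illustrative assumptions, not something the package prescribes.

```r
set.seed(2021)
n_total <- 100
p <- 10

# Simulated data standing in for a real dataset
x_all <- matrix(rnorm(n_total * p), nrow = n_total, ncol = p)
y_all <- x_all[, 1] - 2 * x_all[, 2] + rnorm(n_total)

# Random split into training and validation partitions
idx_train <- sample(seq_len(n_total), size = floor(0.5 * n_total))

x    <- x_all[idx_train, , drop = FALSE]    # 'n' x 'p' training matrix
y    <- y_all[idx_train]                    # 'n' training responses
xval <- x_all[-idx_train, , drop = FALSE]   # validation matrix
yval <- y_all[-idx_train]                   # validation responses
```

These four objects can then be passed to BOSO in place of the sim.xy entries.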
  

BOSO documentation built on July 1, 2021, 9:08 a.m.