fastboost: Fastboost

View source: R/autoboost.R

fastboost    R Documentation

Fastboost

Description

All-in-one use of selectboost that avoids redundant fitting of distributions and saves some memory.

Usage

fastboost(
  X,
  Y,
  ncores = 4,
  group = group_func_1,
  func = lasso_msgps_AICc,
  corrfunc = "cor",
  use.parallel = FALSE,
  B = 100,
  step.num = 0.1,
  step.limit = "none",
  verbose = FALSE,
  step.scale = "quantile",
  normalize = TRUE,
  steps.seq = NULL,
  debug = FALSE,
  version = "lars",
  c0lim = TRUE,
  ...
)

Arguments

X

Numerical matrix. Matrix of the variables.

Y

Numerical vector or factor. Response vector.

ncores

Numerical value. Number of cores for parallel computing. Defaults to 4.

group

Function. The grouping function. Defaults to group_func_1.

func

Function. The variable selection function. Defaults to lasso_msgps_AICc.

corrfunc

Character value or function. Used to compute associations between the variables. Defaults to "cor".

use.parallel

Boolean. Whether to use parallel computing (doMC); this requires the extended package, available from GitHub. Defaults to FALSE.

B

Numerical value. Number of resampled fits of the model. Defaults to 100.

step.num

Numerical value. Step value for the c0 sequence. Defaults to 0.1.

step.limit

Defaults to "none".

verbose

Boolean. Defaults to FALSE.

step.scale

Character value. How to compute the c0 sequence if not user-provided: one of "quantile", "linear", "zoom_l", "zoom_q" or "mixed". Defaults to "quantile".

normalize

Boolean. Shall the X matrix be centered and scaled? Defaults to TRUE.

steps.seq

Numeric vector. User provided sequence of c0 values to use. Defaults to NULL.

debug

Boolean. Whether to return additional results. Defaults to FALSE.

version

Character value. Passed to the boost.select function. Defaults to "lars".

c0lim

Boolean. Shall the c0=0 and c0=1 values be used? Defaults to TRUE.

...

Arguments passed to the variable selection function used in boost.apply.
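The step.num and step.scale arguments control how the sequence of c0 thresholds is built. The base-R sketch below shows one plausible reading of the "quantile" and "linear" scales. It is an illustration under stated assumptions, not the package's internal code: the variable names (pairwise, c0_quantile, c0_linear) are hypothetical, and the exact grid used by fastboost may differ.

```r
# Illustrative base-R sketch; not taken from the SelectBoost sources.
set.seed(314)
X <- matrix(rnorm(75), 15, 5)

# Absolute pairwise correlations, as corrfunc = "cor" would measure them.
cormat <- abs(cor(X))
pairwise <- cormat[upper.tri(cormat)]

step <- 0.1  # plays the role of step.num (assumption)

# "quantile"-like thresholds: follow the empirical distribution of |cor|,
# so thresholds are denser where many correlations lie.
c0_quantile <- sort(
  unique(quantile(pairwise, probs = seq(0, 1, by = step), names = FALSE)),
  decreasing = TRUE
)

# "linear"-like thresholds: evenly spaced between the largest and
# smallest observed |cor|, whatever the distribution looks like.
c0_linear <- seq(max(pairwise), min(pairwise), length.out = length(c0_quantile))

print(round(c0_quantile, 3))
print(round(c0_linear, 3))
```

A quantile-based grid adapts to the data, which is presumably why it is the default; a user-supplied steps.seq bypasses this computation entirely.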

Details

fastboost returns a numeric matrix. For each variable (column) and each c0 value (row), the entry is the proportion of times that the variable was selected among the B resampled fits of the model. Fitting to a given group of variables is only performed once (even if the same group occurs for another value of c0), which greatly speeds up the algorithm. To limit memory usage, fastboost stores the group memberships in a compact form, which is especially useful with the community grouping function and fairly big datasets.
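The "fit each group only once" idea can be sketched in base R as a small cache keyed by group membership. This is a minimal illustration, not the package's implementation: group_of, fit_group and the colMeans placeholder "fit" are all hypothetical stand-ins, and the grouping rule (absolute correlation at least c0) is assumed for the example.

```r
# Illustrative base-R sketch; all helper names are hypothetical.
set.seed(314)
X <- matrix(rnorm(75), 15, 5)
cormat <- abs(cor(X))
c0_seq <- c(1, 0.5, 0.25, 0)

# Group of variable j at threshold c0: j itself plus every variable
# whose absolute correlation with j is at least c0.
group_of <- function(j, c0) sort(union(j, which(cormat[j, ] >= c0)))

cache <- new.env()  # one entry per distinct group
n_fits <- 0         # counts how many "fits" were actually computed

fit_group <- function(idx) {
  key <- paste(idx, collapse = "-")  # compact label for the group
  if (!exists(key, envir = cache, inherits = FALSE)) {
    n_fits <<- n_fits + 1
    # Placeholder for the expensive distribution fit on this group.
    assign(key, colMeans(X[, idx, drop = FALSE]), envir = cache)
  }
  get(key, envir = cache, inherits = FALSE)
}

# Request a fit for every (c0, variable) pair; repeated groups hit the cache.
for (c0 in c0_seq) {
  for (j in seq_len(ncol(X))) fit_group(group_of(j, c0))
}

n_requests <- length(c0_seq) * ncol(X)
cat("fits computed:", n_fits, "out of", n_requests, "requests\n")
```

At c0 = 0 every variable belongs to the same group, so those five requests cost a single fit. Storing one short key per group, rather than a full membership matrix per c0, is in the same spirit as the compact storage described above.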

Value

A numeric matrix with attributes.
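To make the shape of the returned object concrete, the base-R sketch below builds a matrix with the same layout: one row per c0 value, one column per variable, each entry a selection proportion over B fits. The selection indicators are random placeholders rather than real model fits, and the attribute name "c0.seq" is a hypothetical choice for the illustration; call attributes() on an actual fastboost result to see what the package stores.

```r
# Illustrative base-R sketch of the result layout; the data are placeholders.
set.seed(314)
B <- 100                      # number of resampled fits
p <- 5                        # number of variables
c0_seq <- c(1, 0.9, 0.5, 0)   # thresholds, one per row of the result

# For each c0, draw a B x p matrix of 0/1 "selected" indicators and
# average over the B rows to obtain selection proportions.
res <- t(sapply(c0_seq, function(c0) {
  selected <- matrix(rbinom(B * p, size = 1, prob = 0.5), nrow = B, ncol = p)
  colMeans(selected)
}))
dimnames(res) <- list(paste0("c0 = ", c0_seq), paste0("V", seq_len(p)))

# Attach metadata as an attribute (hypothetical attribute name).
attr(res, "c0.seq") <- c0_seq

print(res)
```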

Author(s)

Frederic Bertrand, frederic.bertrand@utt.fr

References

selectBoost: a general algorithm to enhance the performance of variable selection methods in correlated datasets, Frédéric Bertrand, Ismaïl Aouadi, Nicolas Jung, Raphael Carapito, Laurent Vallat, Seiamak Bahram, Myriam Maumy-Bertrand, Bioinformatics, 2020. doi: 10.1093/bioinformatics/btaa855

See Also

boost, autoboost, plot.selectboost

Other Selectboost functions: autoboost(), boost, plot_selectboost_cascade, selectboost_cascade

Examples

set.seed(314)
xran=matrix(rnorm(75),15,5)
ybin=sample(0:1,15,replace=TRUE)
yran=rnorm(15)
set.seed(314)
# Quick test only, not meaningful: real runs should use a larger value of B.
# Parallel computing is disabled here as well.
res.fastboost <- fastboost(xran,yran,B=3,use.parallel=FALSE)


fastboost(xran,yran)
#Customize resampling levels
fastboost(xran,yran,steps.seq=c(.99,.95,.9),c0lim=FALSE)
fastboost(xran,yran,step.scale="mixed",c0lim=TRUE)
fastboost(xran,yran,step.scale="zoom_l",c0lim=FALSE)
fastboost(xran,yran,step.scale="zoom_l",step.num = c(1,.9,.01),c0lim=FALSE)
fastboost(xran,yran,step.scale="zoom_q",c0lim=FALSE)
fastboost(xran,yran,step.scale="linear",c0lim=TRUE)
fastboost(xran,yran,step.scale="quantile",c0lim=TRUE)

#Binary logistic regression
fastboost(xran,ybin,func=lasso_cv_glmnet_bin_min)



SelectBoost documentation built on Dec. 1, 2022, 1:27 a.m.