chooseB | R Documentation |
chooseB plots the proportion of times an explanatory variable is selected as a function of the number of iterations (B).
chooseB(
res.varselbest,
plotvar = NULL,
linewidth = 1,
linetype = "dotdash",
xlab = "B",
ylab = "Proportion",
nrow = 2,
ncol = 2,
graph = TRUE
)
res.varselbest | an output from the varselbest function
plotvar | index of the variables for which a curve is plotted
linewidth | a numerical value setting the width of the lines
linetype | the line type used to draw the curves. Default value is "dotdash".
xlab | a title for the x axis
ylab | a title for the y axis
nrow | argument of gtable (number of rows in the plot grid). Default value is 2.
ncol | argument of gtable (number of columns in the plot grid). Default value is 2.
graph | a boolean. If FALSE, no graphics are plotted. Default value is TRUE.
varselbest performs variable selection on random subsets of variables and then combines the results to recover which explanatory variables are related to the response, following Bar-Hen and Audigier (2022) <doi:10.1080/00949655.2022.2070621>.
More precisely, the outline of the algorithm is as follows: consider a random subset of sizeblock variables among the p variables. Any variable selection scheme can then be applied to this subset. By drawing such a subset of size sizeblock among the p variables B times, we can count how many times a variable is considered significantly related to the response and how many times it is not.
The number of iterations B should be large enough that the proportion of times a variable is selected becomes stable. chooseB plots these proportions against the number of iterations.
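The idea of a proportion that settles down as B grows can be illustrated without the package. The following is only a minimal sketch on simulated data: the selection rates in sel_prob and the variable names x1, x2, x3 are invented, and the sizeblock subsampling step is ignored, so this is not the computation performed by varselbest or chooseB.

```r
# Simulated stand-in for selection results: one row per iteration, one column
# per variable, 1 when the variable was selected at that iteration.
set.seed(1)
B <- 200
sel_prob <- c(x1 = 0.9, x2 = 0.5, x3 = 0.1)  # hypothetical selection rates
selected <- sapply(sel_prob, function(p) rbinom(B, size = 1, prob = p))

# Running proportion of selections after 1, 2, ..., B iterations.
# Dividing by seq_len(B) recycles down each column, so row i is divided by i.
running_prop <- apply(selected, 2, cumsum) / seq_len(B)

# One curve per variable, using the same axis labels as the chooseB defaults.
matplot(seq_len(B), running_prop, type = "l", lty = "dotdash",
        xlab = "B", ylab = "Proportion")
legend("right", legend = colnames(running_prop), lty = "dotdash", col = 1:3)
```

In a plot of this kind the curves fluctuate strongly for small B and flatten as B increases, which is the behaviour to look for when judging whether B is large enough.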
a list of matrices where each row corresponds to the vector of proportions (for all explanatory variables) obtained for a given value of B
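A matrix of this shape can also be checked numerically rather than by eye. The sketch below works on a made-up matrix (rows indexed by B, columns by variable). The values, the variable names and the tolerance tol are assumptions chosen for illustration, and extracting such a matrix from the actual return value is not shown.

```r
# Hypothetical matrix of proportions: one row per value of B,
# one column per explanatory variable (values invented for illustration).
prop <- rbind(
  c(0.50, 1.00, 0.00),
  c(0.70, 0.80, 0.10),
  c(0.78, 0.84, 0.12),
  c(0.80, 0.85, 0.12),
  c(0.80, 0.85, 0.13)
)
dimnames(prop) <- list(B = c("10", "20", "40", "60", "80"),
                       variable = c("x1", "x2", "x3"))

# Largest absolute change in any proportion between consecutive values of B.
max_change <- apply(abs(diff(prop)), 1, max)

# First value of B from which every remaining change stays within the tolerance.
tol <- 0.025
stable <- rev(cumprod(rev(max_change <= tol))) == 1
stable_from <- if (any(stable)) names(max_change)[which(stable)[1]] else NA
print(max_change)
print(stable_from)
```

The tolerance is a judgment call. A stricter value requires more iterations before the proportions are declared stable.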
Bar-Hen, A. and Audigier, V., An ensemble learning method for variable selection: application to high dimensional data and missing values, Journal of Statistical Computation and Simulation, <doi:10.1080/00949655.2022.2070621>, 2022.
varselbest
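library(clusterMI) # provides wine, prodna, varselbest and chooseB
require(parallel)
data(wine)

ref <- wine$cult   # true partition (cultivar), kept for reference
nb.clust <- 3      # number of clusters

# Remove the class label, then add missing values to the data
wine.na <- wine
wine.na$cult <- NULL
wine.na <- prodna(wine.na)

nnodes <- 2 # Number of CPU cores for parallel computing
B <- 80     # Number of iterations for variable selection

# Variable selection for the incomplete variable "alco"
res.varsel <- varselbest(data.na = wine.na,
                         listvar = "alco",
                         B = B,
                         nnodes = nnodes,
                         nb.clust = nb.clust,
                         graph = FALSE)

# Convergence: proportion of selections as a function of B
res.chooseB <- chooseB(res.varsel)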