View source: R/buildUBaymodel.R
build.UBaymodel | R Documentation |
Build a data structure for UBayFS and train an ensemble of elementary feature selectors.
build.UBaymodel(
  data,
  target,
  M = 100,
  tt_split = 0.75,
  nr_features = "auto",
  method = "mRMR",
  prior_model = "dirichlet",
  weights = 1,
  constraints = NULL,
  lambda = 1,
  optim_method = "GA",
  popsize = 50,
  maxiter = 100,
  shiny = FALSE,
  ...
)
data |
a matrix of input data |
target |
a vector of input labels; for binary problems a factor variable should be used |
M |
the number of elementary models to be trained in the ensemble |
tt_split |
the ratio of samples drawn for building an elementary model (train-test-split) |
nr_features |
number of features to select in each elementary model; if 'auto', a randomized number of features is used in each elementary model |
method |
a vector denoting the method(s) used as elementary models; options: 'mRMR', 'laplace' (Laplacian score). Self-defined functions can also be used as methods; they must accept the arguments X (data), y (target), n (number of features) and name (name of the function). For more details, see the examples. |
prior_model |
a string denoting the prior model to use; options: 'dirichlet', 'wong', 'hankin'; 'hankin' is the most general prior model, but also the most time-consuming |
weights |
the vector of user-defined prior weights for each feature |
constraints |
a list containing a relaxed system 'Ax<=b' of user constraints, given as matrix 'A', vector 'b' and vector or scalar 'rho' (relaxation parameter). At least one max-size constraint must be included. For details, see buildConstraints. |
lambda |
a positive scalar denoting the overall strength of the constraints |
optim_method |
the method to evaluate the posterior distribution. Currently, only the option 'GA' (genetic algorithm) is supported. |
popsize |
size of the initial population of the genetic algorithm for model optimization |
maxiter |
maximum number of iterations of the genetic algorithm for model optimization |
shiny |
TRUE indicates that the function is called from the Shiny dashboard |
... |
additional arguments |
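The 'Ax<=b' form required by the constraints argument can be illustrated with a small base R sketch. It assumes that a max-size constraint is encoded as a single row of ones in 'A' with the maximum size in 'b'; the list below and the helper satisfies() are hypothetical and only show the shape of such a system. In practice, the constraint object should be created with buildConstraints.

```r
# Hypothetical hand-built constraint system, for illustration only.
# Assumption: "at most max_size features" is written as sum(x) <= max_size.
n_features <- 5
max_size <- 2

A <- matrix(1, nrow = 1, ncol = n_features)  # one row of ones
b <- max_size                                # right-hand side
rho <- 1                                     # relaxation parameter (not used below)
constraints <- list(A = A, b = b, rho = rho)

# Hypothetical helper: does a binary feature-membership vector x
# satisfy every row of A x <= b (ignoring the relaxation)?
satisfies <- function(x, constraints) {
  all(constraints$A %*% x <= constraints$b)
}

ok <- satisfies(c(1, 0, 1, 0, 0), constraints)       # two features selected
too_big <- satisfies(c(1, 1, 1, 0, 0), constraints)  # three features selected
```

Here ok is TRUE and too_big is FALSE, because only the first candidate stays within the size limit of two features.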
The function aggregates the input parameters for UBayFS, including the data, the parameters defining the ensemble and the user knowledge, and the parameters specifying the optimization procedure, and trains the ensemble model.
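The roles of M, tt_split and nr_features in the ensemble can be sketched in base R. This is a simplified illustration, not the package's implementation: it assumes that each elementary model is trained on a random subsample of size tt_split times the number of rows and that selections are tallied per feature. The selector select_top() is a hypothetical stand-in for 'mRMR' or 'laplace', and the simulated data are made up for the example.

```r
set.seed(1)

# Simulated data: 100 samples, 6 features, target driven mainly by f1
n <- 100
p <- 6
X <- matrix(rnorm(n * p), nrow = n, dimnames = list(NULL, paste0("f", 1:p)))
y <- X[, 1] + 0.5 * X[, 2] + rnorm(n, sd = 0.1)

M <- 20            # number of elementary models
tt_split <- 0.75   # fraction of samples drawn per elementary model
nr_features <- 2   # features selected per elementary model

# Hypothetical elementary selector: top features by absolute correlation
select_top <- function(X, y, n) {
  score <- abs(cor(X, y))[, 1]
  order(score, decreasing = TRUE)[seq_len(n)]
}

# Train M elementary selectors on random subsamples and count selections
counts <- integer(p)
for (m in seq_len(M)) {
  idx <- sample(nrow(X), size = floor(tt_split * nrow(X)))
  sel <- select_top(X[idx, , drop = FALSE], y[idx], nr_features)
  counts[sel] <- counts[sel] + 1L
}
names(counts) <- colnames(X)
counts
```

Features that are selected in many subsamples accumulate high counts, which is the kind of evidence an ensemble feature selector can combine with prior weights and constraints.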
a 'UBaymodel' object containing the following list elements:
'data' - the input dataset
'target' - the input target
'lambda' - the input lambda value (constraint strength)
'prior_model' - the chosen prior model
'ensemble.params' - information about input and output of ensemble feature selection
'constraint.params' - parameters representing the constraints
'user.params' - parameters representing the user's prior knowledge
'optim.params' - optimization parameters
# build a UBayFS model using Breast Cancer Wisconsin dataset
data(bcw) # dataset
c <- buildConstraints(constraint_types = 'max_size',
                      constraint_vars = list(10),
                      num_elements = ncol(bcw$data),
                      rho = 1) # prior constraints
w <- rep(1, ncol(bcw$data)) # weights

model <- build.UBaymodel(
  data = bcw$data,
  target = bcw$labels,
  M = 20,
  constraints = c,
  weights = w
)

# use a function computing a decision tree as input
library('rpart')
decision_tree <- function(X, y, n, name = 'tree'){
  rf_data = as.data.frame(cbind(y, X))
  colnames(rf_data) <- make.names(colnames(rf_data))
  tree = rpart::rpart(y~., data = rf_data)
  return(list(ranks= which(colnames(X) %in% names(tree$variable.importance)[1:n]),
              name = name))
}

model <- build.UBaymodel(
  data = bcw$data,
  target = bcw$labels,
  constraints = c,
  weights = w,
  method = decision_tree
)

# include block-constraints
c_block <- buildConstraints(constraint_types = 'max_size',
                            constraint_vars = list(2),
                            num_elements = length(bcw$blocks),
                            rho = 10,
                            block_list = bcw$blocks)

model <- setConstraints(model, c_block)
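As a further illustration of the self-defined method interface (arguments X, y, n and name), the following base R sketch ranks features by absolute correlation with the target and needs no additional packages. The function cor_selector() and the simulated data are hypothetical; the return value, a list with the elements 'ranks' (column indices of the selected features) and 'name', is modeled on the decision tree example above and should be treated as an assumption about what build.UBaymodel expects.

```r
# Hypothetical custom selector following the documented argument list
cor_selector <- function(X, y, n, name = "abs_cor") {
  y_num <- as.numeric(y)                          # factor labels become integer codes
  score <- abs(stats::cor(X, y_num))[, 1]         # one score per column of X
  ranks <- order(score, decreasing = TRUE)[seq_len(n)]
  list(ranks = ranks, name = name)
}

# Quick check on simulated data in which only f3 determines the label
set.seed(42)
X <- matrix(rnorm(200), nrow = 50, dimnames = list(NULL, paste0("f", 1:4)))
y <- factor(ifelse(X[, 3] > 0, "pos", "neg"))

res <- cor_selector(X, y, n = 2)
res$ranks
```

Under the assumption above, such a function could be passed as method = cor_selector in the same way as decision_tree.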