View source: R/SOptim_ClassificationFunctions.R
generateDefaultClassifierParams | R Documentation |
An auxiliary function that generates a list of default parameters for the available classification algorithms: Random Forests (RF), K-nearest neighbour (KNN), Flexible Discriminant Analysis (FDA), Support Vector Machines (SVM) and Generalized Boosted Model (GBM).
generateDefaultClassifierParams(x)
x | The dataset used for classification (required because some classifier hyperparameters are calculated from the number of columns/variables in the data).
A nested list object of class classificationParamList with parameters for the available algorithms, namely:
RF - Random Forest parameters:
mtry is equal to floor(sqrt(ncol(x)-2)) and defines the number of variables randomly sampled as candidates at each split
ntree equals 250 (by default) and is the number of trees to grow
KNN - K-nearest neighbour parameters:
k is equal to 5 and is the number of neighbours considered
FDA - Flexible Discriminant Analysis with MDA-MARS parameters:
degree equals 1 and is an optional integer specifying the maximum interaction degree
penalty is equal to 2 and is an optional value specifying the cost charged per degree of freedom
prune is set to TRUE and is an optional logical value specifying whether the model should be pruned in a backward stepwise fashion
SVM - Support Vector Machine (with radial-basis kernel) parameters:
gamma equals 1/(ncol(x)-2) and sets the parameter needed for all kernels except linear
cost is equal to 1 and defines the cost of constraints violation; it is the 'C'-constant of the regularization term in the Lagrange formulation
probability is equal to TRUE and defines the output type (the fitted model allows probability predictions)
GBM - Generalized Boosted Modeling parameters:
n.trees is set to 250 and defines the total number of trees to fit
interaction.depth is equal to 1 and defines the maximum depth of variable interactions. 1 implies an additive model, 2 implies a model with up to 2-way interactions, etc.
shrinkage is set to 0.01 and is the shrinkage parameter applied to each tree in the expansion. Also known as the learning rate or step-size reduction.
bag.fraction is set to 0.5 and is the fraction of the training set observations randomly selected to propose the next tree in the expansion
distribution is set to bernoulli (if single-class) or multinomial (if multi-class) and defines the distribution used for classification
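Two of the defaults listed above, mtry and gamma, depend on the number of columns in x. The following base-R sketch recomputes them by hand for a small data set, using only the formulas stated in this page. It is an illustration, not the package's implementation: the names nVars and sketchParams are hypothetical, and the assumption that the two subtracted columns are the leading identifier and training-label columns (SID and train in the example) is inferred from the example data rather than confirmed by the documentation.

```r
# Toy data set laid out like the example in this page:
# two leading columns (SID, train) followed by the predictor variables.
set.seed(42)
DF <- data.frame(SID   = 1:5,
                 train = sample(0:1, 5, replace = TRUE),
                 Var_1 = rnorm(5),
                 Var_2 = rnorm(5))

# Number of columns minus two, as used in the documented formulas
# (assumed here to be the count of predictor variables).
nVars <- ncol(DF) - 2

# Hypothetical re-computation of the data-dependent defaults;
# the fixed values (ntree, cost, probability) are copied from the list above.
sketchParams <- list(
  RF  = list(mtry = floor(sqrt(nVars)), ntree = 250),
  SVM = list(gamma = 1 / nVars, cost = 1, probability = TRUE)
)

print(sketchParams$RF$mtry)    # floor(sqrt(2)) = 1
print(sketchParams$SVM$gamma)  # 1 / 2 = 0.5
```

With four columns in DF, ncol(DF)-2 is 2, so mtry evaluates to 1 and gamma to 0.5. Adding more predictor columns increases mtry and decreases gamma.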
replaceDefaultClassificationParams
DF <- data.frame(SID=1:5, train=sample(0:1,5,replace=TRUE), Var_1=rnorm(5), Var_2=rnorm(5))
generateDefaultClassifierParams(DF)