Run the smartML main function for automatic classifier algorithm selection, and hyper-parameter tuning.
maxTime
Float: maximum time budget, in minutes, for reading the dataset, preprocessing, calculating meta-features, algorithm selection, and hyper-parameter tuning (model interpretability is excluded). This applies only when option = 2.
directory
String: path of the training dataset (SmartML accepts arff files, or csv files with column headers).
testDirectory
String: path of the testing dataset (SmartML accepts arff files, or csv files with column headers).
classCol
String: name of the class label column in the dataset (default = 'class').
metric
String: metric to be used in evaluation:
vRatio
Float: ratio of the training set that is split off as a validation set for the evaluation process (default = 0.1 --> 10%).
preProcessF
Vector of strings containing the names of the preprocessing algorithms (default = c('standardize', 'zv') --> no preprocessing):
featuresToPreProcess
Vector of feature numbers to apply the feature preprocessing to. An empty vector means all features in the dataset file are included (default = c()). This vector should be a subset of
nComp
Integer: number of components, required if either the "pca" or "ica" feature preprocessor is used.
nModels
Integer: number of classifier algorithms to select based on Meta-Learning and then tune using Bayesian Optimization (default = 5).
option
Integer: 1 if only classifier algorithm selection is needed, or 2 if algorithm selection together with parameter tuning is required (default = 2).
featureTypes
Vector of either 'numerical' or 'categorical' values giving the types of the features in the dataset (default = c() --> any factor or character feature is treated as categorical, otherwise numerical).
interp
Boolean: whether model interpretability (Feature Importance and Interaction) is needed (default = FALSE). This option takes additional time if set to TRUE.
missingOpr
Boolean: FALSE uses median/mode imputation for instances with missing values; TRUE applies imputation using the "MICE" library, which imputes missing values with plausible data values drawn from a distribution specifically designed for each missing data point.
balance
Boolean: whether SMOTE class balancing is required (default = FALSE).
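The vRatio and missingOpr = FALSE descriptions above can be illustrated with a small base R sketch. This is not SmartML's internal code: the toy data frame, the helper name impute_simple, and the use of sample() for the split are assumptions made purely for illustration.

```r
# Illustrative only: a base R sketch, not SmartML's implementation.
set.seed(42)

# Hypothetical toy data with one missing numeric and one missing categorical value
toy <- data.frame(
  x1    = c(1, 2, NA, 4, 5, 6, 7, 8, 9, 10),
  x2    = factor(c("a", "b", "a", NA, "a", "b", "a", "a", "b", "a")),
  class = factor(rep(c("yes", "no"), 5))
)

# Median imputation for numeric columns, mode imputation for all other columns
# (impute_simple is a hypothetical helper name)
impute_simple <- function(df) {
  for (col in names(df)) {
    missing <- is.na(df[[col]])
    if (!any(missing)) next
    if (is.numeric(df[[col]])) {
      df[[col]][missing] <- median(df[[col]], na.rm = TRUE)
    } else {
      counts <- table(df[[col]])  # table() ignores NA by default
      df[[col]][missing] <- names(counts)[which.max(counts)]
    }
  }
  df
}

toy_imputed <- impute_simple(toy)

# Hold out a fraction vRatio of the training rows as a validation set
vRatio  <- 0.1
n_val   <- max(1, round(vRatio * nrow(toy_imputed)))
val_idx <- sample(nrow(toy_imputed), n_val)
validation <- toy_imputed[val_idx, ]
training   <- toy_imputed[-val_idx, ]
```

With ten rows and vRatio = 0.1, one row is held out for validation and nine remain for training.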
List of Results:
"option=1" - Chosen classifier algorithm names clfs with their parameter configurations params, Training DataFrame TRData, and Test DataFrame TEData.
"option=2" - Best classifier algorithm name found clfs with its parameter configuration params, Training DataFrame TRData, Test DataFrame TEData, model variable model, predicted values on the test set pred, performance on the testing set perf, and Feature Importance interpret$featImp / Interaction interpret$Interact plots in case interp = TRUE and the chosen model is not knn.
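As a rough sketch of how the returned list might be inspected, the snippet below builds a hand-made stand-in list. Only the field names (clfs, params, pred, perf, interpret) come from the description above; every value and the type of each field are made-up placeholders, not actual SmartML output.

```r
# Hypothetical stand-in for the list returned with option = 2.
# Field names follow the Value description; all values are placeholders.
mock_result <- list(
  clfs   = "placeholderClassifier",
  params = "placeholder parameter configuration",
  pred   = factor(c("yes", "no", "yes")),
  perf   = 0.9
)

# Fields are accessed by name; a field that is absent comes back as NULL
best_clf   <- mock_result$clfs
has_interp <- !is.null(mock_result$interpret)
```

Checking for NULL before using interpret is a cautious choice here, since the description says the interpretability plots are only present when interp = TRUE and the chosen model is not knn.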
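## Not run:
# 1-minute budget, algorithm selection plus tuning, with 'normalize' preprocessing
autoRLearn(1, 'sampleDatasets/car/train.arff',
           'sampleDatasets/car/test.arff', option = 2, preProcessF = 'normalize')
# 10-minute budget with default settings; keep the returned list of results
result <- autoRLearn(10, 'sampleDatasets/shuttle/train.arff', 'sampleDatasets/shuttle/test.arff')
## End(Not run)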