Description:

An easy-to-use function that returns a list of search (hyper)parameters for a particular model (classification or regression) or for a list of multiple models (automl or ensembles). The result is meant to be used in a search argument of the fit or mining functions, e.g.:
search=list(search=mparheuristic(...),...)
Arguments:

  model        model type name. See fit for the valid options.
  n            number of searches or heuristic (either an integer or a
               character heuristic option).
  lower        lower bound for the (hyper)parameter (if NULL, a default
               value is assumed).
  upper        upper bound for the (hyper)parameter (if NULL, a default
               value is assumed).
  by           increment in the sequence (if NULL, a default value is
               assumed or it is computed from n).
  exponential  if an exponential scale should be used in the search
               sequence (the default is NULL, meaning a linear scale).
  task         optional task argument, only used for uniform design (UD)
               searches.
  kernel       optional kernel type, only used when model="ksvm".
  inputs       optional inputs argument: the number of inputs, only used
               by some searches (e.g., the "automl" multiple model modes,
               to limit searches such as the randomForest mtry).
Details:

This function facilitates the definition of the search argument used by the fit or mining functions. Using simple heuristics, reasonable (hyper)parameter search values are suggested for several rminer models. For models not mapped in this function, the function returns NULL, which means that no hyperparameter search is executed (often, this implies using rminer or R function default values).

The simple usage of a heuristic assumes lower and upper bounds for a (hyper)parameter. If n=1, then rminer or R defaults are assumed. Otherwise, a search is created using seq(lower,upper,by), where by was set by the user or computed from n. For some model="ksvm" setups, 2^seq(...) is used for sigma and C, and (1/10)^seq(...) is used for scale. Please check the resulting object to inspect the obtained final search values.

This function also allows the easy setting of multiple model searches, under the "automl", "automl2", "automl3" or vector character options (see the examples below).
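The seq()-based expansion described above can be sketched directly in base R (the lower/upper/n values below are illustrative, not taken from any particular rminer model heuristic):

```r
# illustrative values (not tied to a specific rminer model heuristic):
lower <- 1; upper <- 9; n <- 5
by <- (upper - lower)/(n - 1)  # step computed from n when "by" is not set
seq(lower, upper, by)          # linear search: 1 3 5 7 9
2^seq(lower, upper, by)        # exponential (base 2) scale: 2 8 32 128 512
```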
Value:

A list with one or more (hyper)parameter values to be searched.
Note:

See also http://hdl.handle.net/1822/36210 and http://www3.dsi.uminho.pt/pcortez/rminer.html
Author(s):

Paulo Cortez http://www3.dsi.uminho.pt/pcortez/
References:

For more details about rminer and for citation purposes:
P. Cortez.
Data Mining with Neural Networks and Support Vector Machines Using the R/rminer Tool.
In P. Perner (Ed.), Advances in Data Mining - Applications and Theoretical Aspects 10th Industrial Conference on Data Mining (ICDM 2010), Lecture Notes in Artificial Intelligence 6171, pp. 572-583, Berlin, Germany, July, 2010. Springer. ISBN: 978-3-642-14399-1.
@Springer: https://link.springer.com/chapter/10.1007/978-3-642-14400-4_44
http://www3.dsi.uminho.pt/pcortez/2010-rminer.pdf
The automl mode is inspired by this work:
L. Ferreira, A. Pilastri, C. Martins, P. Santos, P. Cortez.
An Automated and Distributed Machine Learning Framework for Telecommunications Risk Management.
In J. van den Herik et al. (Eds.),
Proceedings of 12th International Conference on Agents and Artificial Intelligence – ICAART 2020, Volume 2, pp. 99-107,
Valletta, Malta, February, 2020, SCITEPRESS, ISBN 978-989-758-395-7.
@INSTICC: https://www.insticc.org/Primoris/Resources/PaperPdf.ashx?idPaper=89528
This tutorial shows additional code examples:
P. Cortez.
A tutorial on using the rminer R package for data mining tasks.
Teaching Report, Department of Information Systems, ALGORITMI Research Centre, Engineering School, University of Minho, Guimaraes,
Portugal, July 2015.
http://hdl.handle.net/1822/36210
Some lower/upper bounds and heuristics were retrieved from:
M. Fernandez-Delgado, E. Cernadas, S. Barro and D. Amorim.
Do we need hundreds of classifiers to solve real world classification problems?
In The Journal of Machine Learning Research, 15(1), 3133-3181, 2014.
Examples:

## "kknn"
s=mparheuristic("kknn",n="heuristic")
print(s)
s=mparheuristic("kknn",n=1) # same thing
print(s)
s=mparheuristic("kknn",n="heuristic5")
print(s)
s=mparheuristic("kknn",n=5) # same thing
print(s)
s=mparheuristic("kknn",lower=5,upper=15,by=2)
print(s)
# exponential scale:
s=mparheuristic("kknn",lower=1,upper=5,by=1,exponential=2)
print(s)
## "mlpe"
s=mparheuristic("mlpe")
print(s) # "NA" means set size with min(inputs/2,10) in fit
s=mparheuristic("mlpe",n="heuristic10")
print(s)
s=mparheuristic("mlpe",n=10) # same thing
print(s)
s=mparheuristic("mlpe",n=10,lower=2,upper=20)
print(s)
## "randomForest", upper should be set to the number of inputs = max mtry
s=mparheuristic("randomForest",n=10,upper=6)
print(s)
## "ksvm"
s=mparheuristic("ksvm",n=10)
print(s)
s=mparheuristic("ksvm",n=10,kernel="vanilladot")
print(s)
s=mparheuristic("ksvm",n=10,kernel="polydot")
print(s)
## lssvm
s=mparheuristic("lssvm",n=10)
print(s)
## rvm
s=mparheuristic("rvm",n=5)
print(s)
s=mparheuristic("rvm",n=5,kernel="vanilladot")
print(s)
## "rpart" and "ctree" are special cases (see help(fit,package=rminer) examples):
s=mparheuristic("rpart",n=3) # 3 cp values
print(s)
s=mparheuristic("ctree",n=3) # 3 mincriterion values
print(s)
### examples with fit
## Not run:
### classification
data(iris)
# ksvm and rbfdot:
model="ksvm";kernel="rbfdot"
s=mparheuristic(model,n="heuristic5",kernel=kernel)
print(s) # 5 sigma values
search=list(search=s,method=c("holdout",2/3,123))
# task "prob" is assumed, optimization of "AUC":
M=fit(Species~.,data=iris,model=model,search=search,fdebug=TRUE)
print(M@mpar)
# different lower and upper range:
s=mparheuristic(model,n=5,kernel=kernel,lower=-5,upper=1)
print(s) # from 2^-5 to 2^1
search=list(search=s,method=c("holdout",2/3,123))
# task "prob" is assumed, optimization of "AUC":
M=fit(Species~.,data=iris,model=model,search=search,fdebug=TRUE)
print(M@mpar)
# different exponential scale:
s=mparheuristic(model,n=5,kernel=kernel,lower=-4,upper=0,exponential=10)
print(s) # from 10^-4 to 10^0
search=list(search=s,method=c("holdout",2/3,123))
# task "prob" is assumed, optimization of "AUC":
M=fit(Species~.,data=iris,model=model,search=search,fdebug=TRUE)
print(M@mpar)
# "lssvm" Gaussian model, pure classification and ACC optimization, full iris:
model="lssvm";kernel="rbfdot"
s=mparheuristic("lssvm",n=3,kernel=kernel)
print(s)
search=list(search=s,method=c("holdout",2/3,123))
M=fit(Species~.,data=iris,model=model,search=search,fdebug=TRUE)
print(M@mpar)
# test several heuristic5 searches, full iris:
n="heuristic5";inputs=ncol(iris)-1
model=c("ctree","rpart","kknn","ksvm","lssvm","mlpe","randomForest")
for(i in 1:length(model))
{
cat("--- i:",i,"model:",model[i],"\n")
if(model[i]=="randomForest") s=mparheuristic(model[i],n=n,upper=inputs)
else s=mparheuristic(model[i],n=n)
print(s)
search=list(search=s,method=c("holdout",2/3,123))
M=fit(Species~.,data=iris,model=model[i],search=search,fdebug=TRUE)
print(M@mpar)
}
# test several Delgado 2014 searches (some cases launch warnings):
model=c("mlp","mlpe","mlp","ksvm","ksvm","ksvm",
"ksvm","lssvm","rpart","rpart","ctree",
"ctree","randomForest","kknn","kknn","multinom")
n=c("mlp_t","avNNet_t","nnet_t","svm_C","svmRadial_t","svmLinear_t",
"svmPoly_t","lsvmRadial_t","rpart_t","rpart2_t","ctree_t",
"ctree2_t","rf_t","knn_R","knn_t","multinom_t")
inputs=ncol(iris)-1
for(i in 1:length(model))
{
cat("--- i:",i,"model:",model[i],"heuristic:",n[i],"\n")
if(model[i]=="randomForest") s=mparheuristic(model[i],n=n[i],upper=inputs)
else s=mparheuristic(model[i],n=n[i])
print(s)
search=list(search=s,method=c("holdout",2/3,123))
M=fit(Species~.,data=iris,model=model[i],search=search,fdebug=TRUE)
print(M@mpar)
}
## End(Not run) #dontrun
### regression
## Not run:
data(sa_ssin)
s=mparheuristic("ksvm",n=3,kernel="polydot")
print(s)
search=list(search=s,metric="MAE",method=c("holdout",2/3,123))
M=fit(y~.,data=sa_ssin,model="ksvm",search=search,fdebug=TRUE)
print(M@mpar)
# regression task, predict iris "Petal.Width":
data(iris)
ir2=iris[,1:4]
names(ir2)[ncol(ir2)]="y" # change output name
n=3;inputs=ncol(ir2)-1 # 3 hyperparameter searches
model=c("ctree","rpart","kknn","ksvm","mlpe","randomForest","rvm")
for(i in 1:length(model))
{
cat("--- i:",i,"model:",model[i],"\n")
if(model[i]=="randomForest") s=mparheuristic(model[i],n=n,upper=inputs)
else s=mparheuristic(model[i],n=n)
print(s)
search=list(search=s,method=c("holdout",2/3,123))
M=fit(y~.,data=ir2,model=model[i],search=search,fdebug=TRUE)
print(M@mpar)
}
## End(Not run) #dontrun
### multiple model examples:
## Not run:
data(iris)
inputs=ncol(iris)-1; task="prob"
# 5 machine learning (ML) algorithms, 1 heuristic hyperparameter per algorithm:
sm=mparheuristic(model="automl",task=task,inputs=inputs)
print(sm)
# 5 ML with 10/13 hyperparameter searches:
sm=mparheuristic(model="automl2",task=task,inputs=inputs)
# note: mtry only has 4 searches due to the inputs limit:
print(sm)
# regression example:
ir2=iris[,1:4]
inputs=ncol(ir2)-1; task="reg"
sm=mparheuristic(model="automl2",task=task,inputs=inputs)
# note: ksvm contains 3 UD hyperparameters (and not 2) since task="reg":
print(sm)
# 5 ML and stacking:
inputs=ncol(iris)-1; task="prob"
sm=mparheuristic(model="automl3",task=task,inputs=inputs)
# note: $ls only has 5 elements, one for each individual ML
print(sm)
# other manual design examples: --------------------------------------
# 5 ML and three ensembles:
# the fit or mining functions will search for the best option
# between any of the 5 ML algorithms and any of the three
# ensemble approaches:
sm2=mparheuristic(model="automl3",task=task,inputs=inputs)
# note: ensembles need to be at the end of the $models field:
sm2$models=c(sm2$models,"AE","WE") # add AE and WE
sm2$smethod=c(sm2$smethod,rep("grid",2)) # add grid to AE and WE
# note: $ls only has 5 elements, one for each individual ML
print(sm2)
# 3 ML example:
models=c("cv.glmnet","mlpe","ksvm") # just 3 models
# note: in rminer the default cv.glmnet does not have "hyperparameters"
# since the cv automatically sets lambda
n=c(NA,10,"UD") # 10 searches for mlpe and 13 for ksvm
sm3=mparheuristic(model=models,n=n)
# note: $ls only has 3 elements, one for each individual ML
print(sm3)
# usage in sm2 and sm3 for fit (see mining help for usages in mining):
method=c("holdout",2/3,123)
d=iris
names(d)[ncol(d)]="y" # change output name
s2=list(search=sm2,smethod="auto",method=method,metric="AUC",convex=0)
M2=fit(y~.,data=d,model="auto",search=s2,fdebug=TRUE)
s3=list(search=sm3,smethod="auto",method=method,metric="AUC",convex=0)
M3=fit(y~.,data=d,model="auto",search=s3,fdebug=TRUE)
# -------------------------------------------------------------------
## End(Not run)