tune: Tune parameters of a model-building method

View source: R/tune.R

Description

tune evaluates model performance across combinations of parameter values. The available methods are the same as those for eval_model().
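As a rough illustration of what "a combination of parameters" can mean, the sketch below expands vectors of candidate values into one row per combination using base R. Whether tune builds its search space this way is an assumption; the parameter values shown are hypothetical.

```r
# Hypothetical candidate values for two randomForest parameters.
param_values <- list(
  mtry = c(2, 4, 8),
  nodesize = c(1, 5)
)

# expand.grid() builds one row per combination of the supplied vectors.
param_grid <- expand.grid(param_values, stringsAsFactors = FALSE)

nrow(param_grid)  # 3 * 2 = 6 combinations
param_grid
```

The number of candidate models grows multiplicatively with each added vector, so short vectors keep tuning time manageable.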

Passing method = "mars" or method = "earth" tunes a MARS model using earth::earth().

Passing method = "glm" or method = "glmnet" tunes a GLM using glmnet::glmnet().

Passing method = "rf" tunes a random forest using randomForest::randomForest().

All SVM methods are tuned with e1071::svm() and assume that the SVM type is "eps-regression", so the response variable passed to the function must be numeric. The parameters available for tuning are listed in the documentation for that function (?e1071::svm). The methods "svm_linear", "svm_polynomial", "svm_radial", and "svm_sigmoid" are kept separate because each SVM kernel accepts a different combination of tunable parameters.

Usage

tune(method, ...)

# To call MARS (methods are identical)
tune(method = "earth", df, resp, nfold = 10, nrep = 1, ...)
tune(method = "mars", df, resp, nfold = 10, nrep = 1, ...)

# To call GLM (methods are identical)
tune(method = "glm", df, resp, nfold = 10, nrep = 1, ...)
tune(method = "glmnet", df, resp, nfold = 10, nrep = 1, ...)

tune(method = "rf", df, resp, nfold = 10, nrep = 1, ...)

tune(method = "svm_linear", df, resp, nfold = 10, nrep = 1, ...)

tune(method = "svm_polynomial", df, resp, nfold = 10, nrep = 1, ...)

tune(method = "svm_radial", df, resp, nfold = 10, nrep = 1, ...)

tune(method = "svm_sigmoid", df, resp, nfold = 10, nrep = 1, ...)

Arguments

method

The model-building method. Should be "rf" at this point.

...

Additional arguments to be passed to model-building. This will likely be vectors of the values of the parameters to test.

df

The data frame to train on.

resp

The name of the column containing the response variable.

nfold

The number of folds to use in evaluation. Default is 10.

nrep

The number of repetitions to use in evaluation. Default is 1.

ignore_col

Columns to ignore during model-building. Default is NA.
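To make the roles of nfold and nrep concrete, here is a minimal sketch of repeated k-fold cross-validation in base R. It is not the package's implementation: repeated_cv_rmse is a hypothetical helper, lm() stands in for the model-building function, and RMSE is an assumed performance measure.

```r
# Hypothetical helper: repeated k-fold cross-validated RMSE.
# lm() stands in for whichever model-building function is being tuned.
repeated_cv_rmse <- function(df, resp, nfold = 10, nrep = 1) {
  model_formula <- stats::reformulate(".", response = resp)
  rmse <- numeric(0)
  for (r in seq_len(nrep)) {
    # Randomly assign each row to one of nfold folds.
    fold_id <- sample(rep(seq_len(nfold), length.out = nrow(df)))
    for (k in seq_len(nfold)) {
      train <- df[fold_id != k, , drop = FALSE]
      test <- df[fold_id == k, , drop = FALSE]
      fit <- stats::lm(model_formula, data = train)
      pred <- stats::predict(fit, newdata = test)
      rmse <- c(rmse, sqrt(mean((test[[resp]] - pred)^2)))
    }
  }
  # Average over all nfold * nrep held-out folds.
  mean(rmse)
}

set.seed(1)
example_df <- mtcars[, c("mpg", "wt", "hp")]
cv_result <- repeated_cv_rmse(example_df, resp = "mpg", nfold = 5, nrep = 2)
cv_result
```

Each repetition reshuffles the fold assignment, so a larger nrep reduces the dependence of the estimate on any single random split at the cost of nfold additional model fits per repetition.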

Details

Calling print on a "tune" object provides details on the model type and the model performance.

Calling predict on a "tune" object runs prediction using the class of the model stored in the object.
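The print and predict behaviour described above relies on S3 method dispatch. The sketch below shows the general pattern with a hypothetical class "tune_demo"; the field names (model, params, rmse) are assumptions for illustration and may not match the structure of a real "tune" object.

```r
# Hypothetical constructor: wraps a fitted model in a list with S3 class
# "tune_demo" (a stand-in for the package's "tune" class).
new_tune_demo <- function(model, params, rmse) {
  structure(
    list(model = model, params = params, rmse = rmse),
    class = "tune_demo"
  )
}

# print() dispatches here for objects of class "tune_demo".
print.tune_demo <- function(x, ...) {
  cat("Model type:", class(x$model)[1], "\n")
  cat("RMSE:", format(x$rmse, digits = 4), "\n")
  invisible(x)
}

# predict() delegates to the predict method of the stored model's class.
predict.tune_demo <- function(object, newdata, ...) {
  stats::predict(object$model, newdata = newdata, ...)
}

fit <- stats::lm(mpg ~ wt, data = mtcars)
tuned <- new_tune_demo(
  fit,
  params = list(),
  rmse = sqrt(mean(stats::residuals(fit)^2))
)
print(tuned)
head(predict(tuned, newdata = mtcars))
```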

Many parameters can be tuned for "earth". The most useful are likely fast.k, fast.beta, newvar.penalty, penalty, minspan, and degree. If time allows, earth can perform more thorough variable selection using different pruning methods and cross-validation.

Many parameters can also be tuned for the GLM models. The most useful are likely alpha, nlambda, dfmax, pmax, and family.

Setting alpha = 1 applies the lasso penalty; setting alpha = 0 applies the ridge penalty. Values in between mix the two (elastic net).
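The effect of alpha can be seen directly in the elastic-net penalty documented for glmnet, lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta))). The sketch below computes that penalty in base R; elastic_net_penalty is a hypothetical helper written for illustration, and the coefficient values are made up.

```r
# Hypothetical helper: elastic-net penalty for a coefficient vector.
elastic_net_penalty <- function(beta, lambda, alpha) {
  ridge_part <- (1 - alpha) / 2 * sum(beta^2)
  lasso_part <- alpha * sum(abs(beta))
  lambda * (ridge_part + lasso_part)
}

beta <- c(1, -2, 0.5)
lasso_value <- elastic_net_penalty(beta, lambda = 1, alpha = 1)  # sum(abs(beta))
ridge_value <- elastic_net_penalty(beta, lambda = 1, alpha = 0)  # sum(beta^2) / 2
mixed_value <- elastic_net_penalty(beta, lambda = 1, alpha = 0.5)
c(lasso = lasso_value, ridge = ridge_value, mixed = mixed_value)
```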

Possible parameters to tune "rf" are mtry, replace, sampsize, nodesize, and maxnodes.

Possible parameters to tune "svm_linear" include cost, tolerance, and epsilon.

Possible parameters to tune "svm_polynomial" include degree, gamma, coef0, cost, tolerance, and epsilon.

Possible parameters to tune "svm_radial" include gamma, cost, tolerance, and epsilon.

Possible parameters to tune "svm_sigmoid" include gamma, coef0, cost, tolerance, and epsilon.
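The kernel-specific parameters listed above follow from the kernel formulas given in the e1071::svm documentation. The sketch below writes those formulas as plain R functions to show which kernel uses which parameter; the function names are hypothetical, and cost, tolerance, and epsilon are omitted because they apply to the SVM fit rather than to the kernel.

```r
# Kernel formulas as described in ?e1071::svm; u and v are numeric vectors
# of equal length.
kernel_linear <- function(u, v) sum(u * v)
kernel_polynomial <- function(u, v, gamma, coef0, degree) {
  (gamma * sum(u * v) + coef0)^degree
}
kernel_radial <- function(u, v, gamma) exp(-gamma * sum((u - v)^2))
kernel_sigmoid <- function(u, v, gamma, coef0) tanh(gamma * sum(u * v) + coef0)

# Kernel-specific arguments, read off the function signatures above.
kernel_params <- lapply(
  list(
    linear = kernel_linear, polynomial = kernel_polynomial,
    radial = kernel_radial, sigmoid = kernel_sigmoid
  ),
  function(f) setdiff(names(formals(f)), c("u", "v"))
)
kernel_params
```

The linear kernel has no kernel-specific parameters, which is why only cost, tolerance, and epsilon are listed for "svm_linear".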

Value

An object of S3 class "tune". It includes a list containing the model built with the best-performing parameters.

Examples

# Using "mars" or "earth" as the method
tune(
  method = "earth", df = your_data, resp = "y",
  nfold = 10, nrep = 10,
  fast.k = c(0, 5, 10, 20),
  fast.beta = c(0, 1),
  newvar.penalty = c(0, 0.01, 0.1, 0.2, 0.25),
  penalty = c(2, 3, 4),
  minspan = c(0, 1, 4, 10),
  degree = c(1, 2, 3)
)
# Using "glm" or "glmnet" as the method
tune(
  method = "glmnet", df = your_data, resp = "y",
  nfold = 10, nrep = 10,
  alpha = seq(0, 1, by = 0.2),
  nlambda = c(20, 50, 100, 200),
  # length() of a data frame is its number of columns
  dfmax = c(10, 50, length(your_data) - 1),
  pmax = c(10, 50, 100, length(your_data) - 1)
)
# Using tune and "rf" (randomForest) as the method
tune(
  method = "rf", df = your_data, resp = "y",
  nfold = 10, nrep = 10,
  mtry = c(2, 4, 8, 14),
  replace = c(TRUE, FALSE),
  sampsize = c(10, 20, 30)
)
# Using "svm_linear" as the method
tune(
  method = "svm_linear", df = your_data, resp = "y",
  nfold = 10, nrep = 10,
  cost = c(0, 0.1, 0.25, 0.5, 1),
  epsilon = c(0, 0.05, 0.1, 0.5, 1)
)

awqx/qsarr documentation built on Oct. 2, 2021, 7:05 a.m.