Description

Tune multiple models and their hyper-parameters using grid search or randomized search, optionally in parallel.
Usage

mpTune(x, ...)

## S3 method for class 'formula'
mpTune(formula, data, weights = NULL, ...)

## Default S3 method:
mpTune(x, y, weights = NULL, models = list("rf", "gbm"),
    modelControl = list(), preProcess = NULL, gridLength = 5,
    randomizedLength = 20, mpTnControl = mpTuneControl(),
    loopingRule = foreachLoop, verbose = TRUE, ...)
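In its simplest form, the default method is called directly with a design matrix and a response. A minimal sketch using the built-in iris data ('rf' is one of the default models above; the data preparation is illustrative only):

# build a design matrix without the intercept column
x <- model.matrix(Species ~ ., data = iris)[, -1];
y <- iris$Species;   # factor response, so this is a classification task
tuned <- mpTune(x, y, models = list('rf'), gridLength = 2);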
Arguments

x: Design matrix, usually derived from model.matrix.

...: For mpTune.formula, these are arguments passed on to the default method. For mpTune.mpTune, they are 'modelControl', 'verbose' and 'test'.

formula: A model formula, similar to the formula argument of other R modelling functions.

data: A data frame containing the training data.

weights: Sample weights.

y: Response vector: numeric (regression), factor (classification) or a 'Surv' object (survival).

models: A list of model names, optionally named so that the same model can appear under different configurations (see getDefaultModel for available models, and the examples below).

modelControl: A named list (names matching those in models), each element a list of further fixed arguments passed to the corresponding model's fitting function.

preProcess: A preProcess object, e.g. one created by caret::preProcess.

gridLength: Grid length per hyper-parameter (tuning parameter) when grid search is used.

randomizedLength: Number of hyper-parameter configurations to sample when randomized search is available for a model.

mpTnControl: A list generated by the mpTuneControl function.

loopingRule: A function that actually does the iteration, with signature function(executeFunction, loopList, ...), similar to base::Map. It can be used to plug in a customized parallel method; see the sketch after this list. Built-in choices are foreachLoop (the default), mclapplyLoop, mclapplyBatchLoop, parLapplyLoop and parLapplyLBLoop (load-balanced parallel::parLapply), parLapplyBLBatchLoop, and the non-parallel lapplyLoop. See ?loopingRule for details on these functions.

verbose: Whether fitting messages, if any, should be displayed.
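For illustration, a customized loopingRule might look like the following sketch. It assumes, as the built-in lapplyLoop suggests, that executeFunction is applied to each element of loopList and a list is returned; the name snowLoop and the cluster size are illustrative, not part of the package:

# a hypothetical loopingRule backed by a PSOCK cluster
snowLoop <- function(executeFunction, loopList, ...) {
    cl <- parallel::makeCluster(2);       # small local cluster; size is arbitrary here
    on.exit(parallel::stopCluster(cl));   # always release the workers
    # apply executeFunction to every element of loopList in parallel
    parallel::parLapply(cl, loopList, executeFunction, ...);
}
# which could then be supplied as mpTune(..., loopingRule = snowLoop)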
Value

A list with class 'mpTune' or 'mpTune.formula' (which inherits from 'mpTune'), containing the following entries:

allModelsPerformance: a list of lists (length = number of models) of all tried model-parameter combinations, not ranked

allCVs: a list of lists of all tried model-parameter-fold combinations, not ranked

sampleIndex: a list of cross-validation folds. Each fold is a validation or test set, and its complementary set is the training set

data: list(x, y, weights) or list(formula, data, weights), included when mpTnControl$returnData is TRUE (the default)

performanceMetric: the performance metric used, as specified via the summaryFunction in mpTnControl

config: a list of

    sampleIndex: the list of resamples used for tuning
    models: the list of models as specified
    modelControl: the list of further arguments as specified
    mpTnControl: the 'mpTuneControl' object as specified
    preProcess: the 'preProcess' object as specified
    gridLength: a vector of grid lengths as specified
    randomizedLength: a vector of randomized search lengths as specified
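As a rough sketch, these entries can be inspected directly; here 'tuned' stands for a hypothetical mpTune fit such as the sonarTuned object in the examples below:

# entry names follow the Value section above
names(tuned$allModelsPerformance);   # one element per tuned model
length(tuned$sampleIndex);           # number of cross-validation folds
tuned$config$models;                 # the models as originally specified
tuned$config$gridLength;             # grid lengths actually used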
Examples

## Not run:
if (require(doMC) && detectCores() > 2) {
registerDoMC(cores = detectCores());
}
if (require(mlbench)) {
data(Sonar, package = 'mlbench');
# sample without replacement so training and testing sets are disjoint
inTraining <- sample(1:nrow(Sonar), floor(nrow(Sonar)*0.6), replace = FALSE);
training <- Sonar[inTraining, ];
testing <- Sonar[-inTraining, ];
sonarTuned <- mpTune(
formula = Class ~ .,
data = training,
models = list(balancedRF = 'rf', rf = 'rf', 'gbm'),
mpTnControl = mpTuneControl(
samplingFunction = createCVFolds, nfold = 3, repeats = 1,
stratify = TRUE, classProbs = TRUE,
summaryFunction = requireSummary(metric = c('AUC', 'BAC', 'Kappa'))),
gridLength = 3,
randomizedLength = 3,
modelControl = list(
gbm = list(verbose = FALSE),
balancedRF = list(ntree = 100, sampsize = quote(rep(min(table(y)), 2)))
)
);
print(sonarTuned);
print(summary(sonarTuned));
# tune one more model
sonarTuned <- more(sonarTuned, models = 'glmnet');
# Now sonarTuned contains tuning information of four models: balancedRF, rf, gbm and glmnet
# fit the model giving the best 'AUC'
bestModel <- fit(sonarTuned, metric = 'AUC')
print(bestModel);
# predict on hold out sample
# sonarTestPred <- predict(bestModel, newdata = testing);
# perform a cross-validation for a fair performance estimate, considering multiple model tunings and selections
sonarTunedPerf <- resample(sonarTuned, nfold = 3, repeats = 1, stratify = TRUE);
print(sonarTunedPerf);
}
##
## Survival analysis
##
# check what models are available for right-censored survival data
print(getDefaultModel(type = 'survival'))
if (require(randomForestSRC)) {
data(pbc, package = 'randomForestSRC');
pbc <- na.omit(pbc);
pbc <- pbc[sample(nrow(pbc), 100), ];
survTune <- mpTune(
Surv(days, status) ~ .,
data = pbc,
models = list(
Cox = 'coxph',
elasticnet = 'glmnet',
gbm = 'gbm',
survivalForest = 'rfsrc',
boostedSCI = 'glmboost'
),
mpTnControl = mpTuneControl(
samplingFunction = createCVFolds, nfold = 3, repeats = 1,
stratify = TRUE, summaryFunction = survivalSummary),
modelControl = list(
boostedSCI = list(family = SCI()),
gbm = list(verbose = FALSE)
),
gridLength = 2,
randomizedLength = 3
);
print(survTune);
summary(survTune, metric = 'C-index');
}
## End(Not run)