mpTune: Model and parameter simultaneous tuning

View source: R/mpTune.R

Description

Tune multiple models and their hyper-parameters simultaneously using grid search or randomized search, optionally in parallel.

Usage

mpTune(x, ...)

## S3 method for class 'formula'
mpTune(formula, data, weights = NULL, ...)

## Default S3 method:
mpTune(x, y, weights = NULL, models = list("rf", "gbm"),
  modelControl = list(), preProcess = NULL, gridLength = 5,
  randomizedLength = 20, mpTnControl = mpTuneControl(),
  loopingRule = foreachLoop, verbose = TRUE, ...)

Arguments

x

Design matrix, usually derived from model.matrix.

...

For mpTune.formula, these are arguments passed on to the default method mpTune.default. For mpTune.mpTune, these are 'modelControl', 'verbose', and 'test'.

formula

A model formula, similar to the formula argument in lm.

data

A data frame containing the training data.

weights

Sample weights.

y

Response vector: numeric (regression), factor (classification), or a 'Surv' object (survival analysis).

models

A vector of model names from those defined in models.RData.

modelControl

A list of named lists (names matching those in models) containing additional model parameters (these cannot be tuning parameters). If an argument needs to be evaluated inside the inner tuning loop, it should be quoted. For example, to use a balanced randomForest, one would specify sampsize = quote(rep(min(table(y)), 2)), where y is evaluated in the resampling loop instead of in the environment where mpTune is called. A minimal sketch follows the argument list below.

preProcess

A preProcess object from caret

gridLength

Grid length, if grid search is used for the hyper-parameters (tuning parameters).

randomizedLength

Number of hyper-parameter configurations to sample, if randomized search is available for a model.

mpTnControl

A list generated by function mpTuneControl

loopingRule

A function that actually performs the iteration, with a signature like function(executeFunction, loopList, ...), similar to base::Map. One can use it to pass a customized parallel method. Built-in choices are foreachLoop (the default), mclapplyLoop, mclapplyBatchLoop, parLapplyLoop, parLapplyLBLoop (load-balanced parallel::parLapply), parLapplyBLBatchLoop, and the non-parallel lapplyLoop. See ?loopingRule for details on these functions. A sketch of a custom loopingRule also follows the argument list below.

verbose

Whether fitting messages, if any, should be displayed.
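
Below is a minimal sketch of a modelControl entry combining a fixed argument with a quoted one; the model name 'rf' and the values are illustrative assumptions, not taken from this page:

## 'ntree' is evaluated immediately; 'sampsize' is quoted so that 'y' is
## resolved inside the resampling loop (yielding a balanced random forest)
exampleControl <- list(
    rf = list(
        ntree    = 100,
        sampsize = quote(rep(min(table(y)), 2))
        )
    )
## sketch of usage: mpTune(x, y, models = list('rf'), modelControl = exampleControl)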

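And a minimal sketch of a custom loopingRule, assuming from the signature above that it receives the function to execute plus a list of tasks; this sequential version mimics the built-in lapplyLoop (see ?loopingRule for the exact contract):

## a hypothetical sequential loopingRule, similar in spirit to base::Map:
## apply 'executeFunction' to every element of 'loopList'
sequentialLoop <- function(executeFunction, loopList, ...) {
    lapply(loopList, function(task) executeFunction(task, ...));
}
## sketch of usage: mpTune(x, y, models = list('rf'), loopingRule = sequentialLoop)
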
Value

A list with class 'mpTune' or 'mpTune.formula' (which inherits from 'mpTune'), containing the model fitting and tuning results.

Examples

## Not run: 

if (require(doMC) && detectCores() > 2) {
    registerDoMC(cores = detectCores());
    }

if (require(mlbench)) {
    data(Sonar, package = 'mlbench');
    # 60/40 train/test split (replace = FALSE so training rows are not duplicated)
    inTraining <- sample(1:nrow(Sonar), floor(nrow(Sonar)*0.6), replace = FALSE);
    training   <- Sonar[inTraining, ];
    testing    <- Sonar[-inTraining, ];

    sonarTuned <- mpTune(
        formula = Class ~ .,
        data = training,
        models = list(balancedRF = 'rf', rf = 'rf', 'gbm'),
        mpTnControl = mpTuneControl(
            samplingFunction = createCVFolds, nfold = 3, repeats = 1,
            stratify = TRUE, classProbs = TRUE,
            summaryFunction = requireSummary(metric = c('AUC', 'BAC', 'Kappa'))),
        gridLength = 3,
        randomizedLength = 3,
        modelControl = list(
            gbm = list(verbose = FALSE),
            balancedRF = list(ntree = 100, sampsize = quote(rep(min(table(y)), 2)))
            )
        );

    print(sonarTuned);
    print(summary(sonarTuned));

    # tune one more model
    sonarTuned <- more(sonarTuned, models = 'glmnet');

    # Now sonarTuned contains tuning information of four models: balancedRF, rf, gbm and glmnet
    # fit the model giving the best 'AUC'
    bestModel <- fit(sonarTuned, metric = 'AUC')
    print(bestModel);

    # predict on hold out sample
    # sonarTestPred <- predict(bestModel, newdata = testing);

    # perform a cross validation for a fair performance estimate, considering multiple model tunings and selections
    sonarTunedPerf <- resample(sonarTuned, nfold = 3, repeats = 1, stratify = TRUE);
    print(sonarTunedPerf);
    }

##
## Survival analysis
##

# check which models are available for right-censored survival data
print(getDefaultModel(type = 'survival'))

if (require(randomForestSRC)) {
    data(pbc, package = 'randomForestSRC');
    pbc <- na.omit(pbc);
    pbc <- pbc[sample(nrow(pbc), 100), ];

    survTune <- mpTune(
        Surv(days, status) ~ .,
        data = pbc,
        models = list(
            Cox = 'coxph',
            elasticnet = 'glmnet',
            gbm = 'gbm',
            survivalForest = 'rfsrc',
            boostedSCI = 'glmboost'
            ),
        mpTnControl = mpTuneControl(
            samplingFunction = createCVFolds, nfold = 3, repeats = 1,
            stratify = TRUE, summaryFunction = survivalSummary),
        modelControl = list(
            boostedSCI = list(family = SCI()),
            gbm = list(verbose = FALSE)
            ),
        gridLength = 2,
        randomizedLength = 3
        );

    print(survTune);
    summary(survTune, metric = 'C-index');
    }


## End(Not run)
