unbiasedRun: Perform unbiased runs with best-solution parameters.


View source: R/unbiasedRun.r

Description

Read the best solution of a parameter-tuning run from envT$bst and execute the function tdm$mainFunc (usually a classification or regression machine learning task) with these best parameters, to check whether the result quality is reproducible on independent test data or on independently trained models.

Usage

unbiasedRun(
  confFile,
  envT,
  dataObj = NULL,
  umode = "RSUB",
  withParams = FALSE,
  tdm = NULL
)

Arguments

confFile

the configuration name, e.g. "appAcid_02.conf"

envT

environment, from which we need the objects

bst

data frame containing best results (merged over repeats)

res

data frame containing all results

theTuner

["spot"] string

spotConfig

[NULL] a list with SPOT settings. If NULL, try to read spotConfig from confFile.

finals

[NULL] a one-row data frame to which new columns with final results are added

dataObj

[NULL] contains the pre-fetched data with training-set and test-set part. If NULL, it is set to tdmReadAndSplit(opts,tdm).
Passing dataObj==NULL is now deprecated.

umode

Deprecated as an argument to unbiasedRun: use the division provided in dataObj = tdmReadAndSplit(opts,tdm), which makes use of tdm$umode.
For downward compatibility only (if dataObj==NULL):
[ "RSUB" (default) | "CV" | "TST" | "SP_T" ], how to divide the data into training and test data for the unbiased runs:

"RSUB"

random subsampling into a fraction (1-tdm$TST.testFrac) of training data and a fraction tdm$TST.testFrac of test data

"CV"

cross validation (CV) with tdm$nrun folds

"TST"

all data in opts$filename (or dsetTrnVa(dataObj)) are used for training, all data in opts$filetest (or dsetTest(dataObj)) are used for testing

"SP_T"

'split_test': prior to tuning, the data set was split by random subsampling into a fraction tdm$TST.testFrac of test data and a fraction (1-tdm$TST.testFrac) of training-vali data, tagged via column "tdmSplit". Tuning was done on the training-vali data. Now column "tdmSplit" is used to select the test data for unbiased evaluation. Training during unbiased evaluation is done on a fraction tdm$TST.trnFrac of the training-vali data.

withParams

[FALSE] if TRUE, add columns with the best parameters to data frame finals (should be FALSE if different runs have different parameters)

tdm

a list with TDM settings from which we use here the elements

mainFunc

the function to be called for unbiased evaluations

mainFile

change to the directory of mainFile before starting mainFunc

nrun

[5] how often to call the unbiased evaluation

nfold

[10] how many folds in CV (only relevant for umode="CV")

TST.testFrac

[0.2] test set fraction (only relevant for umode="RSUB" or umode="SP_T")

The defaults in '[...]' are set by tdmDefaultsFill, if they are not defined on input.
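
The division modes and defaults described above can be illustrated with a minimal base-R sketch. It is not TDMR's implementation: the list settings, the toy data frame dset and the tag values "test" and "trainVali" are assumptions made for illustration, and only the element names nrun, nfold, TST.testFrac and the column name "tdmSplit" are taken from the argument descriptions.

```r
# Hypothetical settings list; only the element names mirror the tdm entries above.
defaults     <- list(nrun = 5, nfold = 10, TST.testFrac = 0.2)
userSettings <- list(nrun = 3)                             # user overrides one value
settings     <- utils::modifyList(defaults, userSettings)  # other entries keep defaults

set.seed(42)                                   # reproducible illustration
dset <- data.frame(x = rnorm(50), y = rep(c("a", "b"), 25))
n    <- nrow(dset)

# "RSUB": random subsampling into test rows and training rows
nTest     <- round(settings$TST.testFrac * n)
testIdx   <- sample(n, nTest)
rsubTest  <- dset[testIdx, ]
rsubTrain <- dset[-testIdx, ]

# "CV": assign every row to one of nfold folds of (nearly) equal size
foldId <- sample(rep(seq_len(settings$nfold), length.out = n))

# "SP_T": a tag column marks the rows set aside as test data before tuning.
# The tag values used here are hypothetical.
dset$tdmSplit <- ifelse(seq_len(n) %in% testIdx, "test", "trainVali")
sptTest      <- dset[dset$tdmSplit == "test", ]
sptTrainVali <- dset[dset$tdmSplit != "test", ]
```

With TST.testFrac = 0.2 and 50 rows, the sketch holds out 10 rows for testing in both the "RSUB" and the "SP_T" variant; the difference is that "SP_T" fixes the held-out rows once, before tuning, via the tag column.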

Value

envT the augmented environment envT, with the following items updated

finals

the final results

tdm

the updated list with TDM settings

results

last results (from last unbiased training)

Note

Side Effects: The list result, an object of class TDMclassifier or TDMregressor as returned from tdm$mainFunc, is written onto envT$result.
If envT$spotConfig is NULL, it is constructed from confFile.
spotConfig$opts (list with all parameter settings for the DM task) has to be non-NULL.

Author(s)

Wolfgang Konen, THK, 2013 - 2018

If envT$bst or envT$res is NULL, try to read it from file (the filename is inferred from envT$spotConfig; if this is NULL, it is constructed from confFile). We try to find the files for envT$bst or envT$res in dir envT$theTuner.
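
The "read from file only if NULL" fallback can be sketched in base R as follows. The helper loadIfNull, the environment envDemo and the temporary file are hypothetical and not part of TDMR; the sketch only shows the general pattern, not how unbiasedRun infers its filenames.

```r
# Hypothetical helper, not part of TDMR: fill a missing element of an
# environment from a file, and leave an already present element untouched.
loadIfNull <- function(env, name, file) {
  if (is.null(env[[name]])) {
    env[[name]] <- utils::read.table(file, header = TRUE)
  }
  invisible(env)
}

# A temporary file stands in for a stored best-results file.
bstFile <- tempfile(fileext = ".bst")
utils::write.table(data.frame(Y = 0.12, PARAM1 = 3), bstFile, row.names = FALSE)

envDemo <- new.env()
loadIfNull(envDemo, "bst", bstFile)   # envDemo$bst is missing, so it is read from file
loadIfNull(envDemo, "bst", bstFile)   # second call is a no-op, the element now exists
```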

See Also

tdmBigLoop, TDMclassifier, TDMregressor

Examples

   ## Load the best results obtained in a prior tuning for the configuration "sonar_04.conf"
   ## with tuning method "spot". The result envT from a prior run of tdmBigLoop with this .conf
   ## is read from demo02sonar/demoSonar.RData.
   ## Run task main_sonar again with these best parameters, using the default settings from
   ## tdmDefaultsFill: umode="RSUB", tdm$nrun=5 and tdm$TST.testFrac=0.2.
   library(TDMR)
   path <- file.path(find.package("TDMR"), "demo02sonar")
   envT <- tdmEnvTLoad("demoSonar.RData", path)    # loads envT
   source(file.path(path, "main_sonar.r"))         # source the task file for main_sonar
   envT$tdm$optsVerbosity <- 1
   envT$sCList[[1]]$opts$path <- path              # overwrite a possibly older stored path
   envT$spotConfig <- envT$sCList[[1]]
   dataObj <- tdmReadTaskData(envT, envT$tdm)      # pre-fetch the data for the unbiased runs
   envT <- unbiasedRun("sonar_04.conf", envT, dataObj, tdm = envT$tdm)
   print(envT$finals)                              # final results of the unbiased runs

TDMR documentation built on March 3, 2020, 1:06 a.m.