This function obtains bootstrap estimates of performance metrics for a given predictive task and a method to solve it (i.e. a workflow). The function is general in the sense that the workflow function the user provides as the solution to the task can implement or call whatever modeling technique the user wants.
The function implements both e0 bootstrap and .632 bootstrap estimates. The type of bootstrap is selected through the estTask argument (check the help page of Bootstrap).
Please note that most of the time you will not call this function directly, though there is nothing wrong in doing so. Instead, you will use the function performanceEstimation, which allows you to carry out performance estimation for multiple workflows on multiple tasks, using some estimation method such as bootstrap. Still, when you simply want the bootstrap estimate for one workflow on one task, you may use this function directly.
bootEstimates(wf, task, estTask, cluster)
wf |
an object of the class |
task |
an object of the class |
estTask |
an object of the class |
cluster |
an optional parameter that can either be |
The idea of this function is to carry out a bootstrap experiment with the goal of obtaining reliable estimates of the predictive performance of a certain modeling approach (denoted here as a workflow) on a given predictive task. Two types of bootstrap estimates are implemented: e0 bootstrap and .632 bootstrap. Bootstrap estimates are obtained by averaging over a set of k scores, each obtained in the following way: (1) draw a random sample with replacement with the same size as the original data set; (2) obtain a model with this sample; (3) test the model on the observations of the original data set that were not included in the sample drawn in step (1), and record the scores for this run. This process is repeated k times and the average scores are the bootstrap estimates. The main difference between e0 and .632 bootstrap is that the latter combines the e0 estimate with the resubstitution estimate, i.e. the estimate obtained when the model is learned and tested on the full available data sample.
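The procedure described above can be sketched in a few lines of base R. This is only an illustration, not the package's own code: a linear model (lm) stands in for the workflow, the helper mse() and all variable names are hypothetical, and the 0.368/0.632 weighting of the resubstitution and e0 estimates is the standard textbook formulation, assumed here rather than taken from the package source.

```r
## Illustrative sketch only: lm() stands in for an arbitrary workflow
data(swiss)
set.seed(1234)

k <- 50                      # number of bootstrap repetitions
n <- nrow(swiss)
mse <- function(truth, pred) mean((truth - pred)^2)   # hypothetical helper

## Resubstitution estimate: learn and test on the full data set
full_fit <- lm(Infant.Mortality ~ ., data = swiss)
resub <- mse(swiss$Infant.Mortality, predict(full_fit, swiss))

e0_scores <- vapply(seq_len(k), function(i) {
  idx <- sample(n, n, replace = TRUE)                   # (1) sample with replacement
  fit <- lm(Infant.Mortality ~ ., data = swiss[idx, ])  # (2) learn on the sample
  oob <- swiss[setdiff(seq_len(n), idx), ]              # (3) cases left out of the sample
  mse(oob$Infant.Mortality, predict(fit, oob))
}, numeric(1))

e0   <- mean(e0_scores)               # e0 bootstrap estimate
b632 <- 0.368 * resub + 0.632 * e0    # .632 estimate (assumed standard weights)

c(resubstitution = resub, e0 = e0, b632 = b632)
```

Because the .632 estimate is a weighted average, it always lies between the (typically optimistic) resubstitution estimate and the (typically pessimistic) e0 estimate.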
Parallel execution of the estimation experiment is only recommended for reasonably large data sets; otherwise, the communication costs between the processes may actually increase the computation time.
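To get a feel for this trade-off, one can time the same set of bootstrap repetitions serially and on a small cluster. The sketch below uses only the base parallel package and does not call bootEstimates; the helper one_rep(), the two-worker cluster and the use of lm as the model are assumptions made for illustration. On a data set as small as swiss, the parallel run is not expected to be faster.

```r
## Illustrative sketch: serial vs. parallel timing of bootstrap repetitions
library(parallel)
data(swiss)
set.seed(1234)

n <- nrow(swiss)
## Draw the bootstrap samples up front so both runs use the same indices
idx_list <- replicate(20, sample(n, n, replace = TRUE), simplify = FALSE)

## Hypothetical helper: one bootstrap repetition returning the out-of-sample MSE
one_rep <- function(idx, d) {
  fit <- lm(Infant.Mortality ~ ., data = d[idx, ])
  oob <- d[setdiff(seq_len(nrow(d)), idx), ]
  mean((oob$Infant.Mortality - predict(fit, oob))^2)
}

t_serial <- system.time(s <- sapply(idx_list, one_rep, d = swiss))

cl <- makeCluster(2)         # two local worker processes (assumption)
t_par <- system.time(p <- parSapply(cl, idx_list, one_rep, d = swiss))
stopCluster(cl)

all.equal(s, p)              # same scores either way
rbind(serial = t_serial, parallel = t_par)
```

The timings depend on the machine, so no figures are shown here; the point is that the data shipped to each worker has a fixed cost that only pays off when each repetition is expensive.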
The result of the function is an object of class EstimationResults.
Luis Torgo ltorgo@dcc.fc.up.pt
Torgo, L. (2014) An Infra-Structure for Performance Estimation and Experimental Comparison of Predictive Models in R. arXiv:1412.0436 [cs.MS] http://arxiv.org/abs/1412.0436
Bootstrap, Workflow, standardWF, PredTask, EstimationTask, performanceEstimation, hldEstimates, loocvEstimates, cvEstimates, mcEstimates, EstimationResults
## Not run:
## Estimating the MSE of a SVM variant on the
## swiss data, using 50 repetitions of .632 bootstrap
library(performanceEstimation)
library(e1071)
data(swiss)

## running the estimation experiment
res <- bootEstimates(
  Workflow(wfID="svmC10G01",
           learner="svm", learner.pars=list(cost=10, gamma=0.1)
           ),
  PredTask(Infant.Mortality ~ ., swiss),
  EstimationTask("mse", method=Bootstrap(type=".632", nReps=50))
  )

## Check a summary of the results
summary(res)

## End(Not run)
Task for estimating mse using
50 repetitions of .632 Bootstrap experiment
Run with seed = 1234
Iteration : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
*** Summary of a Bootstrap Estimation Experiment ***
Task for estimating mse using
50 repetitions of .632 Bootstrap experiment
Run with seed = 1234
* Predictive Task ID :: swiss.Infant.Mortality
Task Type :: regression
Target Feature :: Infant.Mortality
Formula :: Infant.Mortality ~ .
Task Data Source :: swiss
* Workflow ID :: svmC10G01
Workflow Function :: standardWF
Parameter values:
learner -> svm
learner.pars -> cost=10 gamma=0.1
* Summary of Score Estimation Results:
              mse
avg      7.309047
std      2.228300
med      7.267532
iqr      3.471304
min      3.461347
max     13.184945
invalid  0.000000