tdmClassify is called by tdmClassifyLoop and returns an object of class tdmClass.
It trains a model on training set d_train and evaluates it on test set d_test.
If this function is used for tuning, the test set d_test plays the role of a validation set.
tdmClassify(
  d_train,
  d_test,
  d_dis,
  d_preproc,
  response.variables,
  input.variables,
  opts,
  tsetStr = c("Validation", "validation")
)
d_train | training set
d_test | validation set, same columns as training set
d_dis | 'disregard set', i.e. everything that is neither train nor test. The model is applied to all records in d_dis (needed for active learning, see ssl_methods.r)
d_preproc | data used for preprocessing. May be NULL if no preprocessing is done (opts$PRE.SFA=="none" and opts$PRE.PCA=="none"). If preprocessing is done, then d_preproc is usually all non-validation data.
response.variables | name of the column that carries the target variable, or a vector of names specifying multiple target columns (these columns are not used during prediction, only for evaluation)
input.variables | vector with names of input columns
opts | additional parameters [defaults in brackets]
tsetStr | [c("Validation", "validation")]
Currently d_dis is allowed to be a 0-row data frame, but d_train and d_test must have at least one record.
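A minimal base-R sketch of how an empty disregard set can be built and how the row-count constraint above can be checked before calling tdmClassify. The helper check_sets is hypothetical and not part of TDMR, and requiring d_dis to have the same columns as d_train is an assumption made for this illustration.

```r
# Base R only; no TDMR function is called here.
data(iris)

# Zero-row data frame that keeps all columns, types and factor levels of iris
d_dis <- iris[0, ]

# check_sets() is a hypothetical helper, not part of TDMR
check_sets <- function(d_train, d_test, d_dis) {
  stopifnot(nrow(d_train) >= 1, nrow(d_test) >= 1)     # both must be non-empty
  stopifnot(identical(names(d_test), names(d_train)))  # same columns as training set
  # assumption for this sketch: d_dis shares the column layout, but may have 0 rows
  stopifnot(identical(names(d_dis), names(d_train)))
  invisible(TRUE)
}

idx_train <- sample(nrow(iris))[1:110]
check_sets(iris[idx_train, ], iris[-idx_train, ], d_dis)
```

Indexing with iris[0, ] is equivalent to the iris[numeric(0), ] form and leaves the factor levels of Species intact.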
res, an object of class tdmClass. This is a list containing
d_train | training set + predicted class column(s)
d_test | test set + predicted class column(s)
d_dis | disregard set + predicted class column(s)
avgEVAL | list with evaluation measures, averaged over all response variables
allEVAL | data frame with evaluation measures, one row for each response variable
lastCmTrain | a list with evaluation info for training set (confusion matrix, gain, class errors, ...)
lastCmVali | a list with evaluation info for validation set (confusion matrix, gain, class errors, ...)
lastModel | the last model built (i.e. for the last response variable)
lastProbs | a list with three probability matrices (row: records, col: classes) v_train, v_test, v_dis, if the model provides probabilities; NULL else.
lastPred | name of the column where the prediction of the last model is appended to the datasets d_train, d_test and d_dis
 | a list with two data frames Trn and Val. They contain at least a column IND.dset (index of each train / validation record into data frame dset). If the model has probabilities, then they contain in addition a column for each response variable with the prediction probabilities.
opts | parameter list from input, some default values might have been added
The 9 evaluation measures in avgEVAL and allEVAL are
cerr.* (misclassification error),
gain.* (total gain) and
rgain.* (relative gain, i.e. total gain divided by the maximum achievable gain in *),
where * = [ trn | tst | tst2 ] stands for [ training set | test set | test set with special treatment ]
and the special treatment is either opts$test2.string = "no postproc" or = "default cutoff".
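The three kinds of measure can be illustrated on a small hand-made confusion matrix. This is a sketch in base R, not TDMR's internal code: the toy labels are made up, and the identity gain matrix (one unit of gain per correctly classified record) is an assumption chosen for the illustration.

```r
# Illustrative only: toy labels, not TDMR output
lev       <- c("a", "b", "c")
actual    <- factor(c("a", "a", "b", "b", "b", "c"), levels = lev)
predicted <- factor(c("a", "b", "b", "b", "c", "c"), levels = lev)

cm <- table(actual, predicted)     # confusion matrix, rows = actual class

# Assumed gain matrix: gain 1 for a correct prediction, 0 otherwise
gainmat <- diag(length(lev))

cerr <- 1 - sum(diag(cm)) / sum(cm)   # misclassification error
gain <- sum(gainmat * cm)             # total gain

# Maximum achievable gain: every record gets the best gain of its actual class
max_gain <- sum(rowSums(cm) * apply(gainmat, 1, max))
rgain    <- gain / max_gain           # relative gain

print(c(cerr = cerr, gain = gain, rgain = rgain))
```

With an identity gain matrix the relative gain is simply one minus the misclassification error; the two measures differ once the gain matrix weights classes unequally.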
The five items lastCmTrain, lastCmVali, lastModel, lastProbs, lastPred are specific to the *last* model (the one built for the last response variable in the last run and last fold).
Wolfgang Konen, THK, 2013
print.tdmClass
tdmClassifyLoop
tdmRegressLoop
#*# This demo shows a simple data mining process (phase 1 of TDMR) for classification on
#*# dataset iris.
#*# The data mining process in tdmClassify calls randomForest as the prediction model.
#*# It is called opts$NRUN=1 time with one random train-validation split.
#*# Therefore data frame res$allEVAL has one row.
#*#
library(TDMR)                             # needed when run outside the package help
opts=tdmOptsDefaultsSet()                 # set all defaults for data mining process
gdObj <- tdmGraAndLogInitialize(opts);    # init graphics and log file
data(iris)
response.variables="Species"              # names, not data (!)
input.variables=setdiff(names(iris),"Species")
opts$NRUN=1
idx_train = sample(nrow(iris))[1:110]     # 110 random records for training
d_train=iris[idx_train,]
d_vali=iris[-idx_train,]                  # remaining 40 records for validation
d_dis=iris[numeric(0),]                   # empty disregard set (0 rows)
res <- tdmClassify(d_train,d_vali,d_dis,NULL,response.variables,input.variables,opts)
cat("\n")
print(res$allEVAL)