View source: R/tdmClassifyLoop.r
tdmClassifyLoop contains a double loop (opts$NRUN and CV-folds)
and calls tdmClassify. It is called by all classification R-functions main_*.
It splits - if tset is NULL - the data in dset into training and validation
data according to opts$TST.kind.
It returns an object of class TDMclassifier.
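The double loop described above can be pictured with a small base-R sketch. This is not the TDMR implementation: `nrun`, `nfold`, and the helper `majority_error` are hypothetical names chosen for illustration, and a trivial majority-class predictor stands in for tdmClassify.

```r
# Hypothetical sketch of a run/fold double loop; not TDMR code.
data(iris)
set.seed(5)

nrun  <- 2   # plays the role of opts$NRUN
nfold <- 3   # number of CV folds (assumed name)

# Stand-in for tdmClassify: predict the majority class of the training set
# and return the classification error on the validation set.
majority_error <- function(train, vali) {
  guess <- names(which.max(table(train$Species)))
  mean(vali$Species != guess)
}

cv_error <- numeric(nrun)
for (i in seq_len(nrun)) {
  # assign every record to one fold, freshly shuffled in each run
  fold <- sample(rep(seq_len(nfold), length.out = nrow(iris)))
  fold_error <- numeric(nfold)
  for (k in seq_len(nfold)) {
    train <- iris[fold != k, ]
    vali  <- iris[fold == k, ]
    fold_error[k] <- majority_error(train, vali)
  }
  cv_error[i] <- mean(fold_error)   # average over folds within run i
}
print(cv_error)
```

Each element of `cv_error` corresponds to one run, which mirrors how the performance measures of the returned object hold one value per run.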
tdmClassifyLoop(dset, response.variables, input.variables, opts, tset = NULL)
dset: the data frame containing training and validation data.
response.variables: name of the column which carries the target variable, or a vector of names specifying multiple target columns (these columns are not used during prediction, only for evaluation)
input.variables: vector with names of input columns
opts: a list from which we need here the following entries
tset: [NULL] If not NULL, this is the test data set. If NULL, we are in tuning and the validation data set is built from
result: an object of class TDMclassifier. This is a list with results, containing
lastRes: last run, last fold: result from
C_train: classification error on training set
G_train: gain on training set
R_train: relative gain on training set (percentage of max. gain on this set)
*_vali: the same three measures, with the vali set instead of the training set
*_vali2: the same three measures, with the vali2 set instead of the training set
Err: a data frame with as many rows as opts$NRUN and 9 columns corresponding to the nine variables described above
predictions: last run: data frame with dimensions [nrow(dset),length(response.variables)]. In case of CV, all CV predictions (for each record in dset), in other cases mixed validation / train set predictions.
predictTest: predictions on the test set
predProbList: a list,
Each performance measure C_*, G_*, R_* is a vector of length opts$NRUN. To be specific, C_train[i] is the
classification error on the training set from the i-th run. This error is mean(res$allEVAL$cerr.trn), i.e. the
mean of the classification errors from all response variables when res is the return value of tdmClassify.
In the case of cross validation, for each performance measure an additional averaging over all folds is done.
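The averaging described above can be traced with made-up numbers. In this sketch the error values are invented for illustration, `cerr_trn` is a hypothetical stand-in for res$allEVAL$cerr.trn, and the resulting data frame shows only two of the nine columns of Err.

```r
# Hypothetical numbers, only to illustrate the averaging; not TDMR output.
nrun <- 2

# cerr_trn[[i]]: classification error per response variable in run i,
# standing in for res$allEVAL$cerr.trn returned by tdmClassify
cerr_trn <- list(c(0.10, 0.20), c(0.05, 0.15))

# C_train[i] is the mean over all response variables in run i
C_train <- vapply(cerr_trn, mean, numeric(1))

# With cross validation there is one such mean per fold, and the
# per-run value is additionally averaged over the folds.
fold_means <- rbind(c(0.12, 0.18, 0.15),   # run 1, folds 1..3
                    c(0.10, 0.14, 0.12))   # run 2, folds 1..3
C_vali <- rowMeans(fold_means)

# one row per run, as in result$Err
Err <- data.frame(C_train = C_train, C_vali = C_vali)
print(Err)
```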
Wolfgang Konen (wolfgang.konen@th-koeln.de), THK
print.TDMclassifier, tdmClassify, tdmRegress, tdmRegressLoop
#*# --------- demo/demo00-0classif.r ---------
#*# This demo shows a simple data mining process (phase 1 of TDMR) for classification on
#*# dataset iris.
#*# The data mining process in tdmClassifyLoop calls randomForest as the prediction model.
#*# It is called opts$NRUN=2 times with different random train-validation set splits.
#*# Therefore data frame result$Err has two rows
#*#
opts=tdmOptsDefaultsSet() # set all defaults for data mining process
opts$TST.SEED <- opts$MOD.SEED <- 5 # reproducible results
#opts$VERBOSE <- opts$SRF.verbose <- 0 # no printed output
gdObj <- tdmGraAndLogInitialize(opts); # init graphics and log file
data(iris)
response.variables="Species" # names, not data (!)
input.variables=setdiff(names(iris),"Species")
result = tdmClassifyLoop(iris,response.variables,input.variables,opts)
print(result$Err)