tdmRegressLoop: Core regression double loop returning a 'TDMregressor'...

View source: R/tdmRegressLoop.r

Description

tdmRegressLoop contains a double loop (opts$NRUN and CV-folds) and calls tdmRegress. It is called by all R-functions main_*.
It returns an object of class TDMregressor.
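
The "runs x folds" structure described above can be illustrated with a minimal base-R sketch. This is not the TDMR implementation: lm() stands in for the model that tdmRegress would train, and nrun, nfold, cvi and err are hypothetical names chosen for this illustration only.

```r
# Sketch of a double loop (outer: runs, inner: CV folds); not TDMR code.
data(iris)
nrun  <- 2   # plays the role of opts$NRUN
nfold <- 3   # hypothetical number of CV folds
set.seed(42)

rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))

err <- data.frame(run = seq_len(nrun), rmse.tst = NA_real_)
for (i in seq_len(nrun)) {                      # outer loop: runs
  # draw a new random fold assignment for every run
  cvi  <- sample(rep(seq_len(nfold), length.out = nrow(iris)))
  pred <- rep(NA_real_, nrow(iris))
  for (k in seq_len(nfold)) {                   # inner loop: CV folds
    # train on all folds except k, predict the held-out fold k
    fit <- lm(Petal.Length ~ ., data = iris[cvi != k, ])
    pred[cvi == k] <- predict(fit, newdata = iris[cvi == k, ])
  }
  err$rmse.tst[i] <- rmse(pred, iris$Petal.Length)
}
print(err)   # one row per run
```

As in the sketch, each run yields one row of error measures, which mirrors the statement below that result$Err has as many rows as opts$NRUN.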

Usage

tdmRegressLoop(dset, response.variables, input.variables, opts, tset = NULL)

Arguments

dset

the data frame for which the cross-validation index (cvi) is needed

response.variables

name of column which carries the target variable - or - vector of names specifying multiple target columns (these columns are not used during prediction, only for training and for evaluating the predicted result)

input.variables

vector with names of input columns

opts

a list from which we need here the following entries

NRUN

number of runs (outer loop)

TST.SEED

=NULL: leave the random number seed as it is. =any value: set the random number seed to this value to get reproducible random numbers and thus reproducible training-test-set-selection. (only relevant in case TST.kind=="cv" or "rand") (see also MOD.SEED in tdmClassify)

TST.kind

how to create cvi, handed over to tdmModCreateCVindex. If TST.kind="col", then cvi is taken from dset[,opts$TST.col].

GD.RESTART

[TRUE] =TRUE/FALSE: do/don't restart graphic devices

GRAPHDEV

["non"| other ]

tset

[NULL] If not NULL, this is the test data set. If NULL, we are in tuning and the validation data set is built from dset according to the procedure prescribed in opts$TST.*.
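
How a seed such as opts$TST.SEED leads to a reproducible training-validation split can be sketched in base R. The helper make_cvi below is hypothetical and only illustrates the idea; it is not tdmModCreateCVindex and makes no claim about how that function works internally.

```r
# Hypothetical helper, not part of TDMR: assign each of n rows to one of nfold folds.
make_cvi <- function(n, nfold, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)   # seed = NULL leaves the RNG state untouched
  sample(rep(seq_len(nfold), length.out = n))
}

cvi_a <- make_cvi(150, nfold = 5, seed = 123)
cvi_b <- make_cvi(150, nfold = 5, seed = 123)

identical(cvi_a, cvi_b)   # TRUE: same seed, same fold assignment
table(cvi_a)              # 30 rows in each of the 5 folds
```

With seed = NULL, repeated calls would generally produce different fold assignments, which corresponds to the "leave the random number seed as it is" setting.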

Value

result, an object of class TDMregressor. This is a list of results containing:

opts

the res$opts from tdmRegress

lastRes

last run, last fold: result from tdmRegress

R_train

RMAE / RMSE on training set (vector of length NRUN), depending on opts$rgain.type=="rmae" or "rmse"

S_train

RMSE on training set (vector of length NRUN)

T_train

Theil's U for RMAE on training set (vector of length NRUN)

*_test

the same quantities as above, computed on the test set instead of the training set

Err

a data frame with as many rows as opts$NRUN and columns = (rmae.trn, rmse.trn, made.trn, rmae.theil.trn, ntrn, rmae.tst, rmse.tst, made.tst, rmae.theil.tst, ntst)

predictions

last run: data frame with dimensions [nrow(dset),length(response.variables)]. In case of CV, all validation set predictions (for each record in dset), in other cases mixed validation / train set predictions.

predictTest

predictions on the test set tset (NULL if tset==NULL)
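
For orientation, the kind of error measures listed above can be computed in base R as follows. RMSE is the standard root mean squared error. The RMAE line assumes that "relative mean absolute error" means the mean absolute error divided by the mean absolute target value; this definition is an assumption of this sketch and should be checked against the tdmRegress source. The obs and pred vectors are made-up illustration values.

```r
# Made-up observed and predicted values, for illustration only.
obs  <- c(1.0, 2.0, 3.0, 4.0)
pred <- c(1.5, 2.0, 2.5, 4.0)

rmse <- sqrt(mean((pred - obs)^2))   # root mean squared error
mae  <- mean(abs(pred - obs))        # mean absolute error
rmae <- mae / mean(abs(obs))         # ASSUMED definition of RMAE (relative MAE)

print(c(rmse = rmse, mae = mae, rmae = rmae))
```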

Author(s)

Wolfgang Konen (wolfgang.konen@th-koeln.de), THK

See Also

tdmRegress, tdmClassifyLoop, tdmClassify

Examples

#*# --------- demo/demo00-1regress.r ---------
#*# This demo shows a simple data mining process (phase 1 of TDMR) for regression on
#*# dataset iris.
#*# The data mining process in tdmRegressLoop calls randomForest as the prediction model.
#*# It is called opts$NRUN=2 times with different random train-validation set splits.
#*# Therefore data frame result$Err has 2 rows.
#*#
opts=tdmOptsDefaultsSet()                       # set all defaults for data mining process
gdObj <- tdmGraAndLogInitialize(opts);          # init graphics and log file

data(iris)
response.variables="Petal.Length"                # names, not data (!)
input.variables=setdiff(names(iris),"Petal.Length")
opts$rgain.type="rmae"

result = tdmRegressLoop(iris,response.variables,input.variables,opts)

print(result$Err)

TDMR documentation built on March 3, 2020, 1:06 a.m.