getLSTMmodel: getLSTMmodel


Description

Constructs a custom mxLSTM model for use in the caret train logic. It behaves slightly differently from the usual caret models retrieved via getModelInfo. See Details.

Usage

getLSTMmodel()

Details

Model setup
The model is an LSTM recurrent neural network with the rmsprop optimizer.

Purpose
The purpose of the custom model is the following:

Allow multiple y

Allow a regression within caret that predicts multiple y in one model.

Scaling of y

Allow for scaling of y. Possible options are c('scale', 'center', 'minMax')

Scale x variables again

If, for example, a PCA is conducted during pre-processing, the resulting inputs can be scaled again via the pre-processing options c('scaleAgain', 'centerAgain').

Usage
The model differs from 'usual' caret models in its usage. Differences when using it in train:

Different formula for model specification

Usually, the formula would be, for example, y1+y2+y3 ~ x1+x2+x3. caret does not allow this specification, so the following workaround is used (see the sketch after this list):

  • construct a column dummy = y1

  • Specify the formula as dummy~x1+x2+x3+y1+y2+y3.

  • Determine x and y variables with the arguments xVariables = c('x1', 'x2', 'x3') and yVariables = c('y1','y2','y3')
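
As a minimal sketch in plain R (the data and the column names x1..x3, y1..y3 are purely illustrative):

  dummyData <- data.frame(x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10),
                          y1 = rnorm(10), y2 = rnorm(10), y3 = rnorm(10))
  ## 1. construct a dummy column from the first outcome
  dummyData$dummy <- dummyData$y1
  ## 2. put all predictors AND all outcomes on the right-hand side
  form <- dummy ~ x1 + x2 + x3 + y1 + y2 + y3
  ## 3. tell the model which columns are predictors and which are outcomes
  ##    (passed to train() as xVariables / yVariables)
  xVars <- c("x1", "x2", "x3")
  yVars <- c("y1", "y2", "y3")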

Different pre-processing arguments
  • Don't use the caret preProcess argument. Use preProcessX and preProcessY instead.

  • Don't specify preProcOptions in the trainControl call; specify them in the call to train. They will apply to preProcessX only, since y pre-processing requires no further arguments. preProcessX can be anything that caret allows, plus c('scaleAgain', 'centerAgain') for scaling as a last pre-processing step. preProcessY can include c('scale', 'center', 'minMax'). See the sketch after this list.
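
A hedged sketch of a train() call with these arguments (the preProcOptions value is illustrative; the object names form, datTrain, and lstmGrid follow the full example below):

  fit <- train(form = form,
               data = datTrain,
               method = getLSTMmodel(),
               xVariables = c("x", "y"),
               yVariables = c("target", "target2"),
               preProcessX = c("pca", "scaleAgain", "centerAgain"),  ## caret options plus scaleAgain/centerAgain
               preProcessY = c("scale", "center"),                   ## y scaling needs no further options
               preProcOptions = list(thresh = 0.95),                 ## illustrative; passed to train(), applies to preProcessX only
               trControl = trainControl(method = "cv", number = 2),
               tuneGrid = lstmGrid)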

Additional mandatory arguments to fit function

For transforming the input to the LSTM, the following additional arguments must be specified to the train function:

Additional arguments to fit function:

Additional arguments to predict function:


Different prediction function

For predicting from the model as returned by caret's train, you have to use the predictAll function. This calls the internal predict function of getLSTMmodel and returns predictions for all y variables.
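
For example (a sketch; object names follow the full example below):

  preds <- predictAll(caret_lstm, newdata = datTest, fullSequence = FALSE)
  str(preds)   ## one element per y variable, e.g. preds$target, preds$target2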

Tuning parameters

Other specific features

Value

A list of functions similar to the output of caret's getModelInfo.

See Also

saveCaretLstmModel, loadCaretLstmModel, plot_trainHistory, fitLSTMmodel, predictLSTMmodel, getPreProcessor, predictPreProcessor, invertPreProcessor

Examples

## Not run: 
library(mxLSTM)
library(data.table)
library(caret)
###########################################################
## perform a regression with mxLSTM
## on dummy data

## simple data: one numeric output as a function of two numeric inputs.
## including lag values
## with some noise.
set.seed(200)
mx.set.seed(200)
nObs <- 20000
dat <- data.table(x = runif(n = nObs, min = 1000, max = 2000),
                  y = runif(n = nObs, min = -10, max = 10))
## create target
dat[, target := 0.5 * x + 0.7 * shift(y, 3) - 0.2 * shift(x, 5)]   ## shift() = data.table's lag
dat[, target2 := 0.1 * x + 0.3 * shift(y, 1) - 0.4 * shift(x, 2)]
dat[, target := target + rnorm(nObs, 0, 10)]
dat[, target2 := target2 + rnorm(nObs, 0, 10)]

## convert to mxLSTM input
dat <- transformLSTMinput(dat = dat, targetColumn = c("target", "target2"), seq.length = 5)

## convert to caret input
dat <- lstmInput2caret(dat)

## split into train and test set
trainIdx <- sample(seq_len(nrow(dat)), as.integer(nrow(dat) / 3))
evalIdx  <- sample(seq_len(nrow(dat))[-trainIdx], as.integer(nrow(dat) / 3))
testIdx  <- seq_len(nrow(dat))[-c(trainIdx, evalIdx)]
datTrain <- dat[trainIdx,]
datEval  <- dat[evalIdx,]
datTest  <- dat[testIdx,]

## define caret trainControl
thisTrainControl  <- trainControl(method = "cv",
                                  number = 2,
                                  verboseIter = TRUE)


## do the training

## grid for defining the parameters of the mxNet model
lstmGrid <- expand.grid(layer1 = 64, layer2 = 0, layer3 = 0,
                        weight.decay = 0, dropout1 = 0, dropout2 = 0, dropout3 = 0,
                        learningrate.momentum = 0.95,
                        momentum = 0.1, num.epoch = 50,
                        batch.size = 128, activation = "relu", shuffle = TRUE, stringsAsFactors = FALSE)

## construct formula with all variables on the right-hand side
form <- formula(paste0("dummy~", paste0(setdiff(names(datTrain), "dummy"), collapse = "+")))

caret_lstm <- train(form = form,
                    data = datTrain,
                    testData = datEval,
                    method = getLSTMmodel(), ## get our custom model
                    xVariables = c("x", "y"), ## define predictors
                    yVariables = c("target", "target2"), ## define outcomes
                    preProcessX = c("pca", "scaleAgain", "centerAgain"),
                    preProcessY = c("scale", "center"), ## in case of multiple y, this makes sense imho
                    debugModel = FALSE,
                    trControl = thisTrainControl,
                    tuneGrid = lstmGrid,
                    learning.rate = c("1" = 0.02, "40" = 0.0002), ## adaptive learning rate that changes at epoch 40
                    optimizeFullSequence = FALSE
)

## get nice output of training history
plot_trainHistory(caret_lstm$finalModel)

## get predictions for the datasets
predTrain <- predictAll(caret_lstm, newdata = datTrain, fullSequence = FALSE)
predEval  <- predictAll(caret_lstm, newdata = datEval, fullSequence = FALSE)
predTest  <- predictAll(caret_lstm, newdata = datTest, fullSequence = FALSE)

## get nice goodness of fit plots.
plot_goodnessOfFit(predicted = predTrain$target, observed = datTrain$target_seq5Y)
plot_goodnessOfFit(predicted = predTrain$target2, observed = datTrain$target2_seq5Y)
plot_goodnessOfFit(predicted = predTest$target, observed = datTest$target_seq5Y)
plot_goodnessOfFit(predicted = predTest$target2, observed = datTest$target2_seq5Y)

## save the model
saveCaretLstmModel(caret_lstm, "testModel")

## End(Not run)
