FcLSTM: Forecasting with Long Short Term Memory Based on Recurrent Neural Network

View source: R/FcLSTM.R


Forecasting with Long Short Term Memory Based on Recurrent Neural Network [Hochreiter/Schmidhuber, 1997].

Description

In simple words, a recurrent network feeds its input not only forward from layer to layer but also in a loop back to specific layers, which is why it is called recurrent. LSTM is an improvement with respect to the problem of 'vanishing gradients'.

The procedure works as follows: we start with centered and scaled time series data (centering and scaling are not strictly necessary; the series only needs to vary within the interval [-1,1]), provided as a numerical vector DataVec of equidistant observations. Furthermore, a forecast length ForecastHorizon has to be set.

Usage

FcLSTM(DataVec, SplitAt, ForecastHorizon,
  Seasonality = 28, Scaled = TRUE, ErrorLoss = "MSE", Epochs = 100,
  Neurons = 28, ActivationFunction = "relu", RecurrentActivation = "sigmoid",
  Batch_size = 1, Time, PlotIt = FALSE, Silent = TRUE, ...)

Arguments

DataVec

[1:n] numerical vector of regular (equidistant) time series data.

SplitAt

Index at which DataVec is divided into training and test data. If not given, n is used.

ForecastHorizon

Scalar defining the number of time steps to forecast ahead.

Seasonality

Main seasonality of the data; used for generating batches of data. Default is 28.

Scaled

TRUE (default): automatic scaling of the data.

ErrorLoss

Error measure for the loss function, either "MRD", "SRD", "MSE", or "MAE". Default is "MSE".

Epochs

Number of epochs to train the model, see epochs in fit in [keras]. Default is 100.

Neurons

Number of units per layer, see units in layer_lstm.

ActivationFunction

Defines the activation function to use, please see [Goodfellow, 2016] for details.

RecurrentActivation

Defines the recurrent activation function to use, please see [Goodfellow, 2016] for details.

Batch_size

Number of samples per gradient update, see batch_size in fit in [keras]. The batch size is the number of data samples in one forward/backward pass of the RNN before a weight update.

The batch size should not be chosen too high in relation to ForecastHorizon.

Time

Optional, [1:n] character vector of time stamps with the same length as DataVec.

PlotIt

Optional, FALSE (default): do nothing. TRUE: plots the forecast versus the validation set.

Silent

Optional, if FALSE, prints diverse outputs of keras. Default is TRUE.

...

Further arguments for layer_lstm

Details

In this approach the recurrent ANN has several internal parameters set as defined in deep learning, see [Goodfellow, 2016] for details. The last layer is a densely-connected NN layer wrapped in a time_distributed layer. Currently only one hidden layer is used.
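
A rough sketch of such an architecture in keras for R is given below. It is an illustration only, not the exact internal model of FcLSTM; units, activations, and input shape are placeholders based on the argument defaults.

library(keras)

# One LSTM hidden layer, followed by a densely-connected layer wrapped
# in time_distributed, roughly as described above.
# input_shape = c(Seasonality, 1) is an assumption for illustration.
model = keras_model_sequential() %>%
  layer_lstm(units = 28, activation = "relu",
             recurrent_activation = "sigmoid",
             input_shape = c(28, 1),
             return_sequences = TRUE) %>%
  time_distributed(layer_dense(units = 1))

model %>% compile(loss = "mse", optimizer = "adam")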

The number of epochs is the total number of forward/backward passes over the training data. Typically, more epochs improve model performance, unless overfitting occurs, at which point the validation accuracy/loss no longer improves.
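
FcLSTM itself only exposes Epochs, but for illustration, a common keras-level guard against such overfitting is early stopping; the following sketch assumes a plain keras model trained with fit and is not part of FcLSTM.

library(keras)

# Stop training once the validation loss has not improved for 10 epochs.
stop_early = callback_early_stopping(monitor = "val_loss", patience = 10)
# model %>% fit(x, y, epochs = 300, validation_split = 0.2,
#               callbacks = list(stop_early))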

The data should be scaled to the interval [-1,1] with a "sound" distribution, see [Goodfellow, 2016; Mörchen, 2006].
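
One simple possibility (a sketch, not necessarily what Scaled = TRUE does internally) is min-max scaling to [-1,1]:

# Min-max scaling of a numerical vector to the interval [-1,1]
x = as.numeric(datasets::sunspot.month)
x_scaled = 2 * (x - min(x)) / (max(x) - min(x)) - 1
range(x_scaled)   # -1 1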

Gradients vanish if values between zero and one are multiplied many times, because the gradient can then shrink towards zero. As a result, the weights would not change significantly in a recurrent ANN with many layers ('deep learning').
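
A minimal numerical illustration of this effect:

# A gradient factor of magnitude 0.5 multiplied over 50 time steps/layers
# shrinks to practically zero (about 8.9e-16).
prod(rep(0.5, 50))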

ErrorLoss defines the objective function to be minimized, see loss in compile in [keras], if you want to use a pre-coded function. You can also provide a custom loss function if you write it in keras backend syntax (e.g., tensor_srd). The 'Adam' optimizer is used here [Kingma/Ba, 2014].
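
As an illustration of the keras backend syntax (this is not the package's tensor_srd; the relative-deviation formula is an assumption), a custom loss could look like the following sketch:

library(keras)

# Hypothetical mean relative deviation written with keras backend functions;
# k_epsilon() guards against division by zero.
custom_mrd = function(y_true, y_pred) {
  k_mean(k_abs(y_true - y_pred) / (k_abs(y_true) + k_epsilon()))
}
# It could then be passed to compile(), e.g.
# model %>% compile(optimizer = "adam", loss = custom_mrd)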

Value

List of

Model

Pointer to an ANN model generated by keras; the model is not directly available in R.

FitStats

Output of fit in [keras]

Forecast

Forecast generated by the ANN model, for which the last portion of the training set of length ForecastHorizon is used as the data to predict from. The test data stays untouched.

TestData

[(k+1):n] vector, the part of DataVec not used in the model

TestTime

[(k+1):n] vector, time of the part of DataVec not used in the model

TrainData

[1:k] vector, the part of DataVec used in the model

TrainTime

[1:k] vector, time of the training data, if given

TrainingForecast

[1:k] vector, forecasted values using TrainData

Note

# keras and tensorflow have to be installed in Python; Python has to be callable from the console

# Steps are:

devtools::install_github("rstudio/tensorflow")

devtools::install_github("rstudio/keras")

# Then execute the following

tensorflow::install_tensorflow()

tensorflow::tf_config()

#Todo: Integrate dropout (removing units from NNs during training) to improve generalisation (Hinton et al., 2012).
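
A possible way to integrate this (a sketch, not yet part of FcLSTM) would be the dropout and recurrent_dropout arguments of layer_lstm:

library(keras)

# Dropout on the inputs and on the recurrent state of the LSTM layer;
# the rate of 0.2 is an arbitrary placeholder.
lstm_with_dropout = layer_lstm(units = 28, activation = "relu",
                               dropout = 0.2, recurrent_dropout = 0.2)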

Author(s)

Michael Thrun

References

Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning, Vol. 1, MIT Press, Cambridge, 2016.

Kingma, D. P., Ba, J.: Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980, 2014.

Hochreiter, S., Schmidhuber, J.: Long short-term memory, Neural Computation, Vol. 9(8), pp. 1735-1780, 1997.

Mörchen, F.: Time series knowledge mining, Görich & Weiershäuser, 2006.

Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R.: Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580, 2012.

See Also

keras and tensorflow.

Examples

# Sunspots with autocorrelation above 0.5 for a lag of 10 years
# (maximum at 125 months)
data = datasets::sunspot.month

# Reduce the extent of outliers by sqrt, then scale with robust quantiles
sub = sqrt(data)
quants = quantile(sub, c(0.01, 0.5, 0.99), na.rm = FALSE)
min = quants[1]
max = quants[3]
denom = max - min
data = (sub - min) / denom

data = as.numeric(data)
# We are ready to apply the LSTM procedure with a chosen batch size
## Not run: 
results = FcLSTM(data, ForecastHorizon = 1, Batch_size = 40, Seasonality = 48, Epochs = 300, ErrorLoss = "MRD")

# Get the forecast data from the returned list
fc = results$Forecast

# Rescale the forecast data to be comparable to the original dataset
fc_rescaled = (denom * fc + min)^2

# Plot the tail of the original series (here the last 120 months)
# and overlay the rescaled forecast
plot(tail(as.numeric(datasets::sunspot.month), 120), type = "l")
points(fc_rescaled, col = "red")

## End(Not run)
