layer    R Documentation
Currently supported recurrent layer types and available loss functions in the package.
"RNN" (Simple Recurrent Neural Network) and "BiRNN":
A fully-connected RNN in which the output from the previous time step
is fed back into the next time step. It is the most basic type of
recurrent layer but can struggle with long-term dependencies due to
the vanishing gradient problem. For this and the layer types below,
the "Bi" prefix denotes a bidirectional variant that processes the
sequence in both the forward and backward directions.
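The recurrence that defines a simple RNN can be sketched as follows (shown in Python with NumPy for concreteness, since the math is language-agnostic; the weight names are illustrative, not the package's internals):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One simple-RNN step: h_t = tanh(x_t @ W_xh + h_prev @ W_hh + b_h).

    Feeding the previous hidden state h_prev back into the current
    update is what makes the layer recurrent."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

# Toy dimensions: 3 input features, 4 hidden units.
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(3, 4))
W_hh = rng.normal(size=(4, 4))
b_h = np.zeros(4)

h = np.zeros(4)                       # initial hidden state
for x_t in rng.normal(size=(5, 3)):   # a sequence of 5 time steps
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
```

Repeated multiplication by `W_hh` across many steps is also where the vanishing gradient problem arises.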
"GRU" (Gated Recurrent Unit) and "BiGRU":
A modern recurrent unit that uses gating mechanisms to control
information flow, enabling it to capture long-range dependencies.
GRUs are generally simpler and computationally faster than LSTMs
while often achieving comparable performance.
"LSTM" (Long Short-Term Memory) and "BiLSTM":
A powerful recurrent unit with dedicated memory cells and gating
mechanisms (input, forget, output). LSTMs excel at learning
long-term dependencies and are robust against the vanishing
gradient problem, making them ideal for very long sequences.
The loss function defines the objective that the model minimizes during training. The choice of loss function is critical, as it determines which aspect of the prediction the model prioritizes.
"MSE" (Mean Squared Error):
Calculates the average of the squared differences between predicted
and true parameter values. By squaring the error, it heavily
penalizes large mistakes. It is the standard choice for regression
and implicitly assumes that the errors are normally distributed.
However, its sensitivity to outliers can sometimes be a drawback.
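A minimal sketch of the formula (in Python for illustration; not the package's internal implementation):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean((y_true - y_pred) ** 2)

# A single residual of 2 contributes 2**2 = 4 to the sum,
# so it dominates the two exact predictions:
mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # (0 + 0 + 4) / 3
```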
"MAE" (Mean Absolute Error):
Calculates the average of the absolute differences between predicted
and true values. It treats all errors equally on a linear scale,
making it more robust to outliers than MSE. It is a good choice
when the dataset contains anomalies that should not dominate the
training process.
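A minimal sketch of the formula (Python, illustrative):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: average of absolute residuals."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs(y_true - y_pred))

# A residual of 2 contributes 2 to the sum, not 2**2 = 4 as under
# MSE, so the outlier influences the average only linearly:
mae([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # (0 + 0 + 2) / 3
```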
"HBR" (Huber Loss):
A hybrid loss function that combines the best properties of MSE and
MAE. It behaves like MSE for small errors, providing a smooth and
stable gradient, but switches to behaving like MAE for large
errors. This makes it less sensitive to outliers than MSE while,
unlike MAE, remaining differentiable at zero.
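The piecewise behaviour can be sketched as follows (Python, illustrative; the threshold name `delta` is the conventional one, not necessarily the package's parameter name):

```python
import numpy as np

def huber(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic for |residual| <= delta, linear beyond it."""
    r = np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))
    quad = 0.5 * r ** 2               # MSE-like region near zero
    lin = delta * (r - 0.5 * delta)   # MAE-like region for large errors
    return np.mean(np.where(r <= delta, quad, lin))

huber([0.0], [0.5])   # small residual, quadratic: 0.5 * 0.5**2 = 0.125
huber([0.0], [4.0])   # large residual, linear:    1 * (4 - 0.5) = 3.5
```

The `delta * (r - 0.5 * delta)` form is chosen so the two pieces meet with matching value and slope at `r = delta`.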
"NLL" (Negative Log-Likelihood):
This loss is used for probabilistic regression. Instead of
predicting a single value for each parameter, the network
predicts the parameters of a probability distribution (here, a
Gaussian: its mean mu and variance sigma^2). The
loss is the negative log-likelihood of the true parameters under
the predicted distribution. This allows the model to learn and
express its own uncertainty about its predictions.
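For a Gaussian output, the per-observation loss is -log N(y | mu, sigma^2); a sketch (Python, illustrative of the formula only):

```python
import numpy as np

def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of y under N(mu, sigma^2), averaged."""
    y, mu, sigma = (np.asarray(a, float) for a in (y, mu, sigma))
    return np.mean(0.5 * np.log(2 * np.pi * sigma ** 2)
                   + (y - mu) ** 2 / (2 * sigma ** 2))

# A confident (small sigma) but wrong prediction is punished harder
# than an uncertain (large sigma) prediction with the same error:
gaussian_nll(0.0, 2.0, 0.5)   # confident and wrong -> large loss
gaussian_nll(0.0, 2.0, 2.0)   # uncertain and wrong -> smaller loss
```

This trade-off is what lets the network learn a calibrated sigma rather than always claiming high confidence.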
"QRL" (Quantile Regression Loss):
Allows the model to estimate specific quantiles of the parameter
distribution, rather than just its mean. This package's
implementation predicts the 5th, 50th (median), and 95th
percentiles. It uses a "pinball loss" function that is asymmetric,
guiding the model to the desired quantile. It is useful for
understanding the full range of parameter uncertainty and is
naturally robust to outliers.
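The pinball loss for quantile q weights under- and over-prediction asymmetrically; a sketch in Python (the function name is illustrative):

```python
import numpy as np

def pinball(y_true, y_pred, q):
    """Pinball (quantile) loss: q * e if e >= 0, else (q - 1) * e,
    where e = y_true - y_pred."""
    e = np.asarray(y_true, float) - np.asarray(y_pred, float)
    return np.mean(np.maximum(q * e, (q - 1) * e))

# For the 95th percentile, under-predicting costs 19x more than
# over-predicting by the same amount, pushing the estimate upward:
pinball(1.0, 0.0, 0.95)   # under-prediction: 0.95
pinball(0.0, 1.0, 0.95)   # over-prediction:  0.05
```

At q = 0.5 the two branches are symmetric and the loss reduces to half the absolute error, which is why the 50th percentile is the median.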
"MDN" (Mixture Density Network):
The most flexible but complex option. An MDN learns to predict the
parameters of a mixture of distributions (e.g., a mix of multiple
Gaussians). This allows it to model highly complex, multi-modal
(multiple peaks), or skewed posterior distributions. The network
outputs the means, variances, and mixing weights for each
component in the mixture.
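For a Gaussian mixture, the loss is the negative log of the weighted sum of component densities; a single-observation sketch (Python, illustrative):

```python
import numpy as np

def gaussian_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y."""
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def mdn_nll(y, weights, mus, sigmas):
    """Negative log-likelihood of y under a Gaussian mixture.
    weights must be non-negative and sum to 1 (a softmax output in practice)."""
    weights, mus, sigmas = (np.asarray(a, float) for a in (weights, mus, sigmas))
    density = np.sum(weights * gaussian_pdf(y, mus, sigmas))
    return -np.log(density)

# A two-component mixture can represent a bimodal posterior with
# peaks near -1 and +2; values near a mode score a lower loss than
# values in the valley between the modes:
mdn_nll(2.0, [0.4, 0.6], [-1.0, 2.0], [0.5, 0.5])
mdn_nll(0.5, [0.4, 0.6], [-1.0, 2.0], [0.5, 0.5])
```

With a single component (one weight of 1), this reduces to the plain Gaussian NLL described above.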
# supported recurrent layers and loss functions
control = list(
  layer = c("RNN", "GRU", "LSTM", "BiRNN", "BiGRU", "BiLSTM"),
  loss  = c("MSE", "MAE", "HBR", "NLL", "QRL", "MDN")
)