tulip | R Documentation |
Given a univariate time series y, tulip fits an exponential smoothing
model using maximum-a-posteriori (MAP) estimation. Prior distributions for
the smoothing parameters, the error component, and the probability of
anomalies can be provided via priors. The set of available parameter
combinations is defined by param_grid, over which the optimization of the
MAP procedure takes place. The error and trend components are additive (if
used), whereas the seasonal component can be additive or multiplicative.
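To make the idea of MAP estimation over a parameter grid concrete, here is a minimal base R sketch. It is not tulip's implementation: the level-only model, the normal error, the Beta(2, 2) prior on the smoothing parameter, and the helper names ses_errors and log_posterior are all assumptions chosen for illustration.

```r
# Sketch: MAP estimation of a single smoothing parameter by grid search.
# Assumptions (not taken from tulip): level-only model, normal errors,
# Beta(2, 2) prior on alpha, error scale estimated from the residuals.
set.seed(1)
y <- 10 + cumsum(rnorm(60, sd = 0.3)) + rnorm(60, sd = 1)

# One-step-ahead errors of simple exponential smoothing for a given alpha
ses_errors <- function(y, alpha) {
  level <- y[1]
  errors <- numeric(length(y) - 1)
  for (t in 2:length(y)) {
    errors[t - 1] <- y[t] - level
    level <- level + alpha * errors[t - 1]
  }
  errors
}

# Log joint density: log-likelihood of the errors plus log-prior of alpha
log_posterior <- function(y, alpha) {
  e <- ses_errors(y, alpha)
  sigma <- sqrt(mean(e^2))
  sum(dnorm(e, mean = 0, sd = sigma, log = TRUE)) +
    dbeta(alpha, shape1 = 2, shape2 = 2, log = TRUE)
}

# Evaluate every grid point and keep the one with the highest log joint density
alpha_grid <- seq(0.05, 0.95, by = 0.05)
log_post <- vapply(alpha_grid, function(a) log_posterior(y, a), numeric(1))
alpha_map <- alpha_grid[which.max(log_post)]
print(alpha_map)
```

The grid plays the role that param_grid plays in the description above: the optimization only ever considers the listed combinations, so the MAP estimate is the best grid point rather than a continuous optimum.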
tulip(
  y,
  m,
  method = c("additive", "multiplicative")[1],
  family = c("norm", "student", "cauchy")[1:2],
  param_grid = NULL,
  priors = NULL,
  init_states = NULL,
  seasonality_threshold = 0.5,
  remove_anomalies = TRUE,
  anomaly_budget = 5,
  anomaly_budget_most_recent_k = 1,
  min_obs_anomaly_removal = 12,
  try_fixed_initial_fit = FALSE,
  check_param_grid_unique = TRUE
)
y |
A time series as numeric vector, may include NAs for some (but not all) of the observations |
m |
The time series' period length in number of observations (for example, 12 for yearly seasonality with monthly observations, 1 for no seasonality, ...); does not handle multiple seasonality (e.g. weekly and yearly for daily observations) |
method |
One of |
family |
The distribution used to describe the error component; must be
one or multiple of |
param_grid |
Matrix defining the grid of parameters to be trialled
during grid search optimization; default parameter grid will be used if
left |
priors |
List of priors on the model's parameters; default priors will be
used if left |
init_states |
List of initial states |
seasonality_threshold |
During initialization of states, only use a
seasonal component if more than |
remove_anomalies |
Logical; during fitting, anomalies can be identified
and interpolated to not adversely affect the fitted states. The
interpolated values are only used to fit the states. When it comes to
estimating the error's standard deviation |
anomaly_budget |
Integerish (default 5); the number of anomalies that can be interpolated during fitting of state components. It can be useful to set a somewhat low hard limit on the number of possible interpolations as some parameter grid combinations may be misspecified compared to the "correct" data generating process and would therefore consider every observation an anomaly. |
anomaly_budget_most_recent_k |
Integerish (default 1); additional budget
reserved to remove anomalies from the |
min_obs_anomaly_removal |
Integerish (default 12); the anomaly detection relies on the standard deviation of the fitted values' errors. The standard deviation is updated iteratively as the state components are fitted from the first observation to the last. This parameter defines after which observation of the time series enough observations are available to reliably estimate the error's standard deviation and thus to determine whether an observation should be considered an anomaly. The default of 12 suits monthly observations and is not too low, but it also implies that any anomaly in the first 12 observations will affect the estimated state components. |
try_fixed_initial_fit |
Logical (default |
check_param_grid_unique |
Logical (default |
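The interplay of anomaly_budget and min_obs_anomaly_removal described in the arguments above can be illustrated with a small base R sketch. The helper flag_anomalies, its threshold argument, and the rule "absolute error larger than three times the running scale" are hypothetical and only serve the illustration; tulip's actual detection rule may differ.

```r
# Hypothetical helper, not part of tulip: flag large errors, but only after
# `min_obs` observations and only until `budget` flags have been spent.
flag_anomalies <- function(errors, budget = 5, min_obs = 12, threshold = 3) {
  is_anomaly <- logical(length(errors))
  used <- 0
  for (t in seq_along(errors)) {
    if (t <= min_obs || used >= budget) next
    # Scale estimated from earlier errors that were not flagged
    earlier <- seq_len(t - 1)
    past <- errors[earlier][!is_anomaly[earlier]]
    sigma <- sqrt(mean(past^2))
    if (abs(errors[t]) > threshold * sigma) {
      is_anomaly[t] <- TRUE
      used <- used + 1
    }
  }
  is_anomaly
}

set.seed(2)
errors <- rnorm(40)
errors[c(5, 20, 30)] <- c(15, 15, -15)  # three artificial spikes

flags <- flag_anomalies(errors)
print(which(flags))

# With a budget of 1, at most one observation can be flagged
print(sum(flag_anomalies(errors, budget = 1)))
```

In this toy series the spike at position 5 falls inside the first 12 observations, so it can never be flagged and instead inflates the running scale estimate. This mirrors the caveat stated for min_obs_anomaly_removal.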
An object of class tulip, a list with components:
Fitted values, numeric vector of same length as y
Input time series
Copy of the input time series that is used to update
the state components during model fitting. The copy is
used to achieve update behavior specific to the states
when dealing with missing values and anomalies. In
contrast to y, missing values and anomalies are replaced
to update the states in robust ways. Note that the update
behavior for sigma may differ, see also y_na.
Number of cleaned observations, integer
Copy of the input time series that is used to update the
sigma parameter during model fitting. The copy is used to
achieve update behavior specific to sigma when dealing
with missing values and anomalies. At the moment, anomalies
are not cleaned in y_na, such that sigma will eventually
adjust towards them. This behavior might change in the future.
Set of smoothing parameters of fitted model
Fitted scale parameter of likelihood function
Fitted level state component
Fitted trend state component
Fitted seasonality state component
Initial level state component
Initial trend state component
Initial seasonality state component
Value of the (log) joint distribution at the chosen parameter values
The distribution family of the fitted model
The suspected seasonality period length
Either additive or multiplicative seasonal component
The list of priors used during model estimation. This
can deviate from the user-provided priors argument
when the user did not provide a prior for every parameter.
The returned list contains all priors that were effectively
used.
A character string; missing value in the normal case, else describing a predefined exception, e.g. in the case of a very short input series
List containing all fitted models for the full parameter grid
predict.tulip(), initialize_states(), initialize_params_grid(),
add_prior_level(), add_prior_trend(), add_prior_seasonality(),
add_prior_error(), add_prior_anomaly()
fitted_model <- tulip(y = tulip::flowers$flowers, m = 12)
print(fitted_model$family)
print(fitted_model$param_grid)
plot(tulip::flowers$flowers, type = "l", col = "grey", xlab = NA)
points(tulip::flowers$flowers, pch = 21, bg = "black", col = "white")
# add fitted values
lines(fitted_model$y_hat, col = "blue")
# indicate observations identified as anomalies and consequently interpolated
idx_anomalies <- which(fitted_model$y_cleaned != tulip::flowers$flowers)
points(
  x = idx_anomalies,
  y = fitted_model$y_cleaned[idx_anomalies],
  col = "darkorange", pch = 19
)
points(
  x = idx_anomalies,
  y = fitted_model$y_cleaned[idx_anomalies],
  col = "white", pch = 21
)
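As a further usage sketch, the call below sets some of the documented arguments explicitly instead of relying on their defaults. It uses only arguments from the usage section and the flowers data from the example above. The requireNamespace() guard is an addition so that the snippet can be sourced even when the package is not installed.

```r
# Sketch: fit with an explicitly chosen method and family, and with
# anomaly removal switched off. Runs only if the tulip package is installed.
fit <- NULL
if (requireNamespace("tulip", quietly = TRUE)) {
  y <- tulip::flowers$flowers
  fit <- tulip::tulip(
    y = y,
    m = 12,
    method = "additive",
    family = "norm",
    remove_anomalies = FALSE
  )
  print(fit$family)
  # Fitted values are documented to have the same length as the input
  print(length(fit$y_hat) == length(y))
}
```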