zoo_resample | R Documentation |
Objective
Time series resampling involves interpolating new values for time steps not available in the original time series. This operation is useful to:
Transform irregular time series into regular ones.
Align time series with different temporal resolutions.
Increase (upsampling) or decrease (downsampling) the temporal resolution of a time series.
On the other hand, time series resampling should not be used to extrapolate new values outside of the original time range of the time series, or to increase the resolution of a time series by a factor of two or more. These operations are known to produce nonsensical results.
Methods
This function offers three methods for time series interpolation:
"linear" (default): interpolation via piecewise linear regression as implemented in zoo::na.approx().
"spline": cubic smoothing spline regression as implemented in stats::smooth.spline().
"loess": local polynomial regression fitting as implemented in stats::loess().
These methods are used to fit models y ~ x, where y represents the values of a univariate time series and x represents a numeric version of its time.
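As a rough illustration of what fitting y ~ x and predicting over a new time looks like for the three methods, the sketch below uses only the stats package on simulated data. It is not the internal code of zoo_resample(): the variable names (time_num, time_new) are hypothetical, the models use default settings, and stats::approx() stands in for zoo::na.approx(), which relies on it for linear interpolation.

```r
# Toy irregular series (simulated data, for illustration only)
set.seed(42)
time_num <- sort(runif(40, min = 0, max = 100)) # numeric version of time
y <- sin(time_num / 15) + rnorm(40, sd = 0.1)   # observed values
df <- data.frame(x = time_num, y = y)

# New regular time, kept within the original time range
time_new <- seq(min(time_num), max(time_num), length.out = 25)

# "linear": piecewise linear interpolation
y_linear <- stats::approx(x = df$x, y = df$y, xout = time_new)$y

# "spline": cubic smoothing spline with default smoothing
fit_spline <- stats::smooth.spline(x = df$x, y = df$y)
y_spline <- stats::predict(fit_spline, x = time_new)$y

# "loess": local polynomial regression with default span
fit_loess <- stats::loess(y ~ x, data = df)
y_loess <- stats::predict(fit_loess, newdata = data.frame(x = time_new))
```

Each result is a numeric vector with one interpolated value per element of time_new.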
The functions utils_optimize_spline() and utils_optimize_loess() are used under the hood to optimize the complexity of the methods "spline" and "loess" by finding the configuration that minimizes the root mean squared error (RMSE) between observed and predicted y. However, when the argument max_complexity = TRUE, the complexity optimization is ignored, and a maximum complexity model is used instead.
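The idea of selecting a complexity by RMSE can be sketched as a simple grid search. This is only an assumption about the general approach, not the implementation of utils_optimize_spline(): the rmse() helper and the grid of degrees of freedom (df_candidates) are hypothetical, and the package may search a different parameter space.

```r
# Toy irregular series (simulated data, for illustration only)
set.seed(42)
x <- sort(runif(40, min = 0, max = 100))
y <- sin(x / 15) + rnorm(40, sd = 0.1)

# Hypothetical helper: root mean squared error
rmse <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}

# Hypothetical grid of spline complexities (degrees of freedom)
df_candidates <- seq(from = 2, to = 20, by = 2)

# Fit one model per candidate and record its RMSE (NA if the fit fails)
rmse_by_df <- vapply(
  df_candidates,
  function(d) {
    fit <- tryCatch(
      suppressWarnings(stats::smooth.spline(x = x, y = y, df = d)),
      error = function(e) NULL
    )
    if (is.null(fit)) {
      return(NA_real_)
    }
    rmse(observed = y, predicted = stats::predict(fit, x = x)$y)
  },
  FUN.VALUE = numeric(1)
)

# Keep the candidate with the lowest RMSE
best_df <- df_candidates[which.min(rmse_by_df)]
best_fit <- stats::smooth.spline(x = x, y = y, df = best_df)
```

Note that an RMSE computed on the same observations used for fitting generally decreases as complexity grows, so a search like this one tends to favor the more flexible candidates in the grid.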
New time
The argument new_time offers several alternatives to help define the new time of the resulting time series:
NULL: the target time series (x) is resampled to a regular time within its original time range and number of observations.
zoo object: a zoo object to be used as template for resampling. Useful when the objective is equalizing the frequency of two separate zoo objects.
time vector: a time vector of a class compatible with the time in x.
keyword: character string defining a resampling keyword, obtained via zoo_time(x, keywords = "resample")$keywords.
numeric: a single number representing the desired interval between consecutive samples in the units of x (relevant units can be obtained via zoo_time(x)$units).
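To make two of these alternatives concrete, the base R sketch below builds the kind of regular time vector that the NULL and numeric options describe. It only illustrates the idea and is not how zoo_resample() constructs new_time internally; the names old_time, time_same_n, and time_by_interval are hypothetical.

```r
# Irregular observation dates (made-up values, for illustration only)
old_time <- as.Date(c(
  "2020-01-01", "2020-01-04", "2020-01-05", "2020-01-12", "2020-01-31"
))

# NULL-like behavior: regular time with the same range and
# the same number of observations as the original
time_same_n <- seq(
  from = min(old_time),
  to = max(old_time),
  length.out = length(old_time)
)

# numeric-like behavior: fixed interval between consecutive samples,
# here 10 in the units of the original time (days)
time_by_interval <- seq(
  from = min(old_time),
  to = max(old_time),
  by = 10
)
```

Both vectors stay within the original time range, so interpolating over them involves no extrapolation.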
Step by Step
The steps to resample a time series are:
The time interpolation range is taken from the index of the zoo object. This step ensures that no extrapolation occurs during resampling.
If new_time is provided, any values of new_time outside of the minimum and maximum interpolation times are removed to avoid extrapolation. If new_time is not provided, a regular time within the interpolation time range of the zoo object is generated.
For each univariate time series, a model y ~ x is fitted, where y is the time series and x is its own time coerced to numeric.
If max_complexity = FALSE and method is "spline" or "loess", the model with the complexity that minimizes the root mean squared error between the observed and predicted y is returned.
If max_complexity = TRUE and method is "spline" or "loess", the first valid model closest to a maximum complexity is returned.
The fitted model is predicted over new_time to generate the resampled time series.
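The steps above can be sketched for a single univariate series with base R. This is a simplified outline under stated assumptions, not the function's source code: the data are simulated, the loess span of 0.5 is an arbitrary choice rather than an optimized one, and all object names are hypothetical.

```r
# Toy univariate series with irregular numeric time (simulated data)
set.seed(1)
old_time <- sort(sample(1:100, size = 30))
y <- sin(old_time / 10) + rnorm(30, sd = 0.1)

# Requested new time, deliberately wider than the observed range
new_time <- seq(from = -10, to = 120, by = 5)

# Step 1: interpolation range taken from the original time
time_range <- range(old_time)

# Step 2: drop new times outside the range to avoid extrapolation
new_time <- new_time[new_time >= time_range[1] & new_time <= time_range[2]]

# Step 3: fit the model y ~ x, with x being the numeric time
# (span = 0.5 is an arbitrary assumption, not an optimized value)
model <- stats::loess(
  y ~ x,
  data = data.frame(x = old_time, y = y),
  span = 0.5
)

# Step 4: predict the fitted model over the new time
y_new <- stats::predict(model, newdata = data.frame(x = new_time))
```

Because new_time is clipped before prediction, every predicted value lies inside the observed time range.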
Other Details
Please use this operation with care, as there are limits to the amount of resampling that can be done without distorting the data. The safest option is to keep the distance between new time points within the same magnitude of the distance between the old time points.
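One way to check this rule of thumb before resampling is to compare the intended interval against the typical interval in the original data. The sketch below is only a suggestion: reading "same magnitude" as a ratio between 0.1 and 10 is an assumption made here, and the names old_time, new_interval, and same_magnitude are hypothetical.

```r
# Irregular observation dates (made-up values, for illustration only)
old_time <- as.Date(c(
  "2020-01-01", "2020-01-04", "2020-01-05", "2020-01-12", "2020-01-31"
))

# Typical distance between the old time points, in days
old_interval <- median(as.numeric(diff(old_time), units = "days"))

# Intended distance between the new time points, in days
new_interval <- 7

# Assumption: "same magnitude" means the ratio is between 0.1 and 10
interval_ratio <- new_interval / old_interval
same_magnitude <- abs(log10(interval_ratio)) < 1

if (!same_magnitude) {
  warning("The new interval differs from the original by an order of magnitude or more.")
}
```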
zoo_resample(
  x = NULL,
  new_time = NULL,
  method = "linear",
  max_complexity = FALSE
)
x | (required, zoo object) Time series to resample. Default: NULL |
new_time | (optional, zoo object, keyword, or time vector) New time to resample |
method | (optional, character string) Name of the method to resample the time series. One of "linear", "spline" or "loess". Default: "linear". |
max_complexity | (required, logical) Only relevant for methods "spline" and "loess". If TRUE, model optimization is ignored, and a model of maximum complexity (an overfitted model) is used for resampling. Default: FALSE |
zoo object
Other zoo_functions: zoo_aggregate(), zoo_name_clean(), zoo_name_get(), zoo_name_set(), zoo_permute(), zoo_plot(), zoo_smooth_exponential(), zoo_smooth_window(), zoo_time(), zoo_to_tsl(), zoo_vector_to_matrix()
# simulate irregular time series
x <- zoo_simulate(
  cols = 2,
  rows = 50,
  time_range = c("2010-01-01", "2020-01-01"),
  irregular = TRUE
)

# plot time series
if (interactive()) {
  zoo_plot(x)
}

# intervals between samples
x_intervals <- diff(zoo::index(x))
x_intervals

# create regular time from the minimum of the observed intervals
new_time <- seq.Date(
  from = min(zoo::index(x)),
  to = max(zoo::index(x)),
  by = floor(min(x_intervals))
)
new_time
diff(new_time)

# resample using piecewise linear regression
x_linear <- zoo_resample(
  x = x,
  new_time = new_time,
  method = "linear"
)

# resample using max complexity splines
x_spline <- zoo_resample(
  x = x,
  new_time = new_time,
  method = "spline",
  max_complexity = TRUE
)

# resample using max complexity loess
x_loess <- zoo_resample(
  x = x,
  new_time = new_time,
  method = "loess",
  max_complexity = TRUE
)

# intervals between new samples
diff(zoo::index(x_linear))
diff(zoo::index(x_spline))
diff(zoo::index(x_loess))

# plotting results
if (interactive()) {
  par(mfrow = c(4, 1), mar = c(3, 3, 2, 2))

  zoo_plot(
    x,
    guide = FALSE,
    title = "Original"
  )

  zoo_plot(
    x_linear,
    guide = FALSE,
    title = "Method: linear"
  )

  zoo_plot(
    x_spline,
    guide = FALSE,
    title = "Method: spline"
  )

  zoo_plot(
    x_loess,
    guide = FALSE,
    title = "Method: loess"
  )
}