anomalize R package is now available in timetk:
- anomalize(): One function that breaks down, identifies, and cleans anomalies.
- plot_anomalies(): Visualize the anomalies and anomaly bands.
- plot_anomalies_decomp(): Visualize the time series decomposition. Make adjustments as needed.
- plot_anomalies_cleaned(): Visualize the before/after of cleaning anomalies.
Note - anomalize(.method): Only .method = "stl" is supported at this time. The "twitter" method is also planned.
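As a minimal sketch of the workflow described above, the following pipes two grouped series through anomalize() and then plots the result. It assumes a timetk version that includes these functions and uses the walmart_sales_weekly dataset shipped with timetk; the choice of series ("1_1" and "1_3") and the plotting options are illustrative only.

```r
library(dplyr)
library(timetk)

# Two weekly sales series from the walmart_sales_weekly dataset in timetk
anomalized <- walmart_sales_weekly %>%
  filter(id %in% c("1_1", "1_3")) %>%
  group_by(id) %>%
  anomalize(
    .date_var = Date,
    .value    = Weekly_Sales,
    .method   = "stl"   # the only method supported at this time
  )

# Static (ggplot2) view of the anomalies and anomaly bands, one facet per series
anomalized %>%
  plot_anomalies(Date, .facet_ncol = 2, .interactive = FALSE)
```

The same anomalized object can be passed to plot_anomalies_decomp() or plot_anomalies_cleaned() to inspect the decomposition or the cleaned series.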
Update forecasting vignette: Use glmnet for time series forecasting.
CRAN Fixes:
- tzdata time zone fixes:
  - GB -> Europe/London
  - NZ -> Pacific/Auckland
  - US/Eastern -> America/New_York
  - US/Pacific -> America/Los_Angeles
- Add @aliases to timetk-package
- robets
- Remove tidyquant from examples
- Remove tidyverse from examples
- Add FANG dataset to timetk (port from tidyquant)
New Features
- plot_time_series(): Gets new arguments to specify .x_intercept and .x_intercept_color. #131
Fixes
- plot_time_series() when .group_names is not found. #121
- recipes >= 1.0.3 #132
- facet_trelliscope() plotting parameters.
  - plot_time_series()
  - plot_time_series_boxplot()
  - plot_anomaly_diagnostics()
New Features
Many of the plotting functions have been upgraded for use with trelliscopejs for easier visualization of many time series.
- plot_time_series():
  - trelliscope: Used for visualizing many time series.
  - .facet_strip_remove to remove facet strips since trelliscope is automatically labeled.
  - .facet_nrow to adjust grid with trelliscope.
  - .facet_collapse = TRUE was changed to FALSE for better compatibility with Trelliscope JS. This may cause some plots to have multiple groups take up extra space in the strip.
- plot_time_series_boxplot():
  - trelliscope: Used for visualizing many time series.
  - .facet_strip_remove to remove facet strips since trelliscope is automatically labeled.
  - .facet_nrow to adjust grid with trelliscope.
  - .facet_collapse = TRUE was changed to FALSE for better compatibility with Trelliscope JS. This may cause some plots to have multiple groups take up extra space in the strip.
- plot_anomaly_diagnostics():
  - trelliscope: Used for visualizing many time series.
  - .facet_strip_remove to remove facet strips since trelliscope is automatically labeled.
  - .facet_nrow to adjust grid with trelliscope.
  - .facet_collapse = TRUE was changed to FALSE for better compatibility with Trelliscope JS. This may cause some plots to have multiple groups take up extra space in the strip.
Updates & Bug Fixes
- Recipes steps (e.g. step_timeseries_signature()) use the new recipes::print_step() function. Requires recipes >= 0.2.0. #110
- Offset parameter in step_log_interval() was not working properly. Now works. #103
Potential Breaking Changes
- .facet_collapse = TRUE was changed to FALSE for better compatibility with Trelliscope JS. This may cause some plots to have multiple groups take up extra space in the strip.
New Features
- tk_tsfeatures(): A new function that makes it easy to generate a time series feature matrix using tsfeatures. The main benefit is that you can pipe time series data in tibbles with dplyr groups. The features will be produced by group. #95 #84
- plot_time_series_boxplot(): A new function that makes plotting time series boxplots simple using a .period argument for time series aggregation.
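A short sketch of plot_time_series_boxplot() with the .period argument mentioned above. It uses the m4_monthly dataset shipped with timetk; the "2 years" window and the facet layout are illustrative choices, not recommendations.

```r
library(dplyr)
library(timetk)

# One boxplot per 2-year window, faceted by series id
p <- m4_monthly %>%
  group_by(id) %>%
  plot_time_series_boxplot(
    .date_var    = date,
    .value       = value,
    .period      = "2 years",  # aggregation window for each box
    .facet_ncol  = 2,
    .interactive = FALSE       # return a static ggplot instead of plotly
  )

p
```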
New Vignettes
- Time Series Clustering: Uses the new tk_tsfeatures() function to perform time series clustering. #95 #84
- Time Series Visualization: Updated to include plot_time_series_boxplot() and plot_time_series_regression().
Improvements
Improvements for point forecasting when the target is n-periods into the future.
- time_series_cv(), time_series_split(): New parameter point_forecast. This is useful for testing / assessing the n-th prediction in the future. When set to TRUE, returns a single point, the last value in assess.
Fixes
- plot_time_series(): Smoother no longer fails when time series has 1 observation. #106
Improvements
- summarize_by_time(): Added a .week_start argument to allow specifying .week_start = 1 for Monday start. Default is 7 for Sunday start. This can also be changed with lubridate by setting the lubridate.week.start option.
- Plotting Functions: Several plotting functions gain a new .facet_dir argument for adjusting the direction of facet_wrap(dir). #94
- plot_acf_diagnostics(): Change default parameter to .show_white_noise_bars = TRUE. #85
- plot_time_series_regression(): Can now show_summary for group-wise models when visualizing groups.
- Time Series CV (time_series_cv()): Add label for tune_results.
- Improve speed of pad_by_time(). #93
Bug Fixes
- tk_make_timeseries() and tk_make_future_timeseries() are now able to handle end of months. #72
- tk_tbl.zoo(): Fix an issue when readr::type_convert() produces warning messages about not having character columns in inputs. #89
- plot_time_series_regression(): Fixed an issue when lags are added to .formula. Pads lags with NA.
- step_fourier() and fourier_vec(): Fixed issue with step_fourier() failing with one observation. Added scale_factor argument to override date sequences with the stored scale factor. #77
Improvements
- tk_augment_slidify(), tk_augment_lags(), tk_augment_leads(), tk_augment_differences(): Now work with multiple columns (passed via .value) and tidyselect (e.g. contains()).
Fixes
#> New names:
#> * NA -> ...1
- lazyeval. #24
- select_() used with tk_xts_(). #52
New Functions
- filter_period() (#64): Applies filtering expressions within time-based periods (windows).
- slice_period() (#64): Applies slices within time-based periods (windows).
- condense_period() (#64): Converts a periodicity from a higher (e.g. daily) to lower (e.g. monthly) frequency. Similar to xts::to.period() and tibbletime::as_period().
- tk_augment_leads() and lead_vec() (#65): Added to make it easier / more obvious how to create leads.
Fixes
- time_series_cv(): Fix bug with panel data. Train/test splits were only returning the 1st observation in the final time stamp. Should return all observations.
- future_frame() and tk_make_future_timeseries(): Now sort the incoming index to ensure dates returned go into the future.
- tk_augment_lags() and tk_augment_slidify(): Now overwrite column names to match the behavior of tk_augment_fourier() and tk_augment_differences().
Improvements
- time_series_cv(): Now works with time series groups. This is great for working with panel data.
- future_frame(): Gets a new argument called .bind_data. When set to TRUE, it performs a data binding operation with the incoming data and the future frame.
Miscellaneous
- step_slidify_augment() - A variant of step slidify that adds multiple rolling columns inside of a recipe.
Bug Fixes
- %+time% and %-time% return missing values
- tk_make_timeseries() and tk_make_future_timeseries() providing odd results for regular time series. GitHub Issue 60
New Functionality
- tk_time_series_cv_plan() - Now works with k-fold cross validation objects from the vfold_cv() function.
- pad_by_time() - Added new argument .fill_na_direction to specify a tidyr::fill() strategy for filling missing data.
Bug Fixes
- tk_augment_lags() - Fix bug with grouped functions not being exported
- ts class
New Functions
- step_log_interval_vec() - Extends the log_interval_vec() for recipes preprocessing.
Parallel Processing
- tune and recipes
Bug Fixes
- log_interval_vec() - Correct the messaging
- complement.ts_cv_split - Helper to show time series cross validation splits in list explorer.
New Functions
- mutate_by_time(): For applying mutates by time windows
- log_interval_vec() & log_interval_inv_vec(): For constrained interval forecasting.
Improvements
- plot_acf_diagnostics(): A new argument, .show_white_noise_bars, for adding white noise bars to an ACF / PACF plot.
- pad_by_time(): New arguments .start_date and .end_date for expanding/contracting the padding windows.
New Functions
- plot_time_series_regression(): Convenience function to visualize & explore features using Linear Regression (stats::lm() formula).
- time_series_split(): A convenient way to return a single split from time_series_cv(). Returns the split in the same format as rsample::initial_time_split().
Improvements
- summarise_by_time(), filter_by_time(), tk_summary_diagnostics
- tk_time_series_cv_plan(): Allow a single resample from rsample::initial_time_split or timetk::time_series_split
- modeltime and tidymodels.
Plotting Improvements
- plot_time_series(): .legend_show to toggle on/off legends.
Breaking Changes
- ... with .facet_vars or .ccf_vars. This change is needed to improve tab-completion. It affects:
  - plot_time_series()
  - plot_acf_diagnostics()
  - plot_anomaly_diagnostics()
  - plot_seasonal_diagnostics()
  - plot_stl_diagnostics()
Bug Fixes
- fourier_vec() and step_fourier_vec(): Add error if observations have zero difference. Issue #40.
New Interactive Plotting Functions
- plot_anomaly_diagnostics(): Visualize Anomalies for One or More Time Series
New Data Wrangling Functions
- future_frame(): Make a future tibble from an existing time-based tibble.
New Diagnostic / Data Processing Functions
- tk_anomaly_diagnostics() - Group-wise anomaly detection and diagnostics. A wrapper for the anomalize R package functions without importing anomalize.
New Vectorized Functions:
- ts_clean_vec() - Replace Outliers & Missing Values in a Time Series
- standardize_vec() - Centers and scales a time series to mean 0, standard deviation 1
- normalize_vec() - Normalizes a time series to Range: (0, 1)
New Recipes Preprocessing Steps:
- step_ts_pad() - Preprocessing for padding time series data. Adds rows to fill in gaps and can be used with step_ts_impute() to interpolate going from low to high frequency!
- step_ts_clean() - Preprocessing step for cleaning outliers and imputing missing values in a time series.
New Parsing Functions
- parse_date2() and parse_datetime2(): These are similar to readr::parse_date() and lubridate::as_date() in that they parse character vectors to date and datetimes. The key advantage is SPEED. parse_date2() uses the anytime package to process using the C++ Boost.Date_Time library.
Improvements:
- plot_acf_diagnostics(): The .lags argument now handles time-based phrases (e.g. .lags = "1 month").
- time_series_cv(): Implements time-based phrases (e.g. initial = "5 years" and assess = "1 year")
- tk_make_future_timeseries(): The n_future argument has been deprecated for a new length_out argument that accepts both numeric input (e.g. length_out = 12) and time-based phrases (e.g. length_out = "12 months"). A major improvement is that numeric values define the number of timestamps returned even if weekends are removed or holidays are removed. Thus, you can always anticipate the length. (Issue #19).
- diff_vec: Now reports the initial values used in the differencing calculation.
Bug Fixes:
- plot_time_series(): .value = .value.
- tk_make_future_timeseries():
- time_series_cv():
  - skip = 1 default. skip = 0 does not make sense.
  - skip adding 1 to stops.
- plot_time_series_cv_plan() & tk_time_series_cv_plan():
- tk_make_future_timeseries(): period() returns NA. Fix implemented with ceiling_date().
- pad_by_time(): pad_value so only inserts pad values where new row was inserted.
- step_ts_clean(), step_ts_impute(): lambda = NULL
Breaking Changes:
These should not be of major impact since the 1.0.0 version was just released.
- impute_ts_vec() to ts_impute_vec() for consistency with ts_clean_vec()
- step_impute_ts() to step_ts_impute() for consistency with underlying function
- roll_apply_vec() to slidify_vec() for consistency with slidify() & relationship to slider R package
- step_roll_apply to step_slidify() for consistency with slidify() & relationship to slider R package
- tk_augment_roll_apply to tk_augment_slidify() for consistency with slidify() & relationship to slider R package
- plot_time_series_cv_plan() and tk_time_series_cv_plan(): Changed argument from .rset to .data.
New Interactive Plotting Functions:
- plot_time_series() - A workhorse time-series plotting function that generates interactive plotly plots, consolidates 20+ lines of ggplot2 code, and scales well to many time series using dplyr groups.
- plot_acf_diagnostics() - Visualize the ACF, PACF, and any number of CCFs in one plot for Multiple Time Series. Interactive plotly by default.
- plot_seasonal_diagnostics() - Visualize Multiple Seasonality Features for One or More Time Series. Interactive plotly by default.
- plot_stl_diagnostics() - Visualize STL Decomposition Features for One or More Time Series.
- plot_time_series_cv_plan() - Visualize the Time Series Cross Validation plan made with time_series_cv().
New Time Series Data Wrangling:
- summarise_by_time() - A time-based variant of dplyr::summarise() for flexible summarization using common time-based criteria.
- filter_by_time() - A time-based variant of dplyr::filter() for flexible filtering by time-ranges.
- pad_by_time() - Insert time series rows with regularly spaced timestamps.
- slidify() - Make any function a rolling / sliding function.
- between_time() - A time-based variant of dplyr::between() for flexible time-range detection.
- add_time() - Addition for a time series index. Shifts an index by a period.
New Recipe Functions:
Feature Generators:
- step_holiday_signature() - New recipe step for adding 130 holiday features based on individual holidays, locales, and stock exchanges / business holidays.
- step_fourier() - New recipe step for adding fourier transforms for adding seasonal features to time series data
- step_roll_apply() - New recipe step for adding rolling summary functions. Similar to recipes::step_window() but is more flexible by enabling application of any summary function.
- step_smooth() - New recipe step for adding Local Polynomial Regression (LOESS) for smoothing noisy time series
- step_diff() - New recipe for adding multiple differenced columns. Similar to recipes::step_lag().
- step_box_cox() - New recipe for transforming predictors. Similar to step_BoxCox() with improvements for forecasting including "guerrero" method for lambda selection and handling of negative data.
- step_impute_ts() - New recipe for imputing a time series.
New Rsample Functions
- time_series_cv() - Create rsample cross validation sets for time series. This function produces a sampling plan starting with the most recent time series observations, rolling backwards.
New Vector Functions:
These functions are useful on their own inside of mutate() and power many of the new plotting and recipes functions.
- roll_apply_vec() - Vectorized rolling apply function - wraps slider::slide_vec()
- smooth_vec() - Vectorized smoothing function - Applies Local Polynomial Regression (LOESS)
- diff_vec() and diff_inv_vec() - Vectorized differencing function. Pads NA's by default (unlike stats::diff).
- lag_vec() - Vectorized lag functions. Returns both lags and leads (negative lags) by adjusting the .lag argument.
- box_cox_vec(), box_cox_inv_vec(), & auto_lambda() - Vectorized Box Cox transformation. Leverages forecast::BoxCox.lambda() for automatic lambda selection.
- fourier_vec() - Vectorized Fourier Series calculation.
- impute_ts_vec() - Vectorized imputation of missing values for time series. Leverages forecast::na.interp().
New Augment Functions:
All of the functions are designed for scale. They respect dplyr::group_by().
- tk_augment_holiday_signature() - Add holiday features to a data.frame using only a time-series index.
- tk_augment_roll_apply() - Add multiple columns of rolling window calculations to a data.frame.
- tk_augment_differences() - Add multiple columns of differences to a data.frame.
- tk_augment_lags() - Add multiple columns of lags to a data.frame.
- tk_augment_fourier() - Add multiple columns of fourier series to a data.frame.
New Make Functions:
Make date and date-time sequences between start and end dates.
- tk_make_timeseries() - Super flexible function for creating daily and sub-daily time series.
- tk_make_weekday_sequence() - Weekday sequence that accounts for both stripping weekends and holidays
- tk_make_holiday_sequence() - Makes a sequence of dates corresponding to business holidays in calendars from timeDate (common non-working days)
- tk_make_weekend_sequence() - Weekend sequence of dates for Saturday and Sunday (common non-working days)
New Get Functions:
- tk_get_holiday_signature() - Get 100+ holiday features using only a time-series index.
- tk_get_frequency() and tk_get_trend() - Automatic frequency and trend calculation from a time series index.
New Diagnostic / Data Processing Functions
- tk_summary_diagnostics() - Group-wise time series summary.
- tk_acf_diagnostics() - The data preparation function for plot_acf_diagnostics()
- tk_seasonal_diagnostics() - The data preparation function for plot_seasonal_diagnostics()
- tk_stl_diagnostics() - Group-wise STL Decomposition (Season, Trend, Remainder). Data prep for plot_stl_diagnostics().
- tk_time_series_cv_plan - The data preparation function for plot_time_series_cv_plan()
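As a brief sketch of the group-wise diagnostics listed above, tk_summary_diagnostics() can be applied to a grouped tibble to get one summary row per series. The example uses the m4_daily dataset shipped with timetk; the variable name diagnostics is only for illustration.

```r
library(dplyr)
library(timetk)

# One summary row per series: observation count, date range, and
# regularity of the timestamps
diagnostics <- m4_daily %>%
  group_by(id) %>%
  tk_summary_diagnostics(.date_var = date)

diagnostics
```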
New Datasets
Improvements:
- tk_make_future_timeseries() - Now accepts n_future as a time-based phrase like "12 seconds" or "1 year".
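A minimal sketch of extending an index into the future with either a numeric horizon or a time-based phrase. Because a later entry in this changelog deprecates n_future in favor of length_out, the example uses length_out and assumes a timetk version that supports it. The monthly index is made up for illustration.

```r
library(timetk)

# Hypothetical monthly index covering 2021
idx <- tk_make_timeseries("2021-01-01", by = "month", length_out = 12)

# Numeric horizon: the number of future timestamps to return
future_n <- tk_make_future_timeseries(idx, length_out = 6)

# Time-based phrase: the horizon expressed as a duration
future_phrase <- tk_make_future_timeseries(idx, length_out = "6 months")

future_n
future_phrase
```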
Bug Fixes:
- lubridate::tz<- which now returns POSIXct when used on Date objects. Fixed in PR32 by @vspinu.
Potential Breaking Changes:
- tk_augment_timeseries_signature() - Changed from data to .data to prevent name collisions when piping.
New Features:
- recipes Integration - Ability to apply time series feature engineering in the tidymodels machine learning workflow.
  - step_timeseries_signature() - New step_timeseries_signature() for adding date and date-time features.
Bug Fixes:
- xts::indexTZ is deprecated. Use tzone instead.
- arrange_ with arrange.
- tidyquant 1.0.0 upgrade (single stocks now return an extra symbol column).
- tidyquant v0.5.7 - Removed dependency on tidyverse
- timeSeries to Suggests to satisfy a CRAN issue.
- timetk. Was formerly timekit.
- robets