tk_tsfeatures: Time series feature matrix (Tidy)


tk_tsfeatures    R Documentation

Time series feature matrix (Tidy)


tk_tsfeatures() is a tidyverse-compliant wrapper for tsfeatures::tsfeatures(). The function computes a matrix of time series features that describes each time series. It is designed for groupwise analysis using dplyr groups.


tk_tsfeatures(
  .data,
  .date_var,
  .value,
  .period = "auto",
  .features = c("frequency", "stl_features", "entropy", "acf_features"),
  .scale = TRUE,
  .trim = FALSE,
  .trim_amount = 0.1,
  .parallel = FALSE,
  .na_action = na.pass,
  .prefix = "ts_",
  .silent = TRUE,
  ...
)



.data: A tibble or data.frame with a time-based column.


.date_var: A column containing either date or date-time values.


.value: A column containing numeric values.


.period: The periodicity (frequency) of the time series data. Values can be provided as follows:

  • "auto" (default): Calculates the frequency using tk_get_frequency().

  • "2 weeks": Calculates the median number of observations in a 2-week window.

  • 7 (numeric): Interprets the ts frequency as 7 observations per cycle (common for daily data with a weekly cycle).
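The .period options above can be compared side by side. The following is a minimal sketch, assuming the walmart_sales_weekly data set that ships with timetk; the object names (by_store, feat_auto, feat_window, feat_fixed) are illustrative only, and requesting just the "frequency" feature is a convenient way to inspect the frequency each option produces.

```r
library(dplyr)
library(timetk)

# Group once and reuse (by_store is an illustrative name)
by_store <- walmart_sales_weekly %>% group_by(id)

# 1. "auto": the frequency is chosen via tk_get_frequency()
feat_auto <- by_store %>%
    tk_tsfeatures(.date_var = Date, .value = Weekly_Sales,
                  .period = "auto", .features = "frequency")

# 2. Time-based window: median number of observations per 2 weeks
feat_window <- by_store %>%
    tk_tsfeatures(.date_var = Date, .value = Weekly_Sales,
                  .period = "2 weeks", .features = "frequency")

# 3. Numeric: 52 observations per cycle (weekly data, yearly cycle)
feat_fixed <- by_store %>%
    tk_tsfeatures(.date_var = Date, .value = Weekly_Sales,
                  .period = 52, .features = "frequency")
```

Each result has one row per group, so the three tables can be inspected or joined by id to see how the choice of .period changes the frequency used.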


.features: Passed to features in the underlying tsfeatures() function. A vector of function names, each naming a feature aggregation function. Examples:

  1. Use one of the function names from the tsfeatures R package (e.g. "lumpiness", "stl_features").

  2. Use a function name (e.g. "mean" or "median")

  3. Create your own function and provide the function name
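The third option, supplying your own function by name, might look like the sketch below. The function value_range() is hypothetical and written only for this illustration; it is assumed to be defined at the top level of the session so that it can be found by name. Scaling is switched off here so the result stays in the original units.

```r
library(dplyr)
library(timetk)

# Hypothetical custom feature: the spread between the largest and
# smallest observation. It takes a series and returns a single number.
value_range <- function(x) {
    diff(range(x, na.rm = TRUE))
}

custom_features <- walmart_sales_weekly %>%
    group_by(id) %>%
    tk_tsfeatures(
        .date_var = Date,
        .value    = Weekly_Sales,
        .period   = 52,
        .features = c("median", "value_range"),  # base function + custom function
        .scale    = FALSE                        # keep the original units
    )

custom_features
```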


.scale: If TRUE, time series are scaled to mean 0 and sd 1 before features are computed.
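As a rough illustration of what scaling to mean 0 and sd 1 means, the base R sketch below standardizes a toy vector by hand. The values are made up, and the routine used internally by tsfeatures may differ in detail.

```r
# Toy series (values are invented for illustration)
x <- c(10, 12, 9, 15, 14)

# Standardize: subtract the mean, divide by the standard deviation
z <- (x - mean(x)) / sd(x)

mean(z)  # approximately 0 (up to floating-point error)
sd(z)    # approximately 1 (up to floating-point error)
```

One consequence is that location features such as "mean" are only informative when scaling is turned off.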


.trim: If TRUE, time series are trimmed by .trim_amount before features are computed. Values larger than .trim_amount in absolute value are set to NA.


.trim_amount: Level of trimming applied when .trim = TRUE. Default: 0.1.


.parallel: If TRUE, multiple cores (or multiple sessions) will be used. This only speeds things up when there are a large number of time series.

When .parallel = TRUE, the multiprocess parameter defaults to future::multisession. This can be adjusted by setting the multiprocess parameter. See the tsfeatures::tsfeatures() function for more details.


.na_action: A function to handle missing values. Use na.interp to estimate missing values.


.prefix: A prefix added to the names of the feature columns. Default: "ts_".


.silent: Whether or not to show messages and warnings.


...: Other arguments passed to the feature functions.


The timetk::tk_tsfeatures() function uses the tsfeatures package to compute an aggregated feature matrix for time series. Such a matrix is useful in many types of analysis, such as clustering time series.
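To make the clustering use case concrete, here is a minimal sketch that feeds the feature matrix to stats::kmeans(). The choice of features, the number of clusters (2), and the object names are assumptions made for illustration, not recommendations; the code also assumes that none of the computed features is NA.

```r
library(dplyr)
library(timetk)

# One row of features per store-department series
feature_tbl <- walmart_sales_weekly %>%
    group_by(id) %>%
    tk_tsfeatures(
        .date_var = Date,
        .value    = Weekly_Sales,
        .period   = 52,
        .features = c("entropy", "acf_features")
    ) %>%
    ungroup()

# Keep only the feature columns (prefixed "ts_" by default)
feature_mat <- feature_tbl %>%
    select(starts_with("ts_")) %>%
    as.matrix()

set.seed(123)                                 # k-means starts are random
clusters <- kmeans(feature_mat, centers = 2)  # 2 clusters is arbitrary

# Attach the cluster assignment to each series
feature_tbl %>%
    mutate(cluster = clusters$cluster) %>%
    select(id, cluster)
```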

The timetk version ports the tsfeatures::tsfeatures() function to a tidyverse-compliant format that uses a tidy data frame containing grouping columns (optional), a date column, and a value column. Other columns are ignored.

It then becomes easy to summarize each time series by group-wise application of .features, which are simply functions that evaluate a time series and return a single aggregated value. (Example: "mean" would return the mean of the time series. Note that values are scaled to mean 0 and sd 1 first when .scale = TRUE.)

Function Internals:

Internally, the time series are converted to ts class using tk_ts(.period), where the period is the frequency of the time series. A value supplied for .period is applied before the conversion to ts class.

The function then leverages tsfeatures::tsfeatures() to compute the feature matrix of summarized feature values.
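The two steps described above can be approximated by hand for a single series. This is only a sketch of the idea, not the actual timetk implementation: base ts() stands in for the tk_ts() conversion, and the object names are illustrative.

```r
library(dplyr)
library(timetk)

# Pull the values of one series (the first id in the data)
one_series <- walmart_sales_weekly %>%
    filter(id == first(id)) %>%
    pull(Weekly_Sales)

# Step 1: convert to ts class with the chosen frequency
# (stats::ts() is used here as a stand-in for tk_ts())
x_ts <- ts(one_series, frequency = 52)

# Step 2: compute the feature matrix with tsfeatures directly
direct_features <- tsfeatures::tsfeatures(
    list(x_ts),
    features = c("frequency", "entropy", "acf_features"),
    scale    = TRUE
)

direct_features
```

The result is a one-row table without the "ts_" prefix, since the prefix is added by tk_tsfeatures().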


A tibble or data.frame with aggregated features that describe each time series.


  1. Rob Hyndman, Yanfei Kang, Pablo Montero-Manso, Thiyanga Talagala, Earo Wang, Yangzhuoran Yang, Mitchell O'Hara-Wild: tsfeatures R package



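library(dplyr)
library(timetk)

# Compute a feature matrix with one row per store-department series
walmart_sales_weekly %>%
    group_by(id) %>%
    tk_tsfeatures(
      .date_var = Date,
      .value    = Weekly_Sales,
      .period   = 52,
      .features = c("frequency", "stl_features", "entropy", "acf_features", "mean"),
      .scale    = TRUE,
      .prefix   = "ts_"
    )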

timetk documentation built on Nov. 2, 2023, 6:18 p.m.