extract_features: Extract features from time series
In FlukeAndFeather/tsrf: Time Series Random Forest Modeling


View source: R/features.R

Calculate summary statistics of randomly sampled intervals in time series for classification in a time series random forest.

extract_features(
  x,
  tsid,
  intervals,
  funs = list(mean = mean, sd = sd, slope = slope)
)

extract_features_cpp(x, tsid, intervals)

extract_features_par(x, tsid, intervals, ncores)

`x`	time series (data.frame)
`tsid`	name of time series identifier column in `x` (character scalar)
`intervals`	start and end indices of intervals (2 column integer matrix; see `sample_intervals`)
`funs`	list of summary functions (`mean`, `sd`, and `slope` by default). Only used by `extract_features()`.
`ncores`	number of cores to use. Only used by `extract_features_par()`.

`extract_features()` is implemented in R and is therefore slow for larger datasets. `extract_features_cpp()` uses C++ to speed up feature extraction, but its summary statistics are fixed (mean, sd, slope). `extract_features_par()` applies `extract_features_cpp()` in parallel. `extract_features_par()` is faster than `extract_features_cpp()` when extracting features from very large datasets or when using many cores. For smaller datasets or fewer cores, `extract_features_cpp()` can be faster.

An M x N matrix, where M is the length of `intervals` and N is the length of `funs`.
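
To make the shape of such a result concrete, here is a minimal base R sketch of the idea for a single series. It is not the package's implementation: `interval_features_sketch()` and `slope_sketch()` are hypothetical names, the slope is assumed to be the least-squares slope of the values against their position in the interval, and the intervals are written by hand rather than drawn with `sample_intervals()`.

```r
# Hypothetical stand-in for the package's slope(): least-squares slope of the
# values against their position within the interval (an assumption).
slope_sketch <- function(v) {
  if (length(v) < 2) return(NA_real_)
  unname(coef(lm(v ~ seq_along(v)))[2])
}

# Hypothetical helper: apply every function in `funs` to every interval of a
# single numeric series `v`. `intervals` is a 2-column matrix (start, end).
interval_features_sketch <- function(v, intervals, funs) {
  feats <- vapply(
    seq_len(nrow(intervals)),
    function(i) {
      seg <- v[intervals[i, 1]:intervals[i, 2]]
      vapply(funs, function(f) f(seg), numeric(1))
    },
    numeric(length(funs))
  )
  # vapply() yields one column per interval; transpose to intervals x funs.
  t(matrix(feats, nrow = length(funs), dimnames = list(names(funs), NULL)))
}

v <- c(2, 4, 6, 8, 10, 9, 8, 7, 6, 5)
ints <- matrix(c(1L, 5L,
                 6L, 10L,
                 3L, 8L), ncol = 2, byrow = TRUE)
funs <- list(mean = mean, sd = sd, slope = slope_sketch)

feats <- interval_features_sketch(v, ints, funs)
dim(feats)  # one row per interval, one column per summary function
```

In this sketch each row describes one interval and each column one summary statistic, so adding a function to `funs` adds a column and adding an interval adds a row.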

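library(tsrf)

# Three time series of equal length, stacked in one data frame and keyed by `id`
ts_len <- 100
ts_dat <- data.frame(id = rep(1:3, each = ts_len),
                     val = 1:ts_len)
# Randomly sample intervals within a series of length `ts_len`
ints <- sample_intervals(ts_len)
# Summarize each interval with the default functions (mean, sd, slope)
extract_features(ts_dat, "id", ints)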
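
Because `funs` is a list of summary functions, `extract_features()` can in principle be given statistics other than the defaults. The following is a hedged sketch rather than a documented recipe: it assumes that the tsrf package is installed, that `sample_intervals()` is exported, and that any function reducing a numeric vector to a single number is acceptable in `funs`. The names `custom_funs` and `range_width` are made up for illustration.

```r
# Custom summary functions (illustrative names, not part of tsrf)
custom_funs <- list(
  median = median,
  range_width = function(v) diff(range(v))
)

# Only call into tsrf if it is available on this machine
if (requireNamespace("tsrf", quietly = TRUE)) {
  ts_len <- 100
  ts_dat <- data.frame(id = rep(1:3, each = ts_len),
                       val = 1:ts_len)
  ints <- tsrf::sample_intervals(ts_len)
  feats <- tsrf::extract_features(ts_dat, "id", ints, funs = custom_funs)
  str(feats)
}
```

Custom functions are only available through `extract_features()`; the C++ and parallel variants take no `funs` argument.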