extract_features: Extract features from time series

View source: R/features.R

Description

Calculate summary statistics over randomly sampled intervals of time series, for use as classification features in a time series random forest.

Usage

extract_features(
  x,
  tsid,
  intervals,
  funs = list(mean = mean, sd = sd, slope = slope)
)

extract_features_cpp(x, tsid, intervals)

extract_features_par(x, tsid, intervals, ncores)

Arguments

x

time series (data.frame)

tsid

name of time series identifier column in x (character scalar)

intervals

start and end indices of intervals (2 column integer matrix; see sample_intervals)

funs

list of summary functions (mean, sd, and slope by default). Only used by extract_features().

ncores

number of cores to use
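
As a rough illustration of the argument shapes described above, the sketch below builds each input by hand in base R. The column names start and end, the random values, and the choice to leave one core free are assumptions made for illustration only; in practice intervals would normally come from sample_intervals().

```r
# Long-format time series: one row per observation, plus an identifier column
x <- data.frame(id = rep(1:3, each = 100), val = rnorm(300))

# Name of the identifier column in x (character scalar)
tsid <- "id"

# 2-column integer matrix: one row per interval, start index then end index.
# The column names are an assumption; the documentation does not specify them.
intervals <- matrix(
  c(1L, 20L,
    35L, 60L,
    50L, 100L),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("start", "end"))
)

# One possible way to pick ncores: all detected cores but one, at least 1.
# detectCores() can return NA, so fall back to a single core in that case.
ncores <- parallel::detectCores()
if (is.na(ncores)) ncores <- 1L
ncores <- max(1L, ncores - 1L)
```

With the package loaded, these objects would be passed as extract_features_par(x, tsid, intervals, ncores).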

Details

extract_features() is implemented in R and is therefore slow on larger datasets. extract_features_cpp() uses C++ to speed up feature extraction, but its summary statistics are fixed to mean, sd, and slope. extract_features_par() applies extract_features_cpp() in parallel; it is faster than extract_features_cpp() when extracting features from very large datasets or when using many cores. For smaller datasets or fewer cores, extract_features_cpp() can be faster.
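
To make the per-interval computation concrete, here is a minimal base R sketch of the kind of work described above for a single series: each summary function is applied to each interval. This is not the package's implementation. slope_sketch() and summarize_intervals() are hypothetical helpers introduced here, and the least-squares slope against the observation index is an assumption about what slope computes.

```r
# Hypothetical helper: least-squares slope of values against their index.
# Assumption: this approximates the package's slope(); needs length(v) >= 2.
slope_sketch <- function(v) {
  idx <- seq_along(v)
  cov(idx, v) / var(idx)
}

# Hypothetical helper: apply every function in `funs` to every interval of `v`.
# `intervals` is a 2-column integer matrix of start and end indices.
summarize_intervals <- function(v, intervals, funs) {
  out <- vapply(
    seq_len(nrow(intervals)),
    function(i) {
      seg <- v[intervals[i, 1]:intervals[i, 2]]
      vapply(funs, function(f) f(seg), numeric(1))
    },
    numeric(length(funs))
  )
  # Values arrive grouped by interval; lay them out as intervals x funs
  matrix(out, ncol = length(funs), byrow = TRUE,
         dimnames = list(NULL, names(funs)))
}

v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
ints <- matrix(c(1L, 5L,
                 4L, 10L), ncol = 2, byrow = TRUE)
feats <- summarize_intervals(
  v, ints,
  list(mean = mean, sd = sd, slope = slope_sketch)
)
```

Here feats has one row per interval and one column per summary function, which mirrors the shape described under Value. Swapping other functions into the list is the flexibility that extract_features() offers and extract_features_cpp() does not.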

Value

An M x N matrix, where M is the number of intervals (rows of intervals) and N is the length of funs.

Examples

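# Three series of length 100 in long format; val (1:100) is recycled for each id
ts_len <- 100
ts_dat <- data.frame(id = rep(1:3, each = ts_len),
                     val = 1:ts_len)
# Randomly sample intervals, then compute the default summaries for each one
ints <- sample_intervals(ts_len)
extract_features(ts_dat, "id", ints)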

FlukeAndFeather/tsrf documentation built on Dec. 17, 2021, 8:29 p.m.