ts_dist_part: Calculate distances between pairs of time series in part of a...

View source: R/ts_dist.R

ts_dist_part    R Documentation

Calculate distances between pairs of time series in part of a list.

Description

This function is particularly useful to run in parallel as jobs in a cluster (HPC). It returns a data frame with elements (i,j) and a distance value calculated for the time series i and j. Only a part of the total combinations of time series in the list is calculated, not all of them. This function loads all the time series into memory to make the calculations faster. However, if the time series are too long and/or the dataset is huge, memory may become a problem. In that case, dist_dir_parallel() is recommended instead.

Usage

ts_dist_part(
  ts_list,
  num_part,
  num_total_parts,
  combinations,
  dist_func = tsdist_cor,
  isSymetric = TRUE,
  error_value = NaN,
  warn_error = TRUE,
  simplify = TRUE,
  num_cores = 1,
  ...
)

Arguments

ts_list

List of time series.

num_part

Positive number between 1 and the total number of parts (num_total_parts). It identifies which part (chunk) of the total is to be calculated.

num_total_parts

Positive number corresponding to the total number of parts.

combinations

A list composed of arrays of size 2 indicating the indices of the time series to be compared. If this parameter is passed, the function does not split all the possible combinations itself and does not use the parameters num_part and num_total_parts. This parameter is useful when the number of combinations is very high and this function is called several times (high num_total_parts). In this case, instead of calculating all the combinations in each call, the user can calculate them once and pass them via this parameter.
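As a rough illustration of what such a precomputed list might look like, the base R sketch below builds all index pairs once with combn() and divides them into chunks. The sizes and the chunking scheme are assumptions made for the example; how ts_dist_part splits combinations internally is not described here and may differ.

```r
# Hypothetical sizes, chosen only for illustration.
n_series <- 5
num_total_parts <- 3

# All unordered index pairs (i, j) with i < j, as a list of length-2 vectors.
combinations <- combn(n_series, 2, simplify = FALSE)

# One possible way to assign pairs to parts: contiguous chunks of similar size.
# This is an assumed scheme, not necessarily the one used by the package.
part_id <- cut(seq_along(combinations), breaks = num_total_parts, labels = FALSE)
parts <- split(combinations, part_id)

# Pairs that a job handling part 2 would receive under this scheme.
parts[[2]]
```

Each element of parts could then be handed to a separate job through the combinations argument, so the full list is computed only once.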

dist_func

Function to be applied to all combinations of time series. This function should have at least two parameters, one for each time series. Ex: function(ts1, ts2) cor(ts1, ts2)
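A custom distance function only needs to accept the two series as its first two arguments. The sketch below defines a Euclidean distance under a hypothetical name (dist_euclid); it is not part of the package, and it assumes both series have the same length.

```r
# Hypothetical user-defined distance: Euclidean distance between two
# equal-length numeric vectors. The `...` absorbs any extra arguments.
dist_euclid <- function(ts1, ts2, ...) {
  sqrt(sum((ts1 - ts2)^2))
}

dist_euclid(c(0, 0), c(3, 4))
```

A function of this shape could be supplied as dist_func in place of the default.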

isSymetric

Boolean. Whether the distance function is symmetric.

error_value

The value returned if an error occurs when calculating the distance for a pair of time series.

warn_error

Boolean. If TRUE (default), a warning is raised when an error occurs during the calculations.
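The behaviour described for error_value and warn_error can be pictured with a small base R wrapper around tryCatch(). This is only a sketch of the idea under a hypothetical name (safe_dist); the package's actual implementation is not shown here and may differ.

```r
# Hypothetical helper: call dist_func on a pair of series and fall back to
# error_value if the call fails, optionally emitting a warning.
safe_dist <- function(ts1, ts2, dist_func, error_value = NaN,
                      warn_error = TRUE, ...) {
  tryCatch(
    dist_func(ts1, ts2, ...),
    error = function(e) {
      if (warn_error) {
        warning("Distance calculation failed: ", conditionMessage(e))
      }
      error_value
    }
  )
}

# A distance function that always fails, used to exercise the fallback.
failing <- function(ts1, ts2) stop("cannot compare")
safe_dist(1:3, 1:3, failing, warn_error = FALSE)
```

With this pattern a single failing pair does not abort the whole job; the affected entry simply carries error_value.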

simplify

Boolean. If FALSE, returns a list of one (if isSymetric == FALSE) or two elements (if isSymetric == TRUE).

num_cores

Numeric. Number of cores to use.

...

Additional parameters passed to the distance function (dist_func).

Value

A data frame with elements (i,j) and a distance value calculated for the time series i and j.
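To give a feel for the shape of such a result, the base R sketch below computes a correlation-based distance for every pair in a small simulated list and collects the values in a data frame. The column names (i, j, dist), the simulated data and the distance 1 - |cor| are assumptions made for illustration; the actual column names and default distance used by the package are not stated here.

```r
set.seed(1)

# Hypothetical input: four random series of length 20.
ts_list <- replicate(4, rnorm(20), simplify = FALSE)

# All index pairs (i, j) with i < j.
pairs <- combn(length(ts_list), 2, simplify = FALSE)

# Assumed distance for the example: 1 minus the absolute Pearson correlation.
pair_dist <- function(p) 1 - abs(cor(ts_list[[p[1]]], ts_list[[p[2]]]))

# One row per pair: the two indices and the corresponding distance.
result <- data.frame(
  i = vapply(pairs, `[`, integer(1), 1),
  j = vapply(pairs, `[`, integer(1), 2),
  dist = vapply(pairs, pair_dist, numeric(1))
)

result
```

Data frames of this form, one per part, can be combined with rbind() once all jobs have finished.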


ts2net documentation built on June 9, 2022, 9:06 a.m.