ccm_over_library_sizes: Performs CCM over multiple library sizes. This function only...

Description

Performs CCM over multiple library sizes. This function only exists to allow parallelisation over library sizes at the lowermost level.

Usage

ccm_over_library_sizes(lag, data, E = 2, tau = 1, library.sizes = 100,
  low.libsize = min(E * tau + max(2, abs(lag)), 10, 20, na.rm = T),
  n.libsizes.to.check = 30, high.libsize = max(library.sizes), lib = c(1,
  dim(data)[1]), pred = lib, samples.original = 100,
  samples.surrogates = 0, n.surrogates = 0, surrogate.method = "AAFT",
  time.unit = "bins", time.bin.size = 1, num.neighbours = E + 1,
  random.libs = TRUE, with.replacement = TRUE, exclusion.radius = E,
  epsilon = NULL, RNGseed = 1111, parallel = F, time.run = F,
  print.to.console = F, time.series.length.threshold = 100,
  library.column = 1, target.column = 2, surrogate.column = target.column,
  silent = T)

Arguments

lag

The lag (called prediction horizon in rEDM::ccm) for which to compute CCM.

data

A data frame containing two columns - one for the presumed driver and one for the response.
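As an illustration of the expected input, the sketch below builds a two-column data frame from a pair of coupled logistic maps. The helper name make_coupled_maps, the column names, the map parameters and the coupling strength are all assumptions made for this example; none of them come from the package. The column order follows the defaults in Usage ('library.column = 1' for the presumed response, 'target.column = 2' for the presumed driver).

```r
# Hypothetical example data: two coupled logistic maps in which x drives y.
make_coupled_maps <- function(n = 200, beta_xy = 0.1) {
  x <- numeric(n)
  y <- numeric(n)
  x[1] <- 0.4
  y[1] <- 0.2
  for (t in 1:(n - 1)) {
    x[t + 1] <- x[t] * (3.8 - 3.8 * x[t])                  # autonomous driver
    y[t + 1] <- y[t] * (3.5 - 3.5 * y[t] - beta_xy * x[t]) # forced response
  }
  # Column 1 = presumed response, column 2 = presumed driver.
  data.frame(response = y, driver = x)
}

example_data <- make_coupled_maps()
```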

E

The embedding dimension. Defaults to 2.

tau

The embedding lag. Defaults to 1. For sparsely sampled time series (for example geological time series), it is wise to keep this value at 1. For densely sampled time series, this should be set to the first minimum of the autocorrelation function of the presumed driver.
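One way to locate the first minimum of the autocorrelation function is sketched below using stats::acf from base R. The helper name first_acf_minimum and the 'max.lag' cut-off are assumptions for this example and are not part of the package.

```r
# Return the smallest lag at which the sample ACF has a local minimum,
# or NA if no local minimum is found up to max.lag.
first_acf_minimum <- function(x, max.lag = 50) {
  # rho[j] is the autocorrelation at lag j - 1 (rho[1] is lag 0).
  rho <- as.numeric(stats::acf(x, lag.max = max.lag, plot = FALSE)$acf)
  # A sign change of the slope from negative to positive marks a local minimum.
  turning <- which(diff(sign(diff(rho))) > 0)
  if (length(turning) == 0) return(NA_integer_)
  # turning[1] indexes rho[turning[1] + 1], which is lag turning[1].
  as.integer(turning[1])
}

# Usage: a candidate value for 'tau' from the presumed driver series.
driver <- sin(2 * pi * (1:400) / 20)
tau.candidate <- first_acf_minimum(driver)
```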

library.sizes

Either a single maximum library size (cross mapping is performed for a range of values from the smallest possible library size to the provided library size) or a user-specified range of library sizes. If user-provided, make sure to provide at least 20 different library sizes to ensure robust convergence assessment.

low.libsize

If one library size is specified, cross map for library sizes ranging from 'low.libsize' to 'high.libsize'.
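The default for 'low.libsize' in Usage can be evaluated by hand. The sketch below wraps that same expression in a hypothetical helper (default_low_libsize is not a package function) to show what it yields for a couple of inputs. Because 10 is one of the arguments to min(), the default never exceeds 10.

```r
# Default expression for 'low.libsize', copied from the Usage signature.
default_low_libsize <- function(E = 2, tau = 1, lag = 0) {
  min(E * tau + max(2, abs(lag)), 10, 20, na.rm = TRUE)
}

default_low_libsize()                         # E = 2, tau = 1, lag = 0
default_low_libsize(E = 3, tau = 2, lag = -5) # larger embedding and lag
```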

n.libsizes.to.check

Minimum number of library sizes for the convergence test.

high.libsize

If one library size is specified, cross map for library sizes ranging from 'low.libsize' to 'high.libsize'.
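To make the relationship between 'low.libsize', 'high.libsize' and 'n.libsizes.to.check' concrete, here is a minimal sketch that spreads library sizes evenly between the two bounds. Even spacing and the helper name library_size_range are assumptions; the package may construct the range differently.

```r
# Evenly spaced integer library sizes between the two bounds (assumed scheme).
library_size_range <- function(low.libsize, high.libsize,
                               n.libsizes.to.check = 30) {
  sizes <- seq(low.libsize, high.libsize, length.out = n.libsizes.to.check)
  # Round to whole observations and drop duplicates created by rounding.
  unique(round(sizes))
}

library.sizes <- library_size_range(low.libsize = 4, high.libsize = 100)
```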

lib

Indices of the original library time series to use as the library (training) set.

pred

Indices of the original target time series to use as the prediction set. If this overlaps with the training set, make sure to use leave-K-out cross validation by setting the 'exclusion.radius' parameter to a minimum of E + 1.

samples.original

The number of random libraries to draw when calculating the cross map skill.

samples.surrogates

The number of surrogate time series in the null ensemble.

n.surrogates

Should a surrogate test also be performed? If so, 'n.surrogates' sets the number of surrogate time series to use. By default, no surrogate test is performed (n.surrogates = 0).

surrogate.method

Which method should be used to generate surrogate time series? Defaults to "AAFT". For more options, see the description of the 'surrogate_ensemble' function in this package.
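For readers unfamiliar with AAFT (amplitude-adjusted Fourier transform) surrogates, the following base R sketch shows the general technique: rescale the series to Gaussian values, randomise the Fourier phases, then map the original values back by rank. It is an illustration only; the function names are hypothetical and this is not the implementation used by 'surrogate_ensemble'.

```r
# Randomise Fourier phases while keeping the amplitude spectrum. Conjugate
# symmetry is preserved so that the inverse transform is real-valued.
phase_randomise <- function(x) {
  n <- length(x)
  stopifnot(n >= 3)
  ft <- fft(x)
  half <- floor((n - 1) / 2)   # positive frequencies, excluding Nyquist
  pos <- 2:(half + 1)
  neg <- n:(n - half + 1)      # matching negative frequencies
  ft[pos] <- Mod(ft[pos]) * exp(1i * runif(half, 0, 2 * pi))
  ft[neg] <- Conj(ft[pos])
  Re(fft(ft, inverse = TRUE)) / n
}

aaft_surrogate <- function(x) {
  n <- length(x)
  # 1. Gaussian series with the same rank ordering as x.
  gauss <- sort(rnorm(n))[rank(x, ties.method = "first")]
  # 2. Phase-randomise the Gaussian series.
  shuffled <- phase_randomise(gauss)
  # 3. Reorder the original values to follow the ranks of the shuffled series.
  sort(x)[rank(shuffled, ties.method = "first")]
}

set.seed(1111)
x <- cumsum(rnorm(128))
surrogate <- aaft_surrogate(x)
```

By construction the surrogate contains exactly the same values as the original series, only in a different order.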

time.unit

The time unit of the raw time series.

time.bin.size

The temporal resolution of the raw time series (given in the units indicated by 'time.unit').

num.neighbours

The number of nearest neighbours to use in predictions. Defaults to E + 1.

random.libs

Whether or not to sample random library (training) sets. Defaults to TRUE.

with.replacement

Should samples be drawn with replacement? Defaults to TRUE.
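The interplay of 'samples.original', 'random.libs' and 'with.replacement' can be pictured as repeatedly drawing index sets from the library range. The sketch below uses base R sample(); the helper draw_random_library and the chosen sizes are assumptions for illustration, not the package's internals.

```r
# Draw one random library of time indices from the range given by 'lib'.
draw_random_library <- function(lib, library.size, with.replacement = TRUE) {
  candidates <- seq(lib[1], lib[2])
  if (!with.replacement && library.size > length(candidates)) {
    stop("library.size exceeds the number of available indices")
  }
  sample(candidates, size = library.size, replace = with.replacement)
}

set.seed(1111)  # same value as the default 'RNGseed' in Usage
# 100 random libraries of size 50, mirroring samples.original = 100.
libs <- replicate(100, draw_random_library(c(1, 200), 50), simplify = FALSE)
```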

exclusion.radius

The number of temporal neighbours to exclude for the leave-K-out cross validation. Defaults to E.

epsilon

Exclude neighbours if they are within a distance of 'epsilon' from the predictee.
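How 'num.neighbours', 'exclusion.radius' and 'epsilon' might interact during neighbour search is sketched below on a delay embedding. This is one plausible reading of the argument descriptions, written in base R; delay_embed and find_neighbours are hypothetical helpers, and the actual search is performed by the underlying CCM routine.

```r
# Delay embedding: row i is (x[t], x[t - tau], ..., x[t - (E - 1) * tau])
# with t = i + (E - 1) * tau.
delay_embed <- function(x, E, tau) {
  n <- length(x) - (E - 1) * tau
  stopifnot(n > 0)
  do.call(cbind, lapply(seq_len(E), function(j) x[(E - j) * tau + seq_len(n)]))
}

# Indices of the nearest neighbours of row i, skipping temporal neighbours
# and (optionally) points closer than epsilon in state space.
find_neighbours <- function(M, i, num.neighbours,
                            exclusion.radius = 0, epsilon = NULL) {
  d <- sqrt(rowSums(sweep(M, 2, M[i, ])^2))
  # Keep only points more than exclusion.radius time steps away from i.
  candidates <- which(abs(seq_len(nrow(M)) - i) > exclusion.radius)
  if (!is.null(epsilon)) {
    candidates <- candidates[d[candidates] > epsilon]
  }
  nearest <- candidates[order(d[candidates])]
  nearest[seq_len(min(num.neighbours, length(nearest)))]
}

x <- sin(seq(0, 20, by = 0.1))
M <- delay_embed(x, E = 2, tau = 1)
nb <- find_neighbours(M, i = 50, num.neighbours = 3, exclusion.radius = 2)
```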

RNGseed

A random number seed. For reproducibility.

parallel

Activate parallelisation? Defaults to FALSE. Currently, this only works decently on Mac and Linux systems.
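The platform caveat is consistent with fork-based parallelism, which is unavailable on Windows. The sketch below shows that pattern with parallel::mclapply from base R, falling back to lapply when forking cannot be used. Whether the package uses mclapply internally is an assumption; run_over_library_sizes, 'fun' and 'n.cores' are hypothetical names for this example.

```r
library(parallel)

# Apply 'fun' to every library size, optionally in parallel via forking.
run_over_library_sizes <- function(library.sizes, fun,
                                   parallel = FALSE, n.cores = 2) {
  if (parallel && .Platform$OS.type != "windows") {
    # mclapply forks the R process; mc.cores > 1 is not supported on Windows.
    mclapply(library.sizes, fun, mc.cores = n.cores)
  } else {
    lapply(library.sizes, fun)
  }
}

# Usage with a stand-in for the per-library-size computation.
results <- run_over_library_sizes(c(10, 20, 30), function(L) L^2)
```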

time.run

Time the run?

print.to.console

Display progress?

time.series.length.threshold

Display a warning if the time series length drops below this threshold.

library.column

Integer indicating which column to use as the library column (presumed response).

target.column

Integer indicating which column to use as the target column (presumed driver). Defaults to 2, the opposite of the default 'library.column'.

surrogate.column

Which column to use to generate surrogates. Defaults to the value of 'target.column' (the presumed driver).

silent

Suppress warnings?


kahaaga/tstools documentation built on May 24, 2019, 5:01 a.m.