# Surrogate data generation

### Description

This function generates surrogate time series via several frequency-domain bootstrapping techniques. In the statistics community, bootstrapping is used to assess the sampling variability of a statistic; the nonlinear dynamics community typically uses it to detect nonlinear structure in stationary time series. Given a time series, this function generates surrogate series via Theiler's Amplitude Adjusted Fourier Transform (AAFT), Theiler's phase randomization, Davies and Harte's Circulant Embedding (CE) technique, or Davison and Hinkley's (DH) phase and amplitude randomization technique.

Theiler's techniques produce so-called *constrained realizations*
since some statistical aspect of the original data is preserved
(the histogram for the AAFT and the periodogram for the phase randomization).
The other techniques, circulant embedding and Davison-Hinkley,
are non-constrained as both the amplitudes and phases
of the original series are randomized.
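
The two constraints mentioned above can be checked numerically. The following is a minimal sketch in base R, assuming the raw periodogram is taken as the squared DFT modulus divided by the series length; the helper names `same_histogram` and `same_periodogram` are hypothetical and are not part of any package.

```r
# Hypothetical helpers (illustration only, not package functions) for
# checking which aspect of the original series a surrogate preserves.

# TRUE if the surrogate holds exactly the same values as x (same histogram).
same_histogram <- function(x, s) {
  isTRUE(all.equal(sort(x), sort(s)))
}

# TRUE if the raw periodograms |DFT|^2 / N agree to numerical precision.
same_periodogram <- function(x, s) {
  px <- Mod(fft(x))^2 / length(x)
  ps <- Mod(fft(s))^2 / length(s)
  isTRUE(all.equal(px, ps))
}

set.seed(1)
x <- cumsum(rnorm(128))      # toy series (a random walk)
shuffled <- sample(x)        # random permutation of the values
reversed <- rev(x)           # time reversal keeps the DFT moduli

same_histogram(x, shuffled)    # a permutation keeps the histogram
same_periodogram(x, reversed)  # time reversal keeps the periodogram
```

A surrogate for which neither check holds is non-constrained in the sense used above.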

### Usage

1 |

### Arguments

`x` |
a vector containing a uniformly-sampled real-valued time series. |

`method` |
a character string representing the method to be used to generate surrogate data. Choices are: `"aaft"` (Theiler's Amplitude Adjusted Fourier Transform), `"phase"` (Theiler's phase randomization), `"ce"` (Davies and Harte's Circulant Embedding), and `"dh"` (Davison and Hinkley's phase and amplitude randomization).
Default: |

`sdf` |
an object of class |

`seed` |
a positive integer representing the initial seed value to use
for the random number generator. If |

### Details

The algorithms are detailed as follows:

- phase
The discrete Fourier transform (DFT) of the time series is calculated and the phase at each frequency is randomized to be uniformly distributed on *[0, 2π]*. Phase symmetry is preserved so that an inverse DFT forms a purely real surrogate. Null hypothesis: the original data come from a linear Gaussian process. Side effect: the periodograms of the surrogate and original time series are the same.

- aaft
An *N*-point normally distributed realization of a white noise process is created, where *N* is the length of `x`, and sorted to have the same rank as `x` (e.g., if *rank(x[t]) = 5*, then *x[t]* is the fifth smallest element of `x`). The result is then phase randomized and its rank (*r*) is calculated. The surrogate is then created by rank ordering `x` using *r*. Null hypothesis: the observed time series is a monotonic nonlinear transformation of a Gaussian process. Side effect: the amplitude distributions (histograms) of the surrogate and original time series are the same.

- ce
The circulant embedding technique is based upon generating surrogates whose estimated SDF (e.g., a periodogram) is not constrained to be the same as that of the original series (see the references for details).

- dh
The Davison-Hinkley technique is based upon generating surrogates by randomizing both the phases and the amplitudes in the frequency domain followed by an inversion back to the time domain.
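
The `phase` and `aaft` steps described above can be sketched in base R. This is a minimal illustration written from the description in this section, not the package's own implementation; the function names `phase_surrogate` and `aaft_surrogate` are hypothetical, and ties are broken by order of appearance as an assumption.

```r
# Sketch of phase randomization (hypothetical helper, not a package function).
phase_surrogate <- function(x) {
  n <- length(x)
  z <- fft(x)
  amp <- Mod(z)
  # Number of strictly positive frequencies below the Nyquist frequency
  n_pos <- floor((n - 1) / 2)
  # Independent phases, uniform on [0, 2*pi)
  phi <- runif(n_pos, 0, 2 * pi)
  # Start from the original phases so the zero-frequency term
  # (and the Nyquist term when n is even) is left untouched
  new_phase <- Arg(z)
  if (n_pos > 0) {
    new_phase[1 + seq_len(n_pos)] <- phi        # frequencies 1..n_pos
    new_phase[n + 1 - seq_len(n_pos)] <- -phi   # mirrored frequencies
  }
  # Conjugate symmetry makes the inverse DFT real up to rounding error
  Re(fft(amp * exp(1i * new_phase), inverse = TRUE)) / n
}

# Sketch of the AAFT steps (hypothetical helper, not a package function).
aaft_surrogate <- function(x) {
  n <- length(x)
  r_x <- rank(x, ties.method = "first")
  # Gaussian white noise reordered to have the same ranks as x
  g <- sort(rnorm(n))[r_x]
  # Phase randomize the reordered noise and take its ranks
  r <- rank(phase_surrogate(g), ties.method = "first")
  # Rank order the original values of x using r
  sort(x)[r]
}

set.seed(42)
# Toy AR(1) series used only for illustration
x <- as.numeric(stats::filter(rnorm(256), 0.8, method = "recursive"))
s_phase <- phase_surrogate(x)
s_aaft <- aaft_surrogate(x)

# Side effects named in the text: same DFT moduli, same sorted values
all.equal(Mod(fft(x)), Mod(fft(s_phase)))
all.equal(sort(x), sort(s_aaft))
```

Leaving the zero-frequency phase untouched keeps the sample mean of the surrogate equal to that of the original series.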

### Value

an object of class `surrogate`.

### S3 METHODS

- plot
plots the surrogate data realizations. The following options may be used to adjust the plot components:

  - show
    A character string defining the data to display. Choices are `"series"`, `"surrogate"`, or `"both"` for plots corresponding to the original series, surrogate series, or both, respectively. Default: `"surrogate"`.

  - type
    Character string denoting the type of data to plot. Options are `"time"` for time history, `"sdf"` for a multitaper spectral density function estimation, `"pdf"` for a probability density function estimation, and `"lag"` for a two-dimensional embedding (lag plot). Default: `"time"`.

  - stack
    A logical flag. If `TRUE`, the `stackPlot` function is called as opposed to the default plot function. As `stackPlot` requires a common abscissa, this option is only available for `type="time"` (time history) or `type="sdf"` (spectral density function plot). Default: `TRUE`.

  - xlab
    Character string denoting the x-axis label for the `"time"`, `"pdf"`, and `"sdf"` types. Default: "Time", the series name, and "Frequency (Hz)", respectively.

  - ylab
    Character string denoting the y-axis label for the `"time"` style. Default: the series name.

  - cex
    Character expansion factor (same as the `cex` argument of the `par` function). Default: `1`.

  - adj.main
    Title adjustment (as in the `adj` argument of the `par` function). Default: `1`.

  - line.main
    Line spacing for the title (as in the `line` argument of the `text` function). Default: `0.5`.

  - col.series
    A character string or integer denoting the color to use when plotting data corresponding to the original series. See the `colors` function for more details. Default: `"black"`.

  - col.surrogate
    A character string or integer denoting the color to use when plotting data corresponding to the surrogate series. See the `colors` function for more details. Default: `"red"`.

  - ...
    Additional plot arguments (set internally by the `par` function).

- print
prints a summary of the surrogate data realization. Available options are:

  - ...
    Additional print arguments used by the standard `print` function.

### References

J. Theiler, S. Eubank, A. Longtin, B. Galdrikian and J.D. Farmer (1992),
Testing for nonlinearity in time series: the method of surrogate data,
*Physica D: Nonlinear Phenomena*, **58**, 77–94.

R.B. Davies and D.S. Harte (1987), Tests for the Hurst
effect, *Biometrika*, **74**, 95–102.

D.B. Percival and W.L.B. Constantine (2002),
Exact Simulation of Gaussian Time Series from Nonparametric
Spectral Estimates with Application to Bootstrapping,
*Statistics and Computing*, under review.

D.B. Percival and A. Walden (1993),
*Spectral Analysis for Physical Applications: Multitaper
and Conventional Univariate Techniques*,
Cambridge University Press, Cambridge, UK.

D.B. Percival, S. Sardy and A.C. Davison (2001),
Wavestrapping Time Series: Adaptive Wavelet-Based Bootstrapping,
in W.J. Fitzgerald, R.L. Smith, A.T. Walden and P.C. Young (Eds.),
*Nonlinear and Nonstationary Signal Processing*,
Cambridge University Press, Cambridge, England.

D.T. Kaplan (1995), Nonlinearity and Nonstationarity: The Use of Surrogate
Data in Interpreting Fluctuations in Heart Rate, *Proceedings of the 3rd Annual
Workshop on Computer Applications of Blood Pressure and Heart Rate Signals*,
Florence, Italy, 4–5 May.

### See Also

`infoDim`, `corrDim`.

### Examples

```
## load the package assumed to provide surrogate() and
## the beamchaos data set
library(fractal)

## create surrogate data sets using circulant
## embedding method
surr <- surrogate(beamchaos, method="ce")

## print the result
print(surr)

## plot and compare various statistics of the
## surrogate and original time series
plot(surr, type="time")
plot(surr, type="sdf")
plot(surr, type="lag")
plot(surr, type="pdf")

## create comparison time history
plot(surr, show="both", type="time")
```