Surrogate data generation


This function generates surrogate time series via various frequency-domain bootstrapping techniques. In the statistics community, bootstrapping is used to assess the sampling variability of certain statistics; the nonlinear dynamics community typically uses it to detect nonlinear structure in stationary time series. Given a time series, the function generates surrogate series via Theiler's Amplitude Adjusted Fourier Transform (AAFT), Theiler's phase randomization, Davies and Harte's Circulant Embedding (CE) technique, or Davison and Hinkley's (DH) phase and amplitude randomization technique.

Theiler's techniques produce so-called constrained realizations, since some statistical aspect of the original data is preserved (the histogram for the AAFT and the periodogram for the phase randomization). The other techniques, circulant embedding and Davison-Hinkley, are non-constrained, as both the amplitudes and the phases of the original series are randomized.


surrogate(x, method="ce", sdf=NULL, seed=0)



x: a vector containing a uniformly-sampled real-valued time series.


method: a character string representing the method to be used to generate surrogate data. Choices are:


Theiler's Amplitude Adjusted Fourier Transform.


Theiler's phase randomization.


Davies and Harte's Circulant Embedding.


Davison and Hinkley's phase and amplitude randomization.

Default: "ce".


sdf: an object of class SDF containing a single-sided spectral density function estimate (corresponding to the original data) over normalized frequencies f(k)=k/(2N) for k=0,...,N, where N is the number of samples in the original time series. This argument is used only by the circulant embedding method. Default: NULL; if the circulant embedding method is selected and no SDF is supplied, sapa::SDF(x, method="multitaper", recenter=TRUE, taper=h, single.sided=T) is used, where h = taper(type="sine", n.sample=N, n.taper=5, norm=TRUE).


seed: a non-negative integer representing the initial seed value for the random number generator. If seed=0, the current time is used to generate a (unique) seed value. Otherwise, the specified seed value is used. Default: 0.


The algorithms are detailed as follows:


Phase randomization: The discrete Fourier transform (DFT) of the time series is calculated and the phase at each frequency is randomized to be uniformly distributed on [0, 2*PI]. Phase symmetry is preserved so that the inverse DFT yields a purely real surrogate. Null hypothesis: the original data come from a linear Gaussian process. Side effect: the periodograms of the surrogate and the original time series are the same.
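The phase randomization steps can be illustrated with a minimal sketch in base R. The helper name phase_randomize is hypothetical and is not part of the package; the sketch assumes that the zero-frequency term (and the Nyquist term for even N) is left untouched, which may differ from what surrogate() does internally.

```r
# Hypothetical helper for illustration only; not the package's implementation.
phase_randomize <- function(x) {
  n <- length(x)
  z <- fft(x)                      # z[1] is frequency 0, z[k + 1] is frequency k
  half <- floor((n - 1) / 2)       # count of positive frequencies below Nyquist
  if (half > 0) {
    k <- seq_len(half)
    theta <- runif(half, min = 0, max = 2 * pi)
    # Keep each amplitude, replace each phase with a uniform random phase.
    z[k + 1] <- Mod(z[k + 1]) * exp(1i * theta)
    # Mirror as complex conjugates so that the inverse DFT is real.
    z[n - k + 1] <- Conj(z[k + 1])
  }
  Re(fft(z, inverse = TRUE)) / n   # R's inverse fft is unnormalized
}

set.seed(1)
x <- sin(seq(0, 20, length.out = 128)) + rnorm(128, sd = 0.1)
s <- phase_randomize(x)

# The DFT amplitudes (and hence the periodogram) should agree.
amplitudes_match <- isTRUE(all.equal(Mod(fft(s)), Mod(fft(x))))
```

Because only the phases change, the surrogate has the same sample mean and periodogram as x but a different time ordering.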


AAFT: An N-point realization of a Gaussian white noise process is created, where N is the length of x, and reordered to have the same rank order as x (e.g., rank(x[t]) = 5 means that x[t] is the fifth smallest element of x). The result is then phase randomized and its rank order (r) is calculated. The surrogate is created by rank ordering x using r. Null hypothesis: the observed time series is a monotonic nonlinear transformation of a Gaussian process. Side effect: the amplitude distributions (histograms) of the surrogate and the original time series are the same.
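The AAFT steps described above can be sketched in base R as follows. Both helper names are hypothetical and not part of the package, and breaking ties with ties.method = "first" is an assumption made here so that the ranks form a permutation.

```r
# Hypothetical helpers for illustration only; not the package's implementation.
phase_randomize <- function(x) {
  n <- length(x)
  z <- fft(x)
  half <- floor((n - 1) / 2)
  if (half > 0) {
    k <- seq_len(half)
    z[k + 1] <- Mod(z[k + 1]) * exp(1i * runif(half, 0, 2 * pi))
    z[n - k + 1] <- Conj(z[k + 1])   # conjugate symmetry gives a real inverse
  }
  Re(fft(z, inverse = TRUE)) / n
}

aaft_surrogate <- function(x) {
  n <- length(x)
  # Step 1: Gaussian white noise reordered to follow the rank order of x.
  gauss <- sort(rnorm(n))[rank(x, ties.method = "first")]
  # Step 2: phase randomize the Gaussian series and record its rank order r.
  r <- rank(phase_randomize(gauss), ties.method = "first")
  # Step 3: rank order the original values using r.
  sort(x)[r]
}

set.seed(2)
x <- exp(cumsum(rnorm(100)) / 5)     # a nonlinearly transformed random walk
s <- aaft_surrogate(x)
```

The surrogate s is a permutation of the values in x, so the two series share the same histogram by construction.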


The circulant embedding technique is based upon generating surrogates whose estimated SDF (e.g., a periodogram) is not constrained to be the same as that of the original series (see the references for details).


The Davison-Hinkley technique is based upon generating surrogates by randomizing both the phases and the amplitudes in the frequency domain followed by an inversion back to the time domain.
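The null hypotheses stated above are typically assessed by comparing a discriminating statistic computed on the original series with its values over many surrogates. The following is a minimal sketch under stated assumptions: all function names are hypothetical, phase randomization is re-implemented in base R rather than taken from the package, and the time-reversal asymmetry statistic is only one possible choice of statistic.

```r
# Hypothetical helpers for illustration only; not part of the package.
phase_randomize <- function(x) {
  n <- length(x)
  z <- fft(x)
  half <- floor((n - 1) / 2)
  if (half > 0) {
    k <- seq_len(half)
    z[k + 1] <- Mod(z[k + 1]) * exp(1i * runif(half, 0, 2 * pi))
    z[n - k + 1] <- Conj(z[k + 1])
  }
  Re(fft(z, inverse = TRUE)) / n
}

# Time-reversal asymmetry: expected to be near zero for a linear Gaussian process.
time_asymmetry <- function(x, lag = 1) {
  n <- length(x)
  d <- x[(1 + lag):n] - x[1:(n - lag)]
  mean(d^3) / mean(d^2)^1.5
}

surrogate_test <- function(x, statistic, n_surr = 99) {
  observed <- statistic(x)
  null_values <- replicate(n_surr, statistic(phase_randomize(x)))
  # Two-sided rank-based p-value; assumes the statistic is centered at zero
  # under the null hypothesis.
  p_value <- (1 + sum(abs(null_values) >= abs(observed))) / (n_surr + 1)
  list(observed = observed, null_values = null_values, p_value = p_value)
}

set.seed(3)
x <- as.numeric(arima.sim(list(ar = 0.7), n = 256))  # linear Gaussian AR(1) series
result <- surrogate_test(x, time_asymmetry)
```

A small p_value would suggest that the series is inconsistent with the null hypothesis of the chosen surrogate method; with 99 surrogates the smallest attainable value is 0.01.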


an object of class surrogate.



plots the surrogate data realizations. The following options may be used to adjust the plot components:


A character string defining the data to display. Choices are "series", "surrogate", or "both" for plots corresponding to the original series, surrogate series, or both, respectively. Default: "surrogate".


Character string denoting the type of data to plot. Options are "time" for time history, "sdf" for a multitaper spectral density function estimate, "pdf" for a probability density function estimate, and "lag" for a two-dimensional embedding (lag plot). Default: "time".


A logical flag. If TRUE, the stackPlot function is called as opposed to the default plot function. As stackPlot requires a common abscissa, this option is only available for type="time" (time history) or type="sdf" (spectral density function plot). Default: TRUE.


Character string denoting the x-axis label for the "time", "pdf", and "sdf" types. Default: "Time", the series name, and "Frequency (Hz)", respectively.


Character string denoting the y-axis label for the "time" type. Default: the series name.


Character expansion factor (same as the cex argument of the par function). Default: 1.


Title adjustment (same as the adj argument of the par function). Default: 1.


Line spacing for the title (same as the line argument of the text function). Default: 0.5.


A character string or integer denoting the color to use when plotting data corresponding to the original series. See the colors function for more details. Default: "black".


A character string or integer denoting the color to use when plotting data corresponding to the surrogate series. See the colors function for more details. Default: "red".


Additional plot arguments (set internally by the par function).


prints a summary of the surrogate data realization. Available options are:


Additional print arguments used by the standard print function.


J. Theiler, S. Eubank, A. Longtin, B. Galdrikian and J.D. Farmer (1992), Testing for nonlinearity in time series: the method of surrogate data, Physica D: Nonlinear Phenomena, 58, 77–94.

R.B. Davies and D.S. Harte (1987), Tests for the Hurst effect, Biometrika, 74, 95–102.

D.B. Percival and W.L.B. Constantine (2002), Exact Simulation of Gaussian Time Series from Nonparametric Spectral Estimates with Application to Bootstrapping, Statistics and Computing, under review.

D.B. Percival and A. Walden (1993), Spectral Analysis for Physical Applications: Multitaper and Conventional Univariate Techniques, Cambridge University Press, Cambridge, UK.

D.B. Percival, S. Sardy and A.C. Davison (2001), Wavestrapping Time Series: Adaptive Wavelet-Based Bootstrapping, in W.J. Fitzgerald, R.L. Smith, A.T. Walden and P.C. Young (Eds.), Nonlinear and Nonstationary Signal Processing, Cambridge University Press, Cambridge, UK.

D.T. Kaplan (1995), Nonlinearity and Nonstationarity: The Use of Surrogate Data in Interpreting Fluctuations in Heart Rate, Proceedings of the 3rd Annual Workshop on Computer Applications of Blood Pressure and Heart Rate Signals, Florence, Italy, 4–5 May.

See Also

infoDim, corrDim.


## create surrogate data sets using circulant 
## embedding method 
surr <- surrogate(beamchaos, method="ce")

## print the result 
print(surr)

## plot and compare various statistics of the 
## surrogate and original time series 
plot(surr, type="time")
plot(surr, type="sdf")
plot(surr, type="lag")
plot(surr, type="pdf")

## create comparison time history 
plot(surr, show="both", type="time")
