# corrDim: Correlation dimension In fractal: A Fractal Time Series Modeling and Analysis Package

## Description

Estimates the correlation dimension by forming a delay embedding of a time series, calculating correlation summation curves (one per embedding dimension), and subsequently fitting the slopes of these curves on a log-log scale using a robust linear regression model. If the slopes converge at a given embedding dimension E, then E is the correct embedding dimension and the (convergent) slope value is an estimate of the correlation dimension for the data.

## Usage

```
corrDim(x, dimension=5, tlag=timeLag(x, method="acfdecor"),
    olag=0, resolution=2)
```

## Arguments

`x`

a vector containing a uniformly-sampled real-valued time series, or a matrix containing an embedding with each column representing a different coordinate. If the latter, the `dimension` input is set to the number of columns and the `tlag` input is ignored.

`dimension`

the maximal embedding dimension. Default: `5`.

`olag`

the number of points along the trajectory of the current point that must be exceeded in order for another point in the phase space to be considered a neighbor candidate. This argument is used to help attenuate temporal correlation in the embedding, which can lead to spuriously low correlation dimension estimates. The orbital lag must be positive or zero. Default: `length(x)/10` or `500`, whichever is smaller.

`resolution`

an integer representing the spatial resolution factor. A value of P increases the number of effective scales by a factor of P at a cost of raising the L-infinity norm to the Pth power. For example, setting the resolution to 2 will double the number of scales while imposing an additional multiplication operation. The resolution must exceed unity. Default: `2`.

`tlag`

the time delay between coordinates. Default: `timeLag(x, method="acfdecor")`, the decorrelation time of the autocorrelation function.
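To make the matrix form of `x` concrete, the following is a minimal base R sketch of a delay embedding. The helper name `delay_embed` is hypothetical and is not part of the package; the package's own embedding routine may trim or order points differently.

```r
# Hypothetical helper (not part of the fractal package): build a delay
# embedding with `dimension` columns and `tlag` samples between coordinates.
delay_embed <- function(x, dimension = 2, tlag = 1) {
  n <- length(x) - (dimension - 1) * tlag   # number of embedded points
  stopifnot(n > 1)
  # column k holds the series shifted by (k - 1) * tlag samples
  sapply(seq_len(dimension), function(k) x[seq_len(n) + (k - 1) * tlag])
}

x <- sin(seq(0, 20, by = 0.1))              # 201 samples
emb <- delay_embed(x, dimension = 3, tlag = 5)
dim(emb)                                    # rows are points, columns are coordinates
```

A matrix built this way has one phase-space point per row, which matches the "each column is a coordinate" convention described for `x`.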

## Details

To estimate the correlation dimension, correlation summation curves must be generated and subsequently fit with a robust linear regression model to obtain the slopes of these curves on a log-log plot. The dimension at which these slope estimates (appear to) converge reveals the proper embedding dimension for the data and the slope at this (and higher) embedding dimensions is an estimate of the correlation dimension. The function used to fit the correlation summation curves is `lmsreg` which fits a robust linear model to the data using the method of least median of squares regression. See the on-line help documentation for help on the `lmsreg` function: in R, `lmsreg` is found in the `MASS` package while in S-PLUS it is indigenous and appears in the `splus` database.
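As an illustration of the robust fitting step, the sketch below fits a least median of squares line to a synthetic log-log curve using `lmsreg` from `MASS`. The data are fabricated for the example (a power law with slope 1.5 plus two contaminated points) and do not come from `corrDim`; how the package selects the scaling region before fitting is not shown here.

```r
library(MASS)   # provides lmsreg in R

set.seed(1)
log_eps <- seq(-6, 0, by = 0.5)               # log2 of the scales (assumed grid)
# synthetic log2 correlation sums with true slope 1.5 and a little noise
log_c2 <- 1.5 * log_eps + rnorm(length(log_eps), sd = 0.01)
log_c2[1:2] <- log_c2[1:2] + 2                # two outlying small-scale points

fit <- lmsreg(log_c2 ~ log_eps)               # least median of squares regression
slope <- unname(coef(fit)[2])
slope
```

Because the fit minimizes the median squared residual, the two contaminated points have little influence on the estimated slope, which is the property that motivates a robust regression here.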

The correlation summation at scale eps for a given embedding dimension is defined as

C2(eps) = 2 / ((N - gamma) * (N - gamma - 1)) * sum{i=1:N} sum{j=i+1+gamma:N} H(eps - || Xi - Xj ||),

where H is the Heaviside function

H(x)=0 if x <= 0 and H(x)=1 for x > 0,

and Xi is the ith point of a collection of `N` points in the phase space. The parameter gamma is the orbital lag.
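The definition can be evaluated directly, as in the following base R sketch. The helper `corr_sum` is hypothetical and uses a plain O(N^2) loop with the L-infinity norm; it is meant only to mirror the formula, not the optimized routine used by the package.

```r
# Hypothetical helper: direct evaluation of C2(eps) for an embedding matrix X
# (one point per row), with orbital lag gamma and the L-infinity norm.
corr_sum <- function(X, eps, gamma = 0) {
  N <- nrow(X)
  stopifnot(N > gamma + 1)
  count <- 0
  for (i in seq_len(N - gamma - 1)) {
    j <- (i + 1 + gamma):N                      # candidates beyond the orbital lag
    diffs <- abs(sweep(X[j, , drop = FALSE], 2, X[i, ]))
    d <- apply(diffs, 1, max)                   # L-infinity distance to point i
    count <- count + sum(d < eps)               # H(eps - d) is 1 only when d < eps
  }
  2 * count / ((N - gamma) * (N - gamma - 1))
}

set.seed(42)
X <- matrix(runif(400), ncol = 2)               # 200 random points in the unit square
sapply(c(0.05, 0.1, 0.2), function(e) corr_sum(X, e, gamma = 5))
```

The normalization equals the number of admissible pairs, so the sum approaches 1 once eps exceeds the largest inter-point distance.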

The algorithm used to calculate the correlation summation is made computationally efficient by using:

1. The L-infinity norm to calculate the distance between neighbors in the phase space, as opposed to (say) the L2 norm, which involves computationally intensive squaring and square-root operations. The L-infinity norm of the distance between two points in the phase space is the maximal absolute difference between the points' respective coordinates, i.e. if X = {z1, z2, z3} then ||X|| sub infinity = max{i}|zi|.

2. Bitwise masking and shift operations to reveal the radix-2 exponent of the L-infinity norm. This direct means of obtaining the exponent immediately yields the associated scale of the distance between neighbors in the phase space while avoiding costly log operations. The bitwise mask and shift factors are based on the IEEE standard 754 for binary floating-point arithmetic. Initial tests are performed in the code to verify that the current machine follows this standard.

3. A computationally efficient routine to calculate the resulting value of a float raised to a positive integer power. Specifically, the L-infinity norm is raised to an integer power (`p`) to effectively increase the spatial resolution by a factor of `p`.
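The three ideas above can be sketched in base R as follows. All helper names are hypothetical, and the exponent extraction assumes IEEE 754 double precision with positive, normal inputs; the package's compiled implementation is not reproduced here and may differ in detail.

```r
# L-infinity distance between two phase-space points
linf_dist <- function(a, b) max(abs(a - b))

# Positive integer power by repeated squaring
int_pow <- function(x, p) {
  result <- 1
  while (p > 0) {
    if (p %% 2 == 1) result <- result * x
    x <- x * x
    p <- p %/% 2
  }
  result
}

# Unbiased radix-2 exponent of a positive, normal double, read from its
# IEEE 754 bit pattern with mask and shift operations (no call to log2)
ieee_exponent <- function(x) {
  b <- as.integer(writeBin(as.double(x), raw(), endian = "big"))
  biased <- bitwOr(bitwShiftL(bitwAnd(b[1], 127L), 4L), bitwShiftR(b[2], 4L))
  biased - 1023L
}

d <- linf_dist(c(0.1, 0.5, 0.9), c(0.4, 0.6, 0.8))  # largest coordinate gap
ieee_exponent(d)               # scale index at resolution 1
ieee_exponent(int_pow(d, 2))   # scale index at resolution 2
```

Distances of 0.3 and 0.4 share the same radix-2 exponent, but their squares do not, which is how raising the norm to the power `p` separates distances into `p` times as many scales.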

Given the correlation summation curves C2(E,eps), where `E` is the embedding dimension and eps is the scale, the correlation dimension curves D2(E,eps) can be calculated by

D2(E,eps)=log2 (C2(E,2*eps) / C2(E,eps / 2)) / 2

This formulation is used to help suppress numerical instabilities that are present in other numerical derivative schemes such as a first order difference.
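A minimal sketch of this three-point estimate, assuming the correlation sums for one embedding dimension are stored at successive octave scales (eps = 2^k), is shown below. The helper `d2_curve` is hypothetical.

```r
# Hypothetical helper: centered estimate of D2 from correlation sums sampled
# at octave-spaced scales, so that neighbors of C2[k] sit at eps/2 and 2*eps.
d2_curve <- function(C2) {
  n <- length(C2)
  stopifnot(n >= 3)
  log2(C2[3:n] / C2[1:(n - 2)]) / 2
}

eps <- 2^(-6:0)
C2 <- eps^1.5        # idealized power law with exponent 1.5
d2_curve(C2)         # one value per interior scale
```

For an exact power law the ratio C2(2*eps) / C2(eps/2) equals 4^D2, so every interior scale returns the exponent itself, here 1.5 up to floating-point error.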

As a caveat to the user, the slope estimates of the correlation summation curves will typically display a fair amount of variability, and the range of scales over which the slopes are approximately linear may be small. As such, the correlation dimension estimate should always be interpreted as a subjective summary statistic, even when the original time series is representative of a truly noise-free chaotic response.

## Value

an object of class `chaoticInvariant`.

## S3 METHODS

eda.plot

plots an extended data analysis plot, which graphically summarizes the process of obtaining a correlation dimension estimate. A time history, phase plane embedding, correlation summation curves, and the slopes of correlation summation curves as a function of scale are plotted.

plot

plots the correlation summation curves on a log-log scale. The following options may be used to adjust the plot components:

type

Character string denoting the type of data to be plotted. The `"stat"` option plots the correlation summation curves while the `"dstat"` option plots a 3-point estimate of the derivatives of the correlation summation curves. The `"slope"` option plots the estimated slope of the correlation summation curves as a function of embedding dimension. Default: `"stat"`.

fit

Logical flag. If `TRUE`, a regression line is overlaid for each curve. Default: `TRUE`.

grid

Logical flag. If `TRUE`, a grid is overlaid on the plot. Default: `TRUE`.

legend

Logical flag. If `TRUE`, a legend of the estimated slopes as a function of embedding dimension is displayed. Default: `TRUE`.

...

Additional plot arguments (set internally by the `par` function).

print

prints a qualitative summary of the results.

## References

Peter Grassberger and Itamar Procaccia (1983), Measuring the strangeness of strange attractors, Physica D, 9, 189–208.

Holger Kantz and Thomas Schreiber (1997), Nonlinear Time Series Analysis, Cambridge University Press.

Peter Grassberger and Itamar Procaccia (1983), Characterization of strange attractors, Physical Review Letters, 50(5), 346–349.

Rousseeuw, P. J. (1984). Least median of squares regression. Journal of the American Statistical Association, 79, 871–88.

## See Also

`infoDim`, `embedSeries`, `timeLag`, `chaoticInvariant`, `lyapunov`, `poincareMap`, `spaceTime`, `findNeighbors`, `determinism`.

## Examples

```
## load the package providing corrDim and the beamchaos data
library(fractal)

## calculate the correlation dimension estimates
## for chaotic beam data using a delay
## embedding for dimensions 1 through 10, an
## orbital lag of 10, and a spatial resolution
## of 4.
beam.d2 <- corrDim(beamchaos, olag=10, dim=10, res=4)

## print a summary of the results
print(beam.d2)

## plot the correlation summation curves
plot(beam.d2, fit=FALSE, legend=FALSE)

## plot an extended data analysis plot
eda.plot(beam.d2)
```

### Example output

```Loading required package: splus2R