cross_correlate: Estimate cross correlation of unevenly sampled time series.
In svdataman/sour: Cross-correlation of time series (which may be unevenly sampled)


cross_correlate returns cross correlation data for two time series.

cross_correlate(ts.1, ts.2, method = "iccf", max.lag = NULL, min.pts = 5,
  dtau = NULL, local.est = FALSE, zero.clip = NULL, use.errors = FALSE,
  one.way = FALSE, cov = FALSE, prob = 0.1, nsim = 0, peak.frac = 0.8,
  chatter = 0)

`ts.1, ts.2`	(array or dataframe) data for time series 1 and 2.
`method`	(string) use `"dcf"` or `"iccf"` (default).
`max.lag`	(float) maximum lag at which to compute the CCF.
`min.pts`	(integer) each DCF bin must contain at least `min.pts` correlation coefficients.
`dtau`	(float) spacing of the time delays (`tau`) at which the CCF is estimated.
`local.est`	(logical) use 'local' (not 'global') means and variances?
`zero.clip`	(logical) remove pairs of points with exactly zero lag?
`use.errors`	(logical) if `TRUE` then subtract mean square error from variances.
`one.way`	(logical) (ICCF only) if `TRUE` then only interpolate time series 2.
`cov`	(logical) if `TRUE` then compute covariance, not correlation coefficient.
`prob`	(float) probability level to use for confidence intervals.
`nsim`	(integer) number of FR/RSS simulations to run
`peak.frac`	(float) only include CCF points above `peak.frac`*max(ccf) in centroid calculation.
`chatter`	(integer) set the level of feedback.

Function for estimating the cross-correlation between two time series which may be irregularly and/or non-simultaneously sampled. The CCF is computed using one of two methods: (1) the Discrete Correlation Function (DCF; Edelson & Krolik 1988) or (2) the Interpolated Cross Correlation Function (ICCF; Gaskell & Sparke 1986). You can also produce estimates of uncertainty on the CCF, its peak and centroid using the Flux Randomisation and Random Subsample Selection (FR/RSS) method of Peterson et al. (1998).
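As a rough illustration of the ICCF idea, the sketch below linearly interpolates ts.2 onto the shifted sampling times of ts.1 using base R's approx() and computes a Pearson correlation at each lag. The helper name iccf_sketch and the synthetic data are hypothetical. This is a one-way interpolation written for illustration only, not the code used inside cross_correlate.

```r
# Minimal one-way ICCF sketch: only ts.2 is interpolated.
# Hypothetical helper, not the package's internal implementation.
iccf_sketch <- function(ts.1, ts.2, tau) {
  sapply(tau, function(lag) {
    # ccf(lag) pairs ts.1$y(t + lag) with ts.2$y(t), so a ts.1 point at
    # time t1 is matched to ts.2 interpolated at time t1 - lag.
    y2 <- approx(ts.2$t, ts.2$y, xout = ts.1$t - lag)$y
    ok <- !is.na(y2)                 # drop points outside the span of ts.2
    if (sum(ok) < 3) return(NA_real_)
    cor(ts.1$y[ok], y2[ok])
  })
}

# Synthetic, unevenly sampled data in which ts.1 leads ts.2 by 4 time units
set.seed(1)
t1 <- sort(runif(150, 0, 100))
t2 <- sort(runif(150, 0, 100))
signal <- function(t) sin(2 * pi * t / 25)
ts.1 <- data.frame(t = t1, y = signal(t1))
ts.2 <- data.frame(t = t2, y = signal(t2 - 4))

tau <- seq(-10, 10, by = 0.5)
r <- iccf_sketch(ts.1, ts.2, tau)
peak.lag <- tau[which.max(r)]   # should lie close to -4 (ts.1 leads)
```

A two-way version would repeat the calculation with ts.1 interpolated instead and average the two results.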

A list with components

`tau`	(array) A one dimensional array containing the lags at which the CCF is estimated.
`ccf`	(array) An array with the same dimensions as `tau` containing the estimated CCF.
`lower`	(array) Lower limit of CCF (see Notes).
`upper`	(array) Upper limit of CCF (see Notes).
`peak.dist`	(array) An array of length `nsim` containing the CCF peaks from the simulations.
`cent.dist`	(array) An array of length `nsim` containing the CCF centroids from the simulations.
`method`	(string) which method was used? `"iccf"` or `"dcf"`

The value of ccf[k] is the estimated correlation between ts.1$y(t+tau) and ts.2$y(t), where tau = tau[k]. A strong peak at negative lags indicates that ts.1 leads ts.2.

If only one time series is given as input then the Auto-Correlation Function (ACF) is computed.

Input data frames: note that the input data ts.1 and ts.2 are not traditional R time series objects, which are only suitable for regularly sampled data. These CCF functions are designed to work with data of arbitrary sampling, so times and values must be listed explicitly. The input objects are therefore data.frames with at least two columns, which must be called t (time) and y (value). An error on the value may be provided in a dy column. Any other columns are ignored.
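For example, an input data frame with the required t and y columns and the optional dy column might be built as follows. The sampling times, signal and error bars here are made up for illustration.

```r
# Build an input data frame from irregular sampling times (synthetic data).
set.seed(42)
n <- 50
obs.time <- sort(runif(n, min = 0, max = 100))   # uneven sampling times
err <- rep(0.1, n)                               # error bars, assumed constant

ts.1 <- data.frame(
  t  = obs.time,                                 # required: time
  y  = sin(obs.time / 10) + rnorm(n, sd = err),  # required: value
  dy = err                                       # optional: error on y
)

# Check the column names before passing ts.1 on
stopifnot(all(c("t", "y") %in% names(ts.1)))
```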

Local vs. global estimation: If local.est = FALSE (default) then the correlation coefficient is computed using the 'global' mean and variance of each time series. If local.est = TRUE then the correlation coefficient is computed using the 'local' mean and variance. For each lag, the mean to be subtracted and the variance to be divided by are computed using only the data points contributing to that lag.
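The difference between the two normalisations can be sketched with a simple binned DCF. The helper dcf_sketch below is hypothetical and simplified: it takes local means and standard deviations over the pairs in each lag bin, which may differ in detail from what cross_correlate does internally.

```r
# Binned DCF sketch with 'global' or 'local' normalisation.
# Hypothetical helper, not the package's internal implementation.
dcf_sketch <- function(ts.1, ts.2, tau, dtau, local.est = FALSE, min.pts = 5) {
  n1 <- nrow(ts.1)
  n2 <- nrow(ts.2)
  lag <- outer(ts.1$t, ts.2$t, "-")            # lag[i, j] = t1[i] - t2[j]
  y1 <- matrix(ts.1$y, n1, n2)                 # y1[i, j] = ts.1$y[i]
  y2 <- matrix(ts.2$y, n1, n2, byrow = TRUE)   # y2[i, j] = ts.2$y[j]
  sapply(tau, function(tk) {
    in.bin <- abs(lag - tk) <= dtau / 2        # pairs falling in this lag bin
    if (sum(in.bin) < min.pts) return(NA_real_)
    a <- y1[in.bin]
    b <- y2[in.bin]
    if (local.est) {
      # 'local': mean and sd from the points contributing to this lag
      m1 <- mean(a); s1 <- sd(a)
      m2 <- mean(b); s2 <- sd(b)
    } else {
      # 'global': mean and sd of each full time series
      m1 <- mean(ts.1$y); s1 <- sd(ts.1$y)
      m2 <- mean(ts.2$y); s2 <- sd(ts.2$y)
    }
    mean((a - m1) * (b - m2)) / (s1 * s2)
  })
}

# Synthetic example: ts.1 leads ts.2 by 4 time units
set.seed(2)
t1 <- sort(runif(120, 0, 100))
t2 <- sort(runif(120, 0, 100))
ts.1 <- data.frame(t = t1, y = sin(2 * pi * t1 / 25))
ts.2 <- data.frame(t = t2, y = sin(2 * pi * (t2 - 4) / 25))

tau <- seq(-10, 10, by = 1)
dcf.global <- dcf_sketch(ts.1, ts.2, tau, dtau = 1)
dcf.local  <- dcf_sketch(ts.1, ts.2, tau, dtau = 1, local.est = TRUE)
```

With the local normalisation each bin is close to an ordinary Pearson coefficient and stays within [-1, 1]. The global version can exceed these bounds when the points in a bin are not representative of the full series.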

Simulations: the simulations apply "flux randomisation" and "random sample selection" to an input time series, following Peterson et al. (2004, ApJ, 613:682-699). Given an input data series (t, y, dy) of length N we sample N points with replacement. Duplicated points are ignored, so the output is usually shorter than the input. So far this is a basic bootstrap procedure.

If error bars are provided: when a point is selected m times, its error is reduced by a factor of 1/sqrt(m) (see Appendix A of Peterson et al.). After resampling in time, a random Gaussian deviate is added to each remaining data point, with standard deviation equal to its error bar. In this way both the times and values are randomised. If error bars are not provided, this is a simple bootstrap.
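One realisation of this resampling can be sketched in a few lines of base R. The helper name fr_rss_sketch is hypothetical, and the sketch assumes that the Gaussian deviate uses the error bar after the 1/sqrt(m) reduction. It is meant to illustrate the steps described above rather than reproduce the package's fr_rss function.

```r
# One FR/RSS realisation of a time series (hypothetical helper).
fr_rss_sketch <- function(series) {
  n <- nrow(series)
  # RSS: draw n indices with replacement and count how often each was drawn
  idx <- sample.int(n, size = n, replace = TRUE)
  m <- tabulate(idx, nbins = n)
  keep <- which(m > 0)                  # duplicates are dropped
  out <- series[keep, , drop = FALSE]
  if ("dy" %in% names(series)) {
    # a point selected m times has its error reduced by a factor 1/sqrt(m)
    out$dy <- out$dy / sqrt(m[keep])
    # FR: perturb each value by a Gaussian deviate with sd equal to its
    # (assumed: reduced) error bar
    out$y <- out$y + rnorm(nrow(out), mean = 0, sd = out$dy)
  }
  out
}

set.seed(3)
series <- data.frame(t = 1:100, y = rnorm(100), dy = 0.1)
sim <- fr_rss_sketch(series)
nrow(sim)   # shorter than the input; on average about 63% of points survive
```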

Peak and centroid: from each simulation we record the CCF peak and its centroid. The centroid is the CCF-weighted mean lag, sum(tau*ccf)/sum(ccf), including only points for which ccf is higher than peak.frac*max(ccf).
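A minimal sketch of the centroid calculation on a toy CCF is given below. The helper name ccf_centroid is hypothetical, and the sketch includes every point above the threshold, whether or not it is contiguous with the main peak.

```r
# Centroid of a CCF above a fraction of its maximum (hypothetical helper).
ccf_centroid <- function(tau, ccf, peak.frac = 0.8) {
  ok <- !is.na(ccf)
  tau <- tau[ok]
  ccf <- ccf[ok]
  above <- ccf > peak.frac * max(ccf)          # points entering the centroid
  sum(tau[above] * ccf[above]) / sum(ccf[above])
}

tau <- seq(-20, 20, by = 1)
ccf <- exp(-0.5 * ((tau - 3) / 4)^2)   # toy CCF, symmetric about lag 3
peak <- tau[which.max(ccf)]            # peak lag: 3
cent <- ccf_centroid(tau, ccf)         # centroid: 3, by symmetry
```

For an asymmetric CCF the peak and centroid generally differ, which is why both distributions are returned from the simulations.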

Upper/lower limits and distributions: If no simulations are used (nsim = 0) then upper/lower confidence limits on the CCF are estimated using the method of Bartlett (1955) based on the two ACFs. If simulations are used, the confidence limits are based on the simulations.

ccf, fr_rss

 ## Example using NGC 5548 data
 result <- cross_correlate(cont, hbeta, dtau = 1, max.lag = 550)
 plot(result$tau, result$ccf, type = "l", bty = "n", xlab = "time delay", ylab = "CCF")
 grid()
 
 ## or using the DCF method
 result <- cross_correlate(cont, hbeta, method = "dcf", dtau = 5, max.lag = 350)
 lines(result$tau, result$ccf, col = "red")

 ## Examples from Venables & Ripley
 require(graphics)
 tsf <- data.frame(t = time(fdeaths), y = fdeaths)
 tsm <- data.frame(t = time(mdeaths), y = mdeaths)

 ## compute CCF using ICCF method
 result <- cross_correlate(tsm, tsf, method = "iccf")
 plot_ccf(result)

 ## compute CCF using standard method (stats package) and compare
 result.st <- ccf(mdeaths, fdeaths, plot = FALSE)
 lines(result.st$lag, result.st$acf, col="red", lwd = 3)