library(knitr) opts_chunk$set(cache = TRUE)

The **ForeCA** R package implements forecastable compoenent analysis (ForeCA).

```
library(ForeCA)
```

If you don't know about ForeCA yet, the next section gives a quick overview. If you know ForeCA already you can skip it and go straight to the examples.

Forecastable component analysis (ForeCA) is a novel dimension reduction (DR)
technique for multivariate time series. Similar to other DR methods **ForeCA**
tries to find an "**interesting**" linear combination $y_t = \mathbf{w}' X_t$ of
a multivariate time series $\mathbf{X}_t$. For principal component analysis
(PCA) *high variance* data is interesting; for slow feature analysis (SFA)
*slow* signals are interesting. **ForeCA** tries to find linear combinations
that are easy to forecast, i.e., **forecastable** -- hence the name.

The measure of forecastability, denoted as $\Omega(y_t)$, is based on the spectral entropy of $y_t$, i.e., the entropy of the spectral density $f_y(\lambda)$: high entropy means low forecastability; low entropy signals are easy to forecast.

For more details see the original ForeCA paper:

citation("ForeCA")

ForeCA uses two main estimation techniques:

- spectral density estimation of multivariate (or univariate) time series $\mathbf{X}_t$ (or $y_t$);
- Shannon entropy estimation of a univariate probability mass function.

The **ForeCA** package achieves this via

**astsa**package for good non-parametric estimates of the spectrum, and- simple plugin estimates of Shannon entropy
(
`discrete_entropy`

/`continuous_entropy`

) with optional prior smoothing.

In this package details of those two estimator are specified via
`spectrum.control`

and `entropy.control`

arguments (lists). For sake of
simplicity let's fix them here and use those settings all throughout the
examples.

# spectrum control sc <- list(method = "mvspec") # entropy control ec <- list(prior.weight = 1e-2)

For more options and detailed explanations see `?complete-controls`

.

data("EuStockMarkets")

The `EuStockMarkets`

data contains daily closing prices for
`r ncol(EuStockMarkets)`

major European stock markets. As usual we convert this to
log-returns to make the time series stationary.

# log-returns in % ret <- diff(log(EuStockMarkets)) * 100

`ret`

contains time series of `r ncol(ret)`

major European stock indices (see
`?EuStockMarkets`

for details).

```
plot(ret)
```

cor.ret <- cor(ret) kable(format(cor.ret, digits = 2), caption = "Correlation matrix") kable(format(solve(cor.ret), digits = 2), caption = "Conditional covariance given other variables")

The correlation matrix shows that they are highly correlated with each other and the partial autocorrelation function (PACF) and spectra show that they are slightly autocorrelated.

ret.spec <- mvspectrum(ret, method = sc$method) plot(ret.spec) layout(matrix(seq_len(ncol(ret)), ncol = 2)) for (nn in colnames(ret)) { pacf(ret[, nn], main = nn) }

Not surprisingly the PACF shows only very small partial correlations, since these are stock market return and we should not expect see too large correlations over time (cf. "no arbitrage" hypothesis).

More specifically, we can estimate ForeCA measure of forecastability, $\Omega$, for each series:

ret.omega <- Omega(ret, spectrum.control = sc, entropy.control = ec) ret.omega

According to the estimates `r names(which.min(ret.omega))`

is the least
forecastable, `r names(which.max(ret.omega))`

is the most forecastable stock
market.

However, we can ask if there are linear combinations of stock markets, i.e., a
*portfolio*, that are even easier to forecast. That's exactly what ForeCA is
doing.

The main function in the package is `foreca()`

, which should be straightforward
to use as it resembles `princomp`

for PCA or `fastICA`

for independent component
anlaysis (ICA). In the basic setting users only have to pass the multivariate
time series and the number of components (`n.comp`

-- by default it uses `2`

).
We also specify `spectrum.control`

and `entropy.control`

but this is optional.

mod.foreca.ret <- foreca(ret, n.comp = ncol(ret), spectrum.control = sc, entropy.control = ec)

```
mod.foreca.ret # this uses the print.foreca method
```

For ease of use and better integration with existing methods in R, the returned
`foreca`

objects are very similar to `princomp`

objects, i.e., they have
`$scores`

and `$loadings`

(and many other useful metrics).

str(mod.foreca.ret) # too much to display; try it out!

The console print out showed that ForeCA indeed worked and it could find linear combinations that were more forecastable than any of the original series. The $\Omega$ score of the forecastable components (ForeCs) are

```
mod.foreca.ret$Omega
```

Recall that the most forecastable original time series had $\hat{\Omega} =
`r round(max(ret.omega), 2)`

$ (use `summary(mod.foreca.ret)$Omega.orig)`

to get
them). The loadings of the ForeCs tell us what this forecastable portfolio looks like:

```
mod.foreca.ret$loadings
```

The **ForeCA** package implements several `S3`

methods for objects of class
`foreca`

so that results can be quickly visualized and easily interpreted.

plot(mod.foreca.ret) plot(mod.foreca.ret$scores)

By design of ForeCA, the returned series are uncorrelated and have zero mean and unit variance.

round(colMeans(mod.foreca.ret$scores), 3) kable(round(cov(mod.foreca.ret$scores), digits = 3))

```
sessionInfo()
```

**Any scripts or data that you put into this service are public.**

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.