knitr::opts_chunk$set(echo = TRUE)
library(ggplot2)
library(slide)
# note: as of this writing, slide is not on CRAN
# devtools::install_github('DavisVaughan/slide')

Summary

The absorption ratio is a measure of systemic risk in the market. It is useful in an equity or multi-asset portfolio risk monitoring system. In this post I will show how to calculate and interpret the measure for the U.S. stock market.

Background

The absorption ratio was introduced by Kritzman and others in the 2010 paper, Principal Components as a Measure of Systemic Risk. The authors described the ratio as an implied measure of systemic risk that captures the extent to which markets are unified or tightly coupled. See https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1582687 for the full paper.

The basic idea is to use building blocks of sub-assets or strategies to measure how tightly a system of assets is trading. In this post we'll use U.S. sectors as the building blocks to examine the U.S. stock market. More specifically, we'll examine linear combinations of U.S. sectors that explain the variance of the U.S. stock market. Principal Component Analysis (PCA) solves this objective iteratively. PCA first finds the linear combination of sectors that explains the largest share of the market's variance. It then finds the next linear combination that explains the largest share of the remaining variance (the variance not explained by the first combination), and repeats this process until all of the variance has been explained. The absorption ratio gets its name from the idea of calculating how much of the market's variance is "absorbed" by the first few combinations of sectors (two, in our case).
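To illustrate the iterative idea, here's a minimal sketch with simulated (made-up) returns using base R's prcomp; each successive component explains a share of the variance left over by the components before it.

set.seed(42)
# 250 days of simulated returns for 10 hypothetical sectors
sim_ret <- matrix(rnorm(2500, sd = 0.01), ncol = 10)
pca <- prcomp(sim_ret)
# proportion of total variance explained by each successive component
round(pca$sdev^2 / sum(pca$sdev^2), 3)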

Gather data

The first step in calculating the ratio is to gather and clean the data. In this case our data are a time-series of U.S. sector returns. This post uses the 10-industry time-series from Ken French's data library, which contains daily returns for 10 industries dating back to 1926. The function below downloads and cleans these return time-series. We'll assign our sector time-series to the data.frame named ret.

download_french <- function(data_url, skip) {

  tmp <- tempfile()
  base_url <- 'http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/'
  full_url <- paste0(base_url, data_url)
  # use binary mode so the zip file downloads correctly on Windows
  download.file(full_url, destfile = tmp, mode = 'wb')
  uzfile <- unzip(tmp)
  # strip whitespace, convert to numeric, recode the +/- 99.99
  # missing-value flags as NA, and convert from percent to decimal
  clean_french_ret <- function(x) {
    x <- gsub(' ', '', x)
    x <- as.numeric(x)
    x[x >= 99.99] <- NA
    x[x <= -99.99] <- NA
    x / 100
  }
  dat <- read.csv(uzfile, skip = skip)
  dat$X <- as.Date(as.character(dat$X), format = '%Y%m%d')
  dat[, 2:ncol(dat)] <- apply(dat[, 2:ncol(dat), drop = FALSE], 2, 
                              clean_french_ret)
  colnames(dat)[1] <- 'date'
  # the CSV appends annual data below the daily series; keep only the
  # rows before the first row that fails to parse as a date
  first_missing <- which(is.na(dat$date))[1]
  if (is.na(first_missing)) return(dat)
  dat_out <- dat[1:(first_missing - 1), ]
  return(dat_out)
}
ret <- download_french('10_Industry_Portfolios_daily_CSV.zip', 9)
head(ret)
summary(ret)

Calculating the absorption ratio

According to the paper, there are two steps to calculating a meaningful absorption ratio: first calculate the ratio itself, then find the standardized change of the ratio over time. The standardized change is essentially a z-score that lets us detect unusually large increases in the ratio.

Formally, the ratio is defined as the fraction of total variance absorbed by the first n eigenvectors.

$$ AR = \frac{\sum_{i=1}^{n}\sigma^2_{E_i}}{\sum_{j=1}^{N}\sigma^2_{A_j}} $$

The numerator is the variance explained by the first n eigenvectors (in this case, linear combinations of sectors). The denominator is the total variance of all N assets (again, in this case, sectors). The paper suggests setting n to roughly 1/5 of the total number of assets. We have 10 sectors, so n will be 2. There are a variety of methods for choosing how many eigenvectors to keep in PCA, which I plan to write about in another post; for now we'll stick with 2.
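To make the formula concrete, here's a toy calculation with made-up eigenvalues and n = 2:

# hypothetical eigenvalues, one per linear combination, for illustration
eigenvalues <- c(7.5, 0.8, 0.5, 0.4, 0.3, 0.2, 0.1, 0.1, 0.05, 0.05)
n <- 2
sum(eigenvalues[1:n]) / sum(eigenvalues)  # (7.5 + 0.8) / 10 = 0.83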

To calculate the ratio we can use the base R function svd to compute the singular value decomposition of the covariance matrix of our returns ret. Because a covariance matrix is symmetric and positive semi-definite, its singular values equal its eigenvalues, so the ratio is equivalent to the sum of the first n eigenvalues divided by the sum of all the eigenvalues.
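A quick sanity check of that equivalence with made-up data:

set.seed(1)
# for a symmetric positive semi-definite matrix, svd and eigen agree
m <- cov(matrix(rnorm(300), ncol = 3))
all.equal(svd(m)$d, eigen(m)$values)  # TRUE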

Before writing the function let's look at the iterative process of explaining variance.

cov_mat <- cov(ret[, 2:ncol(ret)])
s <- svd(cov_mat)
plot_df <- data.frame(eigen = 1:10, var_expl = cumsum(s$d) / sum(s$d), 
                 fill = LETTERS[1:10])
ggplot(plot_df, aes(x = eigen, y = var_expl, fill = fill)) +
  geom_bar(stat = 'identity', position = 'dodge') +
  xlab('Eigenvalues') +
  ylab('Cumulative Variance Explained') +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = 1:10) +
  scale_fill_manual(values = c('darkgrey', 'indianred', rep('darkgrey', 8))) +
  annotate('text', x = 2, y = 0.9, 
           label = paste('Absorption', 'Ratio', sep = '\n')) +
  theme_bw() +
  theme(legend.position = 'none')

We can see the first combination of sectors explains around 75% of the total variance. This reading makes sense: the U.S. sectors mostly trade together because they're driven by the same underlying factors. The second combination of sectors pushes the explained variance over 80%. This is our absorption ratio. Notice how the cumulative variance explained levels off as we move from left to right. This matches an economic intuition about the U.S. stock market. One dynamic (e.g., the market beta) drives most of the system's risk. Additional breakdowns such as growth vs. value and defensive vs. cyclical add to our understanding of the total variance (albeit not as much as the market beta). After a certain point, however, additional attempts to explain risk add little.

To calculate the ratio we'll go through the same SVD process and cap our numerator at the first two eigenvalues.

The absorp_ratio function will take the covariance matrix of the returns, cov_mat, as the input and will return the ratio.

absorp_ratio <- function(cov_mat) {

  # slide returns NULL for incomplete windows; pass that through
  if (is.null(cov_mat)) {
    return(NULL)
  }
  s <- svd(cov_mat)
  # variance absorbed by the first two eigenvectors over total variance
  sum(s$d[1:2]) / sum(s$d)
}
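As a quick check, applying the function to the full-sample covariance matrix cov_mat from earlier should land near the roughly 80% reading in the chart above:

absorp_ratio(cov_mat)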

The paper uses an exponentially weighted covariance estimate with half-life decay. I prefer a weighted average estimate, which accomplishes the same goal of placing more emphasis on recent returns while taking less time to calculate than an EWMA. We'll write a separate function to handle covariance estimation, cov_wgt_avg, and pass its output to the absorp_ratio function.

cov_wgt_avg <- function(ret) {

  # order most recent first and drop the date column
  ret_ordered <- ret[order(ret$date, decreasing = TRUE), 2:ncol(ret)]
  # blend short (~1 month), medium (~1-6 months), and long (~6 months
  # to 2 years) estimates, weighting the most recent returns heaviest
  cov_short <- cov(ret_ordered[1:21, ]) * 0.7
  cov_med <- cov(ret_ordered[22:126, ]) * 0.15
  cov_long <- cov(ret_ordered[127:504, ]) * 0.15
  cov_short + cov_med + cov_long
}
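For reference, here's a minimal sketch of an exponentially weighted covariance along the lines the paper describes; the helper name cov_ewma and the half_life default are my own illustration, not part of this post's pipeline:

cov_ewma <- function(ret, half_life = 250) {
  x <- as.matrix(ret[, 2:ncol(ret)])
  n <- nrow(x)
  # weights halve every half_life days going back in time, so the most
  # recent row gets the largest weight
  w <- 0.5 ^ ((n:1) / half_life)
  w <- w / sum(w)
  xc <- sweep(x, 2, colSums(x * w))  # subtract weighted column means
  t(xc) %*% (xc * w)                 # weighted covariance
}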

The final piece we need before we start rolling our absorption ratio is a utility function to calculate the standardized change in the ratio over time. We'll follow the paper's convention of comparing a 15-day average of the ratio to a 1-year average.

zscore_win <- function(x) {
  # x arrives from slide in chronological order; compare the most recent
  # 15-day average of the ratio to the most recent 1-year (252-day) average
  yr <- tail(x, 252)
  (mean(tail(yr, 15)) - mean(yr)) / sd(yr)
}
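A quick demonstration on a simulated series where the ratio jumps over the last 15 days:

set.seed(1)
# mostly flat noise for a year, then a made-up jump in the last 15 days
x <- c(rnorm(237), rnorm(15, mean = 5))
zscore_win(x)  # large positive z-score flagging the recent jump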

Now we can begin applying our functions over the time-series on a rolling basis. The slide package does a great job of setting up row-wise iteration over data.frames and data.frame-like structures, and it keeps the rolling calculation much more concise than writing a for loop in R.
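To see how slide iterates, here's a toy example (toy is a made-up data.frame): each .x passed to the function is a data.frame slice holding the current row plus the .before prior rows.

toy <- data.frame(a = 1:5)
slide(toy, ~nrow(.x), .before = 2, .complete = TRUE)
# returns list(NULL, NULL, 3, 3, 3): incomplete windows are NULL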

roll_absorp <- function(ret, roll_win = 504) {

  # rolling weighted-average covariance; with .complete = TRUE the first
  # roll_win entries are NULL
  roll_cov <- slide(ret, ~cov_wgt_avg(.x), .before = roll_win, .complete = TRUE)
  # absorption ratio for each covariance matrix; .x[[1]] with .before = 1
  # picks the prior day's matrix, lagging the ratio by one day
  roll_ar <- slide(roll_cov, ~absorp_ratio(.x[[1]]), .before = 1)
  # standardized change of the ratio over a rolling window
  roll_ar_delta <- slide(unlist(roll_ar), ~zscore_win(.x), .before = roll_win,
                         .complete = TRUE)
  roll_ar_delta_vec <- unlist(roll_ar_delta)
  # pad with NAs so each series lines up with the original dates
  res <- data.frame(date = ret$date, 
                    absorp.ratio = c(rep(NA, 1 + roll_win), unlist(roll_ar)),
                    absorp.delta = c(rep(NA, 1 + roll_win + roll_win), 
                                     roll_ar_delta_vec))
  res
}

res <- roll_absorp(ret)

To examine the ratio over time we'll plot by decade and highlight when the standardized change in the absorption ratio crosses two.

plotdf <- res[, c('date', 'absorp.delta')]
as_decade <- function(x) {
  # e.g., 1987-10-19 -> 1980
  floor(as.numeric(format(x, '%Y')) / 10) * 10
}
plotdf$decade <- as_decade(plotdf$date)
plotdf$decade <- factor(plotdf$decade, unique(plotdf$decade))
ggplot(plotdf, aes(x = date, y = absorp.delta)) +
  geom_line() +
  facet_wrap(~ decade, scales = 'free_x') +
  geom_hline(yintercept = 2, col = 'red') +
  scale_x_date(date_breaks = '2 years', date_labels = '%Y') +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

The ratio is far from perfect. There are false positives throughout the time-series, which is to be expected: the authors point out that a high ratio doesn't necessarily mean a market crash will follow, but instead indicates that conditions are ripe for a shock to propagate quickly. Around notable recent crashes, the time-series crosses two in 1987, 2000, and 2008.

The U.S. sector returns we used are one example of building blocks. We could instead use sector ETFs for more timely data, or use countries to examine the global equity market. We could also compare multiple portfolios to see which has more diversification potential (i.e., a lower absorption ratio through time).
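For instance, here's a quick sketch comparing the full 10-sector universe to a hypothetical portfolio of just the first five sectors (note that absorp_ratio hard-codes the first two eigenvalues, so this is only a rough comparison):

# the portfolio with the lower ratio has more diversification potential
absorp_ratio(cov(ret[, 2:ncol(ret)]))
absorp_ratio(cov(ret[, 2:6]))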


