library(knitr)
opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.align = "center",
  fig.retina = 2,
  out.width = "75%",
  dpi = 96
)
knit_hooks$set(pngquant = hook_pngquant)
# Help on bookdown: https://bookdown.org/yihui/bookdown/
# rmarkdown::render("vignettes/EfficientComputations-vignette.Rmd", "bookdown::html_document2")
This note is not intended for users of the package but for developers. It contains compact expressions for the risk $R(\mathbf{w})$, its gradient, the vector $\mathbf{g}$ of risk deviations, and the Jacobian matrix of $\mathbf{g}$, for efficient implementation.
The risk-parity portfolio formulation is of the form [@FengPal2016monograph]: $$\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & R(\mathbf{w})\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1\\ & \mathbf{w}\ge\mathbf{0}, \end{array}$$ where the risk term is of the form (double summation) $R(\mathbf{w}) = \sum_{i,j=1}^{N}(g_{ij}(\mathbf{w}))^{2}$ or simply (single summation) $R(\mathbf{w}) = \sum_{i=1}^{N}(g_{i}(\mathbf{w}))^{2}$.
This problem can be solved directly with a general nonlinear solver (for which we need to compute the risk term $R(\mathbf{w})$, and preferably also its gradient), as well as with the Successive Convex Approximation (SCA) method developed in [@FengPal2015riskparity]. The algorithm iteratively solves a sequence of QP problems of the form: $$\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \tilde{U}\left(\mathbf{w},\mathbf{w}^{k}\right)=\frac{1}{2}\mathbf{w}^T\mathbf{Q}^{k}\mathbf{w}+\mathbf{w}^T\mathbf{q}^{k}+\lambda F\left(\mathbf{w}\right)\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1\\ & \mathbf{w}\ge\mathbf{0}, \end{array}$$ where $$\begin{aligned} \mathbf{g}\left(\mathbf{w}^{k}\right) & \triangleq \left[g_{1}\left(\mathbf{w}^{k}\right),\dots,g_{N}\left(\mathbf{w}^{k}\right)\right]^T,\\ \mathbf{A}^{k}\left(\mathbf{w}^{k}\right) &\triangleq \left[\nabla g_{1}\left(\mathbf{w}^{k}\right),\dots,\nabla g_{N}\left(\mathbf{w}^{k}\right)\right]^T,\\ \mathbf{Q}^{k} &\triangleq 2\left(\mathbf{A}^{k}\right)^T\mathbf{A}^{k}+\tau\mathbf{I},\\ \mathbf{q}^{k} &\triangleq 2\left(\mathbf{A}^{k}\right)^T\mathbf{g}\left(\mathbf{w}^{k}\right)-\mathbf{Q}^{k}\mathbf{w}^{k}. \end{aligned}$$ To implement the SCA method effectively we need efficient computation of the risk contribution terms contained in $\mathbf{g}(\mathbf{w})$ and of their gradients, contained in the Jacobian matrix $\mathbf{A}(\mathbf{w}) = \left[\nabla g_{1}(\mathbf{w}),\dots,\nabla g_{N}(\mathbf{w})\right]^T$.
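For reference, the QP data of one SCA iteration can be assembled directly from $\mathbf{A}$ and $\mathbf{g}$ evaluated at the current iterate. A minimal sketch, assuming A, g, the current iterate w, and a small positive constant tau are already available (the names are illustrative and the QP solver call is omitted):
# sketch of building the QP data for one SCA iteration (assumed inputs: A, g, w, tau)
Qk <- 2 * crossprod(A) + tau * diag(ncol(A))           # Q^k = 2 A^T A + tau I
qk <- 2 * as.vector(t(A) %*% g) - as.vector(Qk %*% w)  # q^k = 2 A^T g - Q^k w
Qk and qk would then be passed, together with the simplex constraints, to a QP solver.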
Notation:
Sigma_w <- as.vector(Sigma %*% w)
r <- w*Sigma_w
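The snippets in this note assume that N, Sigma, w, b, and (for the last two formulations) theta are already defined. A small toy instance to exercise them could be the following (illustrative values only, not part of the package):
# toy instance for running the snippets (illustrative only)
set.seed(42)
N <- 5
X <- matrix(rnorm(1000 * N), ncol = N)
Sigma <- cov(X)                    # covariance matrix
w <- rep(1/N, N)                   # feasible weights (sum to 1)
b <- (1:N)/sum(1:N)                # risk budgets (sum to 1)
theta <- 0.01                      # auxiliary variable for the theta formulations
Sigma_w <- as.vector(Sigma %*% w)
r <- w*Sigma_w                     # risk contributions w_i (Sigma w)_i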
Consider now the risk expression: $$R(\mathbf{w}) = \sum_{i,j=1}^{N}\left(w_{i}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_{i}-w_{j}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_{j}\right)^{2} = \sum_{i,j=1}^{N}(r_i - r_j)^2 = 2N\sum_i r_i^2 - 2\left(\sum_i r_i\right)^2,$$ which can be efficiently coded as
risk <- 2*N*sum(r^2) - 2*sum(r)^2
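The algebraic identity above can be sanity-checked against the naive double sum (purely illustrative):
# brute-force check of the identity sum_{i,j} (r_i - r_j)^2 = 2*N*sum(r^2) - 2*sum(r)^2
risk_check <- 0
for (i in 1:N) for (j in 1:N) risk_check <- risk_check + (r[i] - r[j])^2
all.equal(risk_check, risk)  # TRUE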
Let's compute now the gradient of $R(\mathbf{w})$:
v <- 4*(N*r - sum(r))
Ut <- Sigma*w + diag(Sigma_w)
risk_grad <- t(Ut) %*% v
or
v <- 4*(N*r - sum(r))
risk_grad <- as.vector(Sigma %*% (w*v) + Sigma_w*v)
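The closed-form gradient can be verified numerically with central finite differences, using the toy instance above (R_of, eps, and num_grad are illustrative names):
# numerical gradient check (illustrative)
R_of <- function(w) {
  r <- w * as.vector(Sigma %*% w)
  2*N*sum(r^2) - 2*sum(r)^2
}
eps <- 1e-6
num_grad <- sapply(1:N, function(i) {
  e <- rep(0, N); e[i] <- eps
  (R_of(w + e) - R_of(w - e)) / (2*eps)
})
max(abs(num_grad - risk_grad))  # should be numerically negligible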
For SCA:
However, if we are interested in implementing the SCA method, this is not enough: we also need an expression for the risk deviations contained in $\mathbf{g}$ as well as for its Jacobian matrix $\mathbf{A} = \left[\nabla g_{11},\dots,\nabla g_{NN}\right]^T$.
The risk deviations are
$$g_{ij}(\mathbf{w})=w_i(\boldsymbol{\Sigma}\mathbf{w})_{i}-w_j(\boldsymbol{\Sigma}\mathbf{w})_j = r_i - r_j,$$
which can be efficiently coded as
g <- rep(r, times = N) - rep(r, each = N)
So another way to compute $R(\mathbf{w})$ is with sum(g^2), but it is not as efficient as the previous computation since $\mathbf{g}$ has $N^2$ elements.
Matrix $\mathbf{A}$ is more involved to compute. Using the M-notation [@FengPal2015riskparity]: $$\nabla g_{ij}(\mathbf{w}) = (\mathbf{M}_i + \mathbf{M}_i^T - \mathbf{M}_j - \mathbf{M}_j^T)\mathbf{w}.$$ We can compute efficiently the vectors $\mathbf{u}_i = (\mathbf{M}_i + \mathbf{M}_i^T)\mathbf{w}$ for all $i=1,\ldots,N$ at once: $$\begin{aligned} \left[\mathbf{M}_1\mathbf{w}, \ldots, \mathbf{M}_N\mathbf{w}\right] & = \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w}),\\ \left[\mathbf{M}_1^T\mathbf{w}, \ldots, \mathbf{M}_N^T\mathbf{w}\right] & = \boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w}), \end{aligned}$$ so that $\nabla g_{ij} = \mathbf{u}_i - \mathbf{u}_j$, where $\mathbf{u}_i$ is the $i$-th column of the $N\times N$ matrix $$\mathbf{U} = \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w}) + \boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w}).$$ Finally, to compute efficiently the Jacobian $\mathbf{A}$ without using loops we can write it as $$\mathbf{A} = \mathbf{1}_N \otimes \mathbf{U}^T - \mathbf{U}^T \otimes \mathbf{1}_N$$ which can be compactly coded as
Ut <- diag(Sigma_w) + Sigma*w
A <- matrix(rep(t(Ut), N), ncol = N, byrow = TRUE) - matrix(rep(Ut, each = N), ncol = N)
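Since $R(\mathbf{w})=\sum_{i,j}g_{ij}^2$, the chain rule gives $\nabla R = 2\mathbf{A}^T\mathbf{g}$, which provides a cheap consistency check of the Jacobian against the gradient computed above (illustrative):
# consistency check: the gradient of sum(g^2) must equal 2 * t(A) %*% g
max(abs(2 * as.vector(t(A) %*% g) - risk_grad))  # should be numerically zero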
Consider now the risk expression: $$R(\mathbf{w}) = \sum_{i,j=1}^{N}\left(\frac{w_i\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i}{b_i} - \frac{w_j\left(\boldsymbol{\Sigma}\mathbf{w}\right)_j}{b_j}\right)^{2} = \sum_{i,j=1}^{N}\left(\frac{r_i}{b_i} - \frac{r_j}{b_j}\right)^2 = 2N\sum_i \left(\frac{r_i}{b_i}\right)^2 - 2\left(\sum_i \frac{r_i}{b_i}\right)^2,$$ which can be efficiently coded as
rb <- r/b
risk <- 2*N*sum(rb^2) - 2*sum(rb)^2
Let's compute now the gradient of $R(\mathbf{w})$:
$\frac{\partial R}{\partial r_i}=4\left(N r_i/b_i - \sum_j r_j/b_j\right)/b_i$ $\Longrightarrow$ $\nabla_{\mathbf{r}}R = 4\left(N(\mathbf{r}/\mathbf{b}) - \left(\mathbf{1}^T(\mathbf{r}/\mathbf{b})\right)\mathbf{1}\right)/\mathbf{b}$
$\nabla_{\mathbf{w}}R = (\textsf{J}_{\mathbf{w}}\mathbf{r})^T \cdot \nabla_{\mathbf{r}}R$ with $\textsf{J}_{\mathbf{w}}\mathbf{r} = \textsf{Diag}(\mathbf{w})\boldsymbol{\Sigma} + \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w})$:
$$\nabla_{\mathbf{w}}R = 4(\boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w}) + \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w})) \left(N(\mathbf{r}/\mathbf{b}) - \left(\mathbf{1}^T(\mathbf{r}/\mathbf{b})\right)\mathbf{1}\right)/\mathbf{b},$$ which can be coded as
rb <- r/b
v <- 4*(N*rb - sum(rb))/b
Ut <- Sigma*w + diag(Sigma_w)
risk_grad <- t(Ut) %*% v
or
rb <- r/b
v <- 4*(N*rb - sum(rb))/b
risk_grad <- as.vector(Sigma %*% (w*v) + Sigma_w*v)
For SCA:
Using the M-notation [@FengPal2015riskparity], we can write the risk deviations as
$$g_{ij}(\mathbf{w}) = \mathbf{w}^T\left(\frac{\mathbf{M}_i}{b_i} - \frac{\mathbf{M}_j}{b_j}\right)\mathbf{w} = r_i/b_i - r_j/b_j,$$
which can be efficiently coded as
rb <- r/b
g <- rep(rb, times = N) - rep(rb, each = N)
The gradients can be written as [@FengPal2015riskparity]: $$\nabla g_{ij}(\mathbf{w}) = \left(\frac{\mathbf{M}_i}{b_i} + \frac{\mathbf{M}_i^T}{b_i} - \frac{\mathbf{M}_j}{b_j} - \frac{\mathbf{M}_j^T}{b_j}\right)\mathbf{w}.$$ Recalling that $\mathbf{u}_i = (\mathbf{M}_i + \mathbf{M}_i^T)\mathbf{w}$ can be conveniently expressed as the $i$-th column of the $N\times N$ matrix $\mathbf{U} = \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w}) + \boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w})$, then we can write $\nabla g_{ij} = \mathbf{u}_i/b_i - \mathbf{u}_j/b_j$. Finally, to compute efficiently the Jacobian $\mathbf{A}$ without using loops we can write it as $$\mathbf{A} = \mathbf{1}_N \otimes \left(\mathbf{U}\textsf{Diag}(1/\mathbf{b})\right)^T - \left(\mathbf{U}\textsf{Diag}(1/\mathbf{b})\right)^T \otimes \mathbf{1}_N$$ which can be compactly coded as
Ut <- diag(Sigma_w) + Sigma*w
Utb <- Ut / b
A <- matrix(rep(t(Utb), N), ncol = N, byrow = TRUE) - matrix(rep(Utb, each = N), ncol = N)
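As a reference against the vectorized construction, the rows of $\mathbf{A}$ can also be formed with an explicit double loop, using the same ordering of the pairs $(i,j)$ as in g above (purely illustrative):
# loop-based reference construction of A (rows ordered as in g)
A_ref <- matrix(0, N^2, N)
for (j in 1:N) for (i in 1:N) A_ref[(j-1)*N + i, ] <- Utb[i, ] - Utb[j, ]
max(abs(A_ref - A))  # should be zero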
Consider now the following risk expression with a single index: $$R(\mathbf{w}) = \sum_{i=1}^{N}\left(\frac{w_{i}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i}{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}}-b_i\right)^{2} = \sum_{i=1}^{N}\left(\frac{r_i}{\mathbf{1}^T\mathbf{r}}-b_i\right)^{2},$$ which can be efficiently coded as
risk <- sum((r/sum(r) - b)^2)
Let's compute now the gradient of $R(\mathbf{w})$:
$$\begin{aligned} \frac{\partial R}{\partial r_j} & = 2\sum_i\left(\frac{r_i}{\mathbf{1}^T\mathbf{r}}-b_i\right)\left(\frac{\delta_{ij}}{\mathbf{1}^T\mathbf{r}}-\frac{r_i}{(\mathbf{1}^T\mathbf{r})^2} \right)\\ & = 2\sum_i\left(\frac{r_i}{\mathbf{1}^T\mathbf{r}}-b_i\right)\frac{\delta_{ij}}{\mathbf{1}^T\mathbf{r}} - 2\sum_i\left(\frac{r_i}{\mathbf{1}^T\mathbf{r}}-b_i\right)\frac{r_i}{(\mathbf{1}^T\mathbf{r})^2}\\ & = \frac{2}{\mathbf{1}^T\mathbf{r}}\left(\left(\frac{r_j}{\mathbf{1}^T\mathbf{r}}-b_j\right) - \left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}-\mathbf{b}\right)^T\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}\right) \end{aligned}$$ so that $$\nabla_{\mathbf{r}}R = \frac{2}{\mathbf{1}^T\mathbf{r}}\left(\left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}-\mathbf{b}\right) - \mathbf{1}\cdot\left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}-\mathbf{b}\right)^T\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}\right)$$
$$\nabla_{\mathbf{w}}R = \frac{2}{\mathbf{1}^T\mathbf{r}}(\boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w}) + \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w})) \left(\left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}-\mathbf{b}\right) - \mathbf{1}\cdot\left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}-\mathbf{b}\right)^T\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}\right),$$ which can be coded as
sum_r <- sum(r)
r_sumr_b <- r/sum_r - b
v <- (2/sum_r)*(r_sumr_b - sum(r_sumr_b*r)/sum_r)
risk_grad <- as.vector(Sigma %*% (w*v) + Sigma_w*v)
For SCA:
Using the M-notation [@FengPal2015riskparity], we can write the risk deviations as
$$g_i(\mathbf{w}) = \frac{\mathbf{w}^T\mathbf{M}_i\mathbf{w}}{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}} - b_i = \frac{r_i}{\mathbf{1}^T\mathbf{r}}-b_i,$$
which can be efficiently coded as
g <- r/sum(r) - b
The gradients can be written as [@FengPal2015riskparity]: $$\nabla g_{i}(\mathbf{w}) = \frac{(\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w})(\mathbf{M}_i + \mathbf{M}_i^T)\mathbf{w} - (\mathbf{w}^T\mathbf{M}_i\mathbf{w})(2\boldsymbol{\Sigma})\mathbf{w}}{(\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w})^2}.$$ Recalling that $\mathbf{u}_i = (\mathbf{M}_i + \mathbf{M}_i^T)\mathbf{w}$ can be conveniently expressed as the $i$-th column of the $N\times N$ matrix $\mathbf{U} = \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w}) + \boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w})$, then we can write $$\nabla g_{i}(\mathbf{w}) = \frac{(\mathbf{1}^T\mathbf{r})\mathbf{u}_i - 2r_i\boldsymbol{\Sigma}\mathbf{w}}{(\mathbf{1}^T\mathbf{r})^2} = \frac{\mathbf{u}_i}{\mathbf{1}^T\mathbf{r}} - 2\frac{r_i}{(\mathbf{1}^T\mathbf{r})^2}\boldsymbol{\Sigma}\mathbf{w}.$$ Finally, we can compute efficiently the Jacobian $\mathbf{A}$ without loops as $$\mathbf{A} = \frac{\mathbf{U}^T}{\mathbf{1}^T\mathbf{r}} - \frac{2}{(\mathbf{1}^T\mathbf{r})^2}\mathbf{r}\mathbf{w}^T\boldsymbol{\Sigma}$$ which can be compactly coded as
sum_r <- sum(r)
Ut <- diag(Sigma_w) + Sigma*w
A <- Ut/sum_r - 2/(sum_r^2) * r %o% Sigma_w
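The Jacobian can be verified numerically with central finite differences of $\mathbf{g}$, again using the toy instance above (g_of, eps, and A_num are illustrative names):
# numerical Jacobian check (illustrative)
g_of <- function(w) {
  r <- w * as.vector(Sigma %*% w)
  r/sum(r) - b
}
eps <- 1e-6
A_num <- sapply(1:N, function(k) {
  e <- rep(0, N); e[k] <- eps
  (g_of(w + e) - g_of(w - e)) / (2*eps)
})
max(abs(A_num - A))  # should be numerically negligible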
Consider now the following risk expression with a single index, which is a particular case of the previous one "rc-over-var-vs-b" setting $b_i=0$: $$R(\mathbf{w}) = \sum_{i=1}^{N}\left(\frac{w_{i}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i}{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}}\right)^{2} = \sum_{i=1}^{N}\left(\frac{r_i}{\mathbf{1}^T\mathbf{r}}\right)^{2},$$ which can be efficiently coded as
risk <- sum((r/sum(r))^2)
The gradient of the risk $R(\mathbf{w})$ is $$\nabla_{\mathbf{w}}R = \frac{2}{\mathbf{1}^T\mathbf{r}}(\boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w}) + \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w})) \left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}} - \mathbf{1}\cdot\frac{\mathbf{r}^T\mathbf{r}}{(\mathbf{1}^T\mathbf{r})^2}\right),$$ which can be coded as
sum_r <- sum(r)
r_sumr <- r/sum_r
v <- (2/sum_r)*(r_sumr - sum(r_sumr^2))
risk_grad <- as.vector(Sigma %*% (w*v) + Sigma_w*v)
For SCA:
Using the M-notation [@FengPal2015riskparity], we can write the risk deviations as
$$g_i(\mathbf{w}) = \frac{\mathbf{w}^T\mathbf{M}_i\mathbf{w}}{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}} = \frac{r_i}{\mathbf{1}^T\mathbf{r}},$$
which can be efficiently coded as
g <- r/sum(r)
The Jacobian $\mathbf{A}$ is the same as in the previous formulation "rc-over-var-vs-b".
Consider now the following risk expression: $$R(\mathbf{w}) = \sum_{i=1}^{N}\left(\frac{w_{i}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i}{\sqrt{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}}}-b_i\sqrt{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}}\right)^{2} = \sum_{i=1}^{N}\left(\frac{r_i}{\sqrt{\mathbf{1}^T\mathbf{r}}}-b_i\sqrt{\mathbf{1}^T\mathbf{r}}\right)^{2},$$ which can be efficiently coded as
sqrt_sum_r <- sqrt(sum(r))
risk <- sum((r/sqrt_sum_r - b*sqrt_sum_r)^2)
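Note that each summand equals $(r_i - b_i\mathbf{1}^T\mathbf{r})^2/(\mathbf{1}^T\mathbf{r})$, so the same risk equals $\sum_i(r_i - b_i\mathbf{1}^T\mathbf{r})^2$ divided by $\mathbf{1}^T\mathbf{r}$ (a quick check, purely illustrative):
all.equal(risk, sum((r - b*sum(r))^2) / sum(r))  # TRUE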
Let's compute now the gradient of $R(\mathbf{w})$:
we will use $\frac{\partial\sqrt{\mathbf{1}^T\mathbf{r}}}{\partial r_j} = \frac{1}{2\sqrt{\mathbf{1}^T\mathbf{r}}}$ and $\frac{\partial\left(1/\sqrt{\mathbf{1}^T\mathbf{r}}\right)}{\partial r_j} = \frac{-1}{2\left(\mathbf{1}^T\mathbf{r}\right)\sqrt{\mathbf{1}^T\mathbf{r}}}$
first, w.r.t. $\mathbf{r}$:
$$\begin{aligned} \frac{\partial R}{\partial r_j} & = 2\sum_i\left(\frac{r_i}{\sqrt{\mathbf{1}^T\mathbf{r}}}-b_i\sqrt{\mathbf{1}^T\mathbf{r}}\right)\left(\frac{\delta_{ij}}{\sqrt{\mathbf{1}^T\mathbf{r}}} - \frac{r_i}{\mathbf{1}^T\mathbf{r}}\frac{1}{2\sqrt{\mathbf{1}^T\mathbf{r}}} - b_i\frac{1}{2\sqrt{\mathbf{1}^T\mathbf{r}}}\right)\\ & = 2\sum_i\left(\frac{r_i}{\sqrt{\mathbf{1}^T\mathbf{r}}}-b_i\sqrt{\mathbf{1}^T\mathbf{r}}\right)\frac{\delta_{ij}}{\sqrt{\mathbf{1}^T\mathbf{r}}} - 2\sum_i\left(\frac{r_i}{\sqrt{\mathbf{1}^T\mathbf{r}}}-b_i\sqrt{\mathbf{1}^T\mathbf{r}}\right)\left(\frac{r_i}{\mathbf{1}^T\mathbf{r}}+b_i\right)\frac{1}{2\sqrt{\mathbf{1}^T\mathbf{r}}}\\ & = 2\sum_i\left(\frac{r_i}{\mathbf{1}^T\mathbf{r}}-b_i\right)\delta_{ij} - \sum_i\left(\frac{r_i}{\mathbf{1}^T\mathbf{r}}-b_i\right)\left(\frac{r_i}{\mathbf{1}^T\mathbf{r}}+b_i\right)\\ & = 2\left(\frac{r_j}{\mathbf{1}^T\mathbf{r}}-b_j\right) - \left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}-\mathbf{b}\right)^T\left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}+\mathbf{b}\right) \end{aligned}$$ so that $$\nabla_{\mathbf{r}}R = 2\left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}-\mathbf{b}\right) - \mathbf{1}\cdot\left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}-\mathbf{b}\right)^T\left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}+\mathbf{b}\right)$$
$$\nabla_{\mathbf{w}}R = (\boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w}) + \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w})) \left(2\left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}-\mathbf{b}\right) - \mathbf{1}\cdot\left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}-\mathbf{b}\right)^T\left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}+\mathbf{b}\right)\right),$$ which can be coded as
sum_r <- sum(r)
r_sumr_b <- r/sum_r - b
v <- 2*r_sumr_b - sum(r_sumr_b*(r/sum_r + b))
risk_grad <- as.vector(Sigma %*% (w*v) + Sigma_w*v)
For SCA:
Using the M-notation [@FengPal2015riskparity], we can write the risk deviations as
$$
\begin{equation}
g_i(\mathbf{w}) = \frac{\mathbf{w}^T\mathbf{M}_i\mathbf{w}}{\sqrt{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}}} - b_i \sqrt{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}} = \frac{r_i}{\sqrt{\mathbf{1}^{T}\mathbf{r}}} - b_i \sqrt{\mathbf{1}^{T}\mathbf{r}}
\end{equation}
$$
which can be efficiently coded as
sum_r <- sum(r)
sqrt_sum_r <- sqrt(sum_r)
g <- r/sqrt_sum_r - b*sqrt_sum_r
The gradients can be written as [@FengPal2015riskparity]: $$\nabla g_{i}(\mathbf{w}) = \frac{(\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w})(\mathbf{M}_i + \mathbf{M}_i^T)\mathbf{w} - (\mathbf{w}^T\mathbf{M}_i\mathbf{w})\boldsymbol{\Sigma}\mathbf{w}}{(\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w})^{3/2}} - b_i\frac{\boldsymbol{\Sigma}\mathbf{w}}{\sqrt{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}}}.$$
Recalling that $\mathbf{u}_i = (\mathbf{M}_i + \mathbf{M}_i^T)\mathbf{w}$ can be conveniently expressed as the $i$-th column of the $N\times N$ matrix $\mathbf{U} = \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w}) + \boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w})$, then we can write $$\nabla g_{i}(\mathbf{w}) = \frac{(\mathbf{1}^T\mathbf{r})\mathbf{u}_i - r_i\boldsymbol{\Sigma}\mathbf{w}}{(\mathbf{1}^T\mathbf{r})^{3/2}} - b_i\frac{\boldsymbol{\Sigma}\mathbf{w}}{\sqrt{\mathbf{1}^T\mathbf{r}}} = \frac{\mathbf{u}_i}{\sqrt{\mathbf{1}^T\mathbf{r}}} - \left(\frac{r_i}{(\mathbf{1}^T\mathbf{r})^{3/2}}+\frac{b_i}{\sqrt{\mathbf{1}^T\mathbf{r}}}\right)\boldsymbol{\Sigma}\mathbf{w}.$$ Finally, we can compute the Jacobian $\mathbf{A}$ compactly as $$\mathbf{A} = \left(\mathbf{U}^T - \left(\frac{\mathbf{r}}{\mathbf{1}^T\mathbf{r}}+\mathbf{b}\right)\mathbf{w}^T\boldsymbol{\Sigma}\right)/\sqrt{\mathbf{1}^T\mathbf{r}}$$ which can be compactly coded as
Ut <- diag(Sigma_w) + Sigma*w
A <- (Ut - (r/sum_r + b) %o% Sigma_w) / sqrt(sum_r)
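As before, since $R=\sum_i g_i^2$ we must have $\nabla_{\mathbf{w}}R = 2\mathbf{A}^T\mathbf{g}$, which gives a quick consistency check against the gradient computed above (illustrative):
# consistency check: 2 * t(A) %*% g must reproduce risk_grad
max(abs(2 * as.vector(t(A) %*% g) - risk_grad))  # should be numerically zero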
Consider now the following risk expression with a single index: $$R(\mathbf{w}) = \sum_{i=1}^{N}\left(w_{i}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i - b_i\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}\right)^{2} = \sum_{i=1}^{N}\left(r_i-b_i\mathbf{1}^T\mathbf{r}\right)^{2},$$ which can be efficiently coded as
risk <- sum((r - b*sum(r))^2)
Let's compute now the gradient of $R(\mathbf{w})$:
first, w.r.t. $\mathbf{r}$: $$\frac{\partial R}{\partial r_j} = 2\left(r_j - b_j\mathbf{1}^T\mathbf{r} - \sum_ib_i(r_i - b_i\mathbf{1}^T\mathbf{r})\right)$$ so that $$\nabla_{\mathbf{r}}R = 2\left(\mathbf{r} - \mathbf{b}(\mathbf{1}^T\mathbf{r}) - \mathbf{1}\cdot(\mathbf{b}^T\mathbf{r}-\mathbf{b}^T\mathbf{b}(\mathbf{1}^T\mathbf{r}))\right)$$
$\nabla_{\mathbf{w}}R = (\textsf{J}_{\mathbf{w}}\mathbf{r})^T \cdot \nabla_{\mathbf{r}}R$ with $\textsf{J}_{\mathbf{w}}\mathbf{r} = \textsf{Diag}(\mathbf{w})\boldsymbol{\Sigma} + \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w})$:
$$\nabla_{\mathbf{w}}R = 2(\boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w}) + \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w})) \left(\mathbf{r} - \mathbf{b}(\mathbf{1}^T\mathbf{r}) - \mathbf{1}\cdot(\mathbf{b}^T\mathbf{r}-\mathbf{b}^T\mathbf{b}(\mathbf{1}^T\mathbf{r}))\right),$$ which can be coded as
sum_r <- sum(r)
v <- 2*(r - b*sum_r - sum(b*r) + sum(b^2)*sum_r)
risk_grad <- as.vector(Sigma %*% (w*v) + Sigma_w*v)
For SCA:
Using the M-notation [@FengPal2015riskparity], we can write the risk deviations as
$$g_i(\mathbf{w}) = \mathbf{w}^T\left(\mathbf{M}_i - b_i\boldsymbol{\Sigma}\right)\mathbf{w} = r_i - b_i\mathbf{1}^{T}\mathbf{r}$$
which can be efficiently coded as
g <- r - b*sum(r)
The gradients can be written as [@FengPal2015riskparity]: $$\nabla g_{i}(\mathbf{w}) = \left(\mathbf{M}_i + \mathbf{M}_i^T - 2b_i\boldsymbol{\Sigma}\right)\mathbf{w}.$$
Recalling that $\mathbf{u}_i = (\mathbf{M}_i + \mathbf{M}_i^T)\mathbf{w}$ can be conveniently expressed as the $i$-th column of the $N\times N$ matrix $\mathbf{U} = \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w}) + \boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w})$, then we can write $$\nabla g_{i}(\mathbf{w}) = \mathbf{u}_i - 2b_i\boldsymbol{\Sigma}\mathbf{w}.$$
Finally, we can compute the Jacobian $\mathbf{A}$ compactly as $$\mathbf{A} = \mathbf{U}^T - 2\mathbf{b}\mathbf{w}^T\boldsymbol{\Sigma}$$ which can be compactly coded as
Ut <- diag(Sigma_w) + Sigma*w
A <- Ut - 2 * b %o% Sigma_w
Consider now the following risk expression: $$R(\mathbf{w},\theta) = \sum_{i=1}^{N}\left(w_{i}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i - \theta \right)^{2} = \sum_{i=1}^{N}\left(r_i - \theta\right)^{2},$$ where $\theta$ is also an optimization variable. This expression can be efficiently coded as
risk <- sum((r - theta)^2)
Let's compute now the gradient of $R(\mathbf{w},\theta)$:
first, w.r.t. $\theta$: $$\frac{\partial R}{\partial\theta} = -2\sum_{i=1}^{N}\left(r_i - \theta\right)$$
now, w.r.t. $\mathbf{r}$: $$\nabla_{\mathbf{r}}R = 2(\mathbf{r}-\theta\mathbf{1})$$
$\nabla_{\mathbf{w}}R = (\textsf{J}_{\mathbf{w}}\mathbf{r})^T \cdot \nabla_{\mathbf{r}}R$ with $\textsf{J}_{\mathbf{w}}\mathbf{r} = \textsf{Diag}(\mathbf{w})\boldsymbol{\Sigma} + \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w})$:
$$\nabla_{\mathbf{w}}R = 2(\boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w}) + \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w})) (\mathbf{r}-\theta\mathbf{1}),$$ which can be coded as (note that we use the notation $\nabla R = \left[\begin{array}{c} \nabla_{\mathbf{w}}R\\ \nabla_{\theta}R\end{array}\right]$)
v <- 2*(r - theta)
risk_grad <- c(as.vector(Sigma %*% (w*v) + Sigma_w*v), -sum(v))
For SCA:
Using the M-notation [@FengPal2015riskparity], we can write the risk deviations as
$$
\begin{equation}
g_i(\mathbf{w},\theta) = \mathbf{w}^T\mathbf{M}_i\mathbf{w} - \theta = r_i - \theta
\end{equation}
$$
which can be efficiently coded as
g <- r - theta
The gradients can be written as [@FengPal2015riskparity]: $$\nabla_\mathbf{w} g_{i}(\mathbf{w},\theta) = (\mathbf{M}_i + \mathbf{M}_i^T)\mathbf{w}$$ and $$\nabla_\theta g_{i}(\mathbf{w},\theta) = -1.$$
Recalling that $\mathbf{u}_i = (\mathbf{M}_i + \mathbf{M}_i^T)\mathbf{w}$ can be conveniently expressed as the $i$-th column of the $N\times N$ matrix $\mathbf{U} = \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w}) + \boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w})$, then we can write $\nabla_\mathbf{w} g_{i}(\mathbf{w}) = \mathbf{u}_i$. Finally, we can compute the Jacobian $\mathbf{A}$ compactly as $$\mathbf{A} = \left[\mathbf{U}^T, -\mathbf{1}_N\right]$$ which can be compactly coded as
Ut <- diag(Sigma_w) + Sigma*w
A <- cbind(Ut, -1)
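The same consistency check used for the previous formulations extends to the augmented variable $(\mathbf{w},\theta)$, since the last column of $\mathbf{A}$ holds the derivatives with respect to $\theta$ (illustrative):
# consistency check including the theta component: 2 * t(A) %*% g must reproduce risk_grad
max(abs(2 * as.vector(t(A) %*% g) - risk_grad))  # should be numerically zero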
Consider now the following risk expression: $$R(\mathbf{w},\theta) = \sum_{i=1}^{N}\left(\frac{w_{i}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i}{b_i} - \theta \right)^{2} = \sum_{i=1}^{N}\left(r_i/b_i - \theta\right)^{2},$$ where $\theta$ is also an optimization variable. This expression can be efficiently coded as
risk <- sum((r/b - theta)^2)
Let's compute now the gradient of $R(\mathbf{w},\theta)$:
first, w.r.t. $\theta$: $$\frac{\partial R}{\partial\theta} = -2\sum_{i=1}^{N}\left(r_i/b_i - \theta\right)$$
now, w.r.t. $\mathbf{r}$: $$\nabla_{\mathbf{r}}R = 2(\mathbf{r}/\mathbf{b}-\theta\mathbf{1})/\mathbf{b}$$
$\nabla_{\mathbf{w}}R = (\textsf{J}_{\mathbf{w}}\mathbf{r})^T \cdot \nabla_{\mathbf{r}}R$ with $\textsf{J}_{\mathbf{w}}\mathbf{r} = \textsf{Diag}(\mathbf{w})\boldsymbol{\Sigma} + \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w})$:
$$\nabla_{\mathbf{w}}R = 2(\boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w}) + \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w})) (\mathbf{r}/\mathbf{b}-\theta\mathbf{1})/\mathbf{b},$$ which can be coded as (note that we use the notation $\nabla R = \left[\begin{array}{c} \nabla_{\mathbf{w}}R\\ \nabla_{\theta}R\end{array}\right]$)
v <- 2*(r/b - theta)
vb <- v/b
risk_grad <- c(as.vector(Sigma %*% (w*vb) + Sigma_w*vb), -sum(v))
For SCA:
Using the M-notation [@FengPal2015riskparity], we can write the risk deviations as
$$
\begin{equation}
g_i(\mathbf{w},\theta) = \frac{\mathbf{w}^T\mathbf{M}_i\mathbf{w}}{b_i} - \theta = r_i/b_i - \theta
\end{equation}
$$
which can be efficiently coded as
g <- r/b - theta
The gradients can be written as [@FengPal2015riskparity]: $$\nabla_\mathbf{w} g_{i}(\mathbf{w},\theta) = \left(\frac{\mathbf{M}_i}{b_i} + \frac{\mathbf{M}_i^T}{b_i}\right)\mathbf{w}$$ and $$\nabla_\theta g_{i}(\mathbf{w},\theta) = -1.$$
Recalling that $\mathbf{u}_i = (\mathbf{M}_i + \mathbf{M}_i^T)\mathbf{w}$ can be conveniently expressed as the $i$-th column of the $N\times N$ matrix $\mathbf{U} = \textsf{Diag}(\boldsymbol{\Sigma}\mathbf{w}) + \boldsymbol{\Sigma}\textsf{Diag}(\mathbf{w})$, then we can write $\nabla_\mathbf{w} g_{i}(\mathbf{w}) = \mathbf{u}_i/b_i$. Finally, we can compute the Jacobian $\mathbf{A}$ compactly as $$\mathbf{A} = \left[\textsf{Diag}(\mathbf{1}/\mathbf{b})\mathbf{U}^T, -\mathbf{1}_N\right]$$ which can be compactly coded as
Ut <- diag(Sigma_w) + Sigma*w
A <- cbind(Ut/b, -1)