Dimension Reduction Techniques for Conditional Quantiles in R: A Vignette
In quantdr: Dimension Reduction Techniques for Conditional Quantiles

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

Data visualization and non-parametric fitting can become very challenging when data is high-dimensional. This has led to the development of many dimension reduction techniques that focus on the entire conditional distribution of the data. However, when specific aspects of the conditional distribution are of interest, these methods can provide more directions than necessary.

Quantile regression (QR) has received a lot of attention since its inception from Koenker and Bassett (1978) and it has been an attractive tool for research areas where non-central parts of the data are important. Dimension reduction techniques for conditional quantiles have recently made an appearance and are still in development. For some references see Wu et al. (2010), Kong and Xia (2012, 2014), Luo et al. (2014), and Christou and Akritas (2016).

More recently, Christou (2020) proposed a new technique for finding the fewest linear combinations of the predictor variable $\mathbf{X}$ that contain all the information about the conditional quantile. To give further intuition, for a univariate response $Y$ and a $p$-dimensional predictor $\mathbf{X}$, the method focuses on finding a $p \times d_{\tau}$ matrix $\mathbf{B}{\tau}$, for $\tau \in (0, 1)$ and $d{\tau} \leq p$, such that $Y$ and $Q_{\tau}(Y|\mathbf{X})$ are conditionally independent given $\mathbf{B}{\tau}^{\top} \mathbf{X}$, where $Q{\tau}(Y|\mathbf{X})$ denotes the $\tau$th conditional quantile of $Y$ given $\mathbf{X}$. The space spanned by $\mathbf{B}{\tau}$ is called the $\tau$th quantile dimension reduction subspace for the regression of $Y$ on $\mathbf{X}$. The intersection of all $\tau$th dimension reduction subspaces is called the $\tau$th central quantile subspace ($\tau$-CQS) and is denoted by $\mathcal{S}{Q_{\tau}(Y|\mathbf{X})}$.

The purpose of this vignette is to demonstrate the implementation of the functions appearing in the quantdr package, along with some examples. The main function of the package is cqs which returns the estimated directions of the $\tau$-CQS.

Getting Started

To get started you need to first install the package using the command

install.packages("quantdr")

and then use it during any R session by issuing the command

library(quantdr)

For an overview of the available help files of the package use

help(package = "quantdr")

The package includes the following functions:

llqr Local linear quantile regression
llqrcv Cross-Validation for bandwidth selection of local linear quantile regression
cqs Central quantile subspace
ValAR Value-at-Risk estimation using the central quantile subspace

To get help on a specific function type help(function.name) or ?function.name for a convenient shorthand. Try

help(cqs)
?cqs

For further examples, refer to the coding examples listed in the documentation for each quantdr function.

Central quantile subspace

Overview

The overall goal of the cqs function is to identify the coefficients of the linear combination $\mathbf{B}{\tau}^{\top}\mathbf{X}$ in order to replace the $p \times 1$ predictor vector $\mathbf{X}$ with the low-dimensional $d{\tau} \times 1$ predictor vector $\mathbf{B}_{\tau}^{\top}\mathbf{X}$. To do that, Christou (2020) proved two important results:

$\boldsymbol{\beta}{\tau} \in \mathcal{S}{Q_{\tau}(Y|\mathbf{X})}$, where $\boldsymbol{\beta}{\tau}$ is the slope vector from regressing the conditional quantile on $\mathbf{X}$, i.e., \begin{eqnarray} (\alpha_{\tau}, \boldsymbol{\beta}{\tau}) = \arg \min{(a_{\tau}, \mathbf{b}{\tau})} E { Q{\tau}(Y| \mathbf{A}^{\top} \mathbf{X}) - a_{\tau} - \mathbf{b}_{\tau}^{\top}\mathbf{X}}^2, \end{eqnarray} and $\mathbf{A}$ spans the central subspace (Li 1991). This implies an initial dimension reduction using $\mathbf{A}$, making the nonparametric estimation of $Q{\tau}(Y|\mathbf{A}^{\top}\mathbf{X})$ tractable.
$E{Q_{\tau} (Y|U_{\tau})\mathbf{X}} \in \mathcal{S}{Q{\tau}(Y|\mathbf{X})}$, where $U_{\tau}$ is a measurable function of $\mathbf{B}{\tau}^{\top}\mathbf{X}$, provided that $Q{\tau}(Y|U_{\tau})\mathbf{X}$ is integrable.

The above two results imply the following:

If the dimension of the $\tau$-CQS is known to be one, then we can fit a linear regression model of $Q_{\tau}(Y|\mathbf{A}^{\top}\mathbf{X})$ on $\mathbf{X}$ and report the slope vector as the basis vector.
If the dimension of the $\tau$-CQS is known to be greater than one, then we can set $\boldsymbol{\beta}{\tau, 0}=\boldsymbol{\beta}{\tau}$ as the initial vector and then create more vectors using $\boldsymbol{\beta}{\tau,j}=E{Q{\tau}(Y | \boldsymbol{\beta}{\tau, j-1}^{\top}\mathbf{X})\mathbf{X}}$, for $j=1, \dots, p-1$. In order to obtain linearly independent vectors, we can perform an eigenvalue decomposition on $\mathbf{V}{\tau}\mathbf{V}{\tau}^{\top}$, where $\mathbf{V}{\tau}=(\boldsymbol{\beta}{\tau, 0}, \dots, \boldsymbol{\beta}{\tau, p-1})$ and choose the eigenvectors corresponding to the $d_{\tau}$ nonzero eigenvalues.
If the dimension of the $\tau$-CQS is unknown, we can apply the procedure from 2. Then, we can estimate the dimension $d_{\tau}$ using the eigenvalues resulting from the eigenvalue decomposition. Existing techniques such as the Cross-Validation (CV) criterion or the modified-BIC type criterion of Zhu et al. (2010) can be used to determine the structural dimension.

All of the above are incorporated in the cqs function.

The algorithm at a glance

The first step of the algorithm requires to fit a linear regression model of $Q_{\tau}(Y|\mathbf{A}^{\top}\mathbf{X})$ on $\mathbf{X}$, which implies the non-parametric estimation of the conditional quantile. The second step of the algorithm relies on estimating an expectation and performing an eigenvalue decomposition. Specifically,

We use a dimension reduction technique to estimate the basis matrix $\mathbf{A}$ of the central subspace, denoted by $\widehat{\mathbf{A}}$, and form the new sufficient predictors $\widehat{\mathbf{A}}^{\top} \mathbf{X}_{i}$, $i=1, \dots, n$. For this we use the function dr from the package dr.
For each $i=1, \dots, n$, we use the local linear conditional quantile estimation method of Guerre and Sabbah (2012) to estimate $Q_{\tau}(Y|\widehat{\mathbf{A}}^{\top} \mathbf{X}_{i})$. For this we use the llqr of the presented package.
We set $\widehat{\boldsymbol{\beta}}_{\tau}$ to be \begin{eqnarray} (\widehat{\alpha}{\tau}, \widehat{\boldsymbol{\beta}}{\tau}) = \arg \min_{(a_{\tau}, \mathbf{b}{\tau})} \sum{i=1}^{n} {\widehat{Q}{\tau}(Y|\widehat{\mathbf{A}}^{\top}\mathbf{X}{i}) - a_{\tau} - \mathbf{b}{\tau}^{\top} \mathbf{X}{i}}^2. \end{eqnarray} For this we use the lm function of the stats package.
If $d_{\tau}=1$ we stop and report $\widehat{\boldsymbol{\beta}}{\tau}$ as the estimated basis vector for $\mathcal{S}{Q_{\tau}(Y|\mathbf{X})}$. Otherwise, we move to Step 5.
We set $\widehat{\boldsymbol{\beta}}{\tau, 0}=\widehat{\boldsymbol{\beta}}{\tau}$, and, for $j=1, \dots, p-1$, we A. form the predictors $\widehat{\boldsymbol{\beta}}{\tau, j-1}^{\top} \mathbf{X}{i}$, $i=1, \dots, n$ and estimate $Q_{\tau}(Y| \widehat{\boldsymbol{\beta}}{\tau, j-1}^{\top} \mathbf{X}{i})$ using the local linear conditional quantile estimation method. For this we use the llqr function of the presented paper. A. form $\widehat{\boldsymbol{\beta}}{\tau, j} = n^{-1} \sum{i=1}^{n} \widehat{Q}{\tau}(Y|\widehat{\boldsymbol{\beta}}{\tau, j-1}^{\top} \mathbf{X}{i}) \mathbf{X}{i}$.
We form the $p \times p$ matrix $\widehat{\mathbf{V}}{\tau}=(\widehat{\boldsymbol{\beta}}{\tau, 0}, \dots, \widehat{\boldsymbol{\beta}}{\tau, p-1})$ and choose the $d{\tau}$ eigenvectors corresponding to the $d_{\tau}$ largest eigenvalues of $\widehat{\mathbf{V}}{\tau} \widehat{\mathbf{V}}{\tau}^{\top}$. For this we use the eigen function of the base package.

If $d_{\tau}$ is unknown, a suggested structural dimension can be estimated using existing techniques such as the CV criterion or the modified-BIC type criterion of Zhu et al. (2010). In terms of the cqs function, the modified-BIC type criterion is utilized and returns a suggested dimension. Note that the user can extract the eigenvalues and apply them to any other existing technique.

Output of `cqs`

For the purpose of this section, we simulate data from the homoscedastic single-index model \begin{eqnarray} Y=3X_{1}+X_{2}+\epsilon, \end{eqnarray} where $\mathbf{X}=(X_{1}, \dots, X_{10})^{\top}$ and the error $\epsilon$ are generated according to a standard normal distribution. The $\tau$-CQS is spanned by $(3, 1, 0, \dots, 0)^{\top}$ for all $\tau$. We focus on the estimation of the 0.5-CQS.

set.seed(1234)
n <- 100
p <- 10
tau <- 0.5
x <- matrix(rnorm(n * p), n, p)
error <- rnorm(n)
y <- 3 * x[, 1] + x[, 2] + error

The cqs function proceeds by specifying an $n \times p$ design matrix $\mathbf{X}$, a vector of the response variable $Y$, and a quantile level $\tau$. The dimension of the $\tau$-CQS, i.e., $d_{\tau}$, is optional. Note that for this example $d_{\tau}=1$ for every quantile level $\tau$.

We now discuss the output of the cqs function.

out1 <- cqs(x, y, tau = tau, dtau = 1)
out1

When the dimension of the $\tau$-CQS is known to be one, the algorithm reports the ordinary least-squares slope vector as the basis vector of the subspace. This is denoted by qvectors. Moreover, the function cqs outputs dtau, which in this case is specified by the user.

When the dimension of the $\tau$-CQS is unknown (or known to be greater than one), the algorithm continues, creates more vectors, and performs an eigenvalue decomposition. Therefore, the output includes one more element, the eigenvalues provided by the eigenvalue decomposition and denoted by qvalues.

out2 <- cqs(x, y, tau = tau)
out2

Note that the dimension dtau is correctly estimated by the modified-BIC type criterion. Therefore, this output suggests to take the first eigenvector as the basis vector for the $\tau$-CQS. We can extract the direction using

out2$qvectors[, 1:out2$dtau]

If we want to measure the estimation error we can use the angle between the true and the estimated subspaces. This is performed using the subspace function of the pracma package. Note that the angle is measured in radians, and so we divide by $\pi / 2$.

library(pracma)
beta_true <- c(3, 1, rep(0, p - 2))
beta_hat1 <- out1$qvectors
beta_hat2 <- out2$qvectors[, 1:out2$dtau]
subspace(beta_true, beta_hat1) / (pi / 2)
subspace(beta_true, beta_hat2) / (pi / 2)

Note that the estimation error is slightly higher when the dimension of the subspace is unknown and needs to be estimated.

We can continue further and estimate the conditional quantile function. To do this, we first form the new sufficient predictor $\widehat{\mathbf{B}}_{\tau}^{\top}\mathbf{X}$ using

newx <- x %*% beta_hat1

We then estimate the conditional quantile using the llqr function of the presented package. The main arguments of the function are: the design matrix, which in this case is $\widehat{\mathbf{B}}_{\tau}^{\top}\mathbf{X}$, the vector of the response variable $Y$, and the quantile level $\tau$. The rest of the arguments, i.e., the bandwidth $h$, the method for the estimation of $h$, and the single observation $x_0$ for which to perform the estimation, are optional. If $h$ is not specified, then it will be determined using either the rule-of-thumb bandwidth of Yu and Jones (1998) or the CV criterion; the default choice is the rule-of-thumb. However, the user needs to be careful about the bandwidth selection. When the dimension of the predictor variable is large compared to the sample size, local linear fitting meets the 'curse of dimensionality' problem. In situations like that, the bandwidth selected by the rule-of-thumb or the CV criterion might be small and cause the function to fail. For these cases, we advice the user to pre-specify the bandwidth in the function. Finally, if $x_0$ is not specified, the estimation will be performed on the entire design matrix. For this example we use

qhat1 <- llqr(newx, y, tau)
qhat1

The output consists of the estimation of the conditional quantile function at each point of the design matrix, i.e., $\widehat{Q}{\tau}(Y|\widehat{\mathbf{B}}{\tau}^{\top} \mathbf{X}_{i})$, $i=1, \dots, n$, and the estimation of the bandwidth using the rule-of-thumb. For illustration purposes, we repeat the above using the CV criterion for the bandwidth selection.

qhat2 <- llqr(newx, y, tau, method = "CV")
qhat2

If $h$ is pre-specified, then the output reports the value given by the user, i.e.,

qhat3 <- llqr(newx, y, tau, h = 1)
qhat3

To illustrate the median regression, compared with the original data, we can use ``` {r fig1, fig.height = 4.5, fig.width = 4.5, fig.align = "center"} true_dir <- x %*% beta_true

plot(true_dir, y, xlab = "sufficient direction", ylab = "y", pch = 16) points(true_dir, qhat1$ll_est, pch = 16, col = 'red')

The same procedure can be performed for multiple quantile levels.  For example, we estimate the $\tau$-CQS for $\tau= 0.1, 0.25, 0.5, 0.75, 0.9$.  
``` {r}
taus <- c(0.1, 0.25, 0.5, 0.75, 0.9)

out3 <- matrix(0, p, length(taus))
for (i in 1:length(taus)) {
  out3[, i] <- cqs(x, y, tau = taus[i], dtau = 1)$qvectors
}
out3

Similarly, we can plot the data along with the estimated conditional quantile functions for the various quantile levels.
``` {r fig2, fig.height = 5.5, fig.width = 7, fig.align = "center"} newx <- x %*% out3

oldpar <- par(no.readonly = TRUE) par(mfrow=c(2,3)) qhat_tau <- as.null() for (i in 1:length(taus)) { plot(true_dir, y, xlab = "sufficient direction", ylab = "y", main = taus[i], pch = 16) qhat_tau <- llqr(newx[, i], y, tau = taus[i])$ll_est points(true_dir, qhat_tau, pch = 16, col = "red") } par(oldpar)

<!--  ## A Real Data Application -->

# Value-at-Risk Estimation
## Overview

Value-at-risk (VaR) is an important financial quantity directly related to the QR framework.  Specifically, for $\{r_{t}\}_{t=1}^{n}$ a time series of returns, the $\tau$th VaR at time $t$, denoted by $VaR_{\tau}(t)$, is the smallest number for which $P\{r_{t} < -VaR_{\tau}(t) | \mathcal{F}_{t-1}\} = \tau$, where $\mathcal{F}_{t-1}$ denotes the information set at time $t-1$.  Therefore, $-VaR_{\tau}(t)$ is the $\tau$th conditional quantile of $r_{t}$, and we can write $Q_{\tau}(r_{t} | \mathcal{F}_{t-1}) = -VaR_{\tau}(t)$.   

Christou and Grabchak (2019) used the methodology proposed by Christou (2020) to estimate VaR and compare it with existing commonly-used methods.  The estimation of $VaR_{\tau}(t)$ at time $t$ reduces to the estimation of $Q_{\tau}(r_{t} | \mathcal{F}_{t-1})$, which can be performed using the `cqs` and `llqr` functions of the presented package.  Specifically, let $r_{t}$ denote a daily return (usually log-return) and $\mathbf{r}_{t}$ denote a $p$-dimensional vector of covariates consisting of the $p$ past returns, i.e., $\mathbf{r}_{t,p} = (r_{t-1},\dots, r_{t-p})^{\top}$.  We want to estimate $Q_{\tau}(r_{t} | \mathbf{r}_{t,p})$ at time $t$ by assuming that there exists a $p \times d_{\tau}$ matrix $\mathbf{B}_{\tau}$, $d_{\tau} \leq p$, such that $Q_{\tau}(r_{t} | \mathbf{r}_{t,p}) = Q_{\tau}(r_{t} | \mathbf{B}_{\tau}^{\top}\mathbf{r}_{t,p})$.  This implies that $\mathbf{B}_{\tau}^{\top}\mathbf{r}_{t,p}$ contains all the information about $r_{t}$ that is available from $Q_{\tau}(r_{t} | \mathbf{r}_{t,p})$.  Once $\mathbf{B}_{\tau}$ is estimated using the `cqs` function, the sufficient predictors $\widehat{\mathbf{B}}_{\tau}^{\top}\mathbf{r}_{t,p}$ are formed and $Q_{\tau}(r_{t} | \widehat{\mathbf{B}}_{\tau}^{\top}\mathbf{r}_{t,p})$ is estimated using the `llqr` function.    

## Output of `ValAR`

For an illustration of VaR estimation, we use the `edhec` data set available in the `PerformanceAnalytics` package in R.  The data set includes the EDHEC composite hedge fund style index returns.  
```r
library(PerformanceAnalytics)
data(edhec, package = "PerformanceAnalytics")
head(edhec)

To estimate the one-step ahead VaR we need a vector of returns in a standard chronological order. For this, we will select the 'CTA Global Distressed' as a vector of returns. The ValAR function proceeds by specifying the vector of returns $y$, the number $p$ of past observations to be used as the predictor variables, the quantile level $\tau$, a moving window for the number of observations to be used for fitting the model, and a logical operator to indicate whether the returns are given in a standard chronological order (oldest to newest) or not. If a moving window is not specified, then the default value of min(250, n - p) will be used to estimate the conditional quantile. Note that, typical values for a moving window correspond to one or two years of returns. A moving window will be useful if the number of observations is large and old returns will not be of much value to the prediction. Moreover, if the number of observations is large, a moving window will also speed up the algorithm, since the number of observations to be used will be smaller. Finally, if the returns are in reversed chronological order, then the user should either reverse the vector of returns or indicate that using the chronological option of the function.

For this example, we will use $p = 5$, $\tau = 0.05$, and will not specify any moving window so that the default value of min(250, 275 - 5) = 250 will be used. Also, the returns are given in standard chronological order and, since this is the default setting, we do not specify anything for chronological. Note that, the last day of returns is 2019-11-30 and therefore, we are predicting the 5\% VaR for 2019-12-31.

y <- as.vector(edhec[, 2])
n <- length(y)
p <- 5
tau <- 0.05
ValAR(y, p = p, tau = tau)

If we want to compare this number with the 5\% VaR estimated by the historical method, we can use

VaR(y, 0.95, method = 'historical')

For illustration purposes, we will use the last 100 observations as a testing data set for one-step ahead VaR estimation and the rest of the observations as historical data. Specifically, for each of the time points, $t$, of the observations not included in the historical data, we estimate the model based on all of the data up to time $t-1$ and report the 5\% VaR.

size <- 100

VaRest <- as.null(size)
for (i in 1:size){
  VaRest[i] <- ValAR(y[1:(n - size + i - 1)], p, tau)
}

To visualize the results, we can plot the time series of returns with the one-step ahead 5\% VaR forecasts.

plot.ts(y[(n - size + 1):n], ylim = range(y[(n - size + 1):n], VaRest), ylab = 'returns')
lines(VaRest, col = 'red')

img <- png::readPNG("returns1.png")
grid::grid.raster(img)

Moreover, we can calculate the proportion of exceptions, i.e., the number of times $r_{t} < -\widehat{VaR}_{\tau}(t)$ for each time point $t$ of the last 100 observations. This number should be a reasonable estimate for $\tau$. In this case we can see that the estimated proportion is actually 0.05.

sum(y[(n - size + 1):n] < VaRest) / size 
#> [1] 0.05

We can repeat the above procedure for multiple quantile levels, i.e., for $\tau= 0.01, 0.025, 0.05$.

taus <- c(0.01, 0.025, 0.05)
VaRest <- matrix(0, size, length(taus))
for (i in 1:size) {
  for (j in 1:length(taus)) {
  VaRest[i, j] <- ValAR(y[1:(n - size + i - 1)], p, taus[j])
  }
}  

# plots
plot.ts(y[(n - size + 1):n], ylim = range(y[(n - size + 1):n], VaRest), ylab = 'returns')
lines(VaRest[, 1], col = 'red')
lines(VaRest[, 2], col = 'blue')
lines(VaRest[, 3], col = 'green')
legend('top', legend = c("1%", "2.5%", "5%"), col = c("red", "blue", "green"),
       lty=1, cex=1, horiz = T, bty = "n")

img <- png::readPNG("returns2.png")
grid::grid.raster(img)

# proportion of exceptions
sum(y[(n - size + 1):n] < VaRest[, 1]) / size 
#> [1] 0.03
sum(y[(n - size + 1):n] < VaRest[, 2]) / size 
#> [1] 0.03
sum(y[(n - size + 1):n] < VaRest[, 3]) / size 
#> [1] 0.05

We can see that the estimated proportion of exceptions are very close to the true values of 0.01, 0.025, and 0.05.

Conclusion

This vignette provides a brief introduction to the R package quantdr and presents a tutorial on how to implement the basic function cqs. Performing dimension reduction techniques to conditional quantiles is an active research topic and therefore updates and new functions will be incorporated into forthcoming versions of the package. For this reason, this document will be updated accordingly and will be made available through the package. Suggestions and recommendations from the users will be highly appreciated and welcomed. This package is also available through GitHub. Feel free to contact the author at echris15@uncc.edu.

References

Christou, E. (2020) Central quantile subspace. Statistics and Computing, 30, 677–695
Christou, E. and Akritas, M. (2016) Single index quantile regression for heteroscedastic data. Journal of Multivariate Analysis, 150, 169-182
Christou, E. and Grabchak, M. (2019) Estimation of value-at-risk using single index quantile regression. Journal of Applied Statistics, 46(13), 2418–2433
Guerre, E., and Sabbah, C. (2012) Uniform bias study and Bahadur representation for local polynomial estimators of the conditional quantile function. Econometric Theory, 28, 87-129
Koenker, R., and Bassett, G. (1978) Regression quantiles. Econometrica, 46, 33-50
Kong, E., and Xia, Y. (2012) A single-index quantile regression model and its estimation. Econometric Theory, 28, 730-768
Kong, E., and Xia, Y. (2014) An adaptive composite quantile approach to dimension reduction. Annals of Statistics, 42, 1657-1688
Li, K.-C. (1991) Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86, 316-327
Luo, W., Li, B., and Yin, X. (2014) On efficient dimension reduction with respect to a statistical functional of interest. Annals of Statistics, 42, 382-412
Wu, T. Z., Yu, K., and Yu, Y. (2010) Single index quantile regression. Journal of Multivariate Analysis, 101, 1607-1621
Yu, K., and Jones, M. C. (1998) Local linear quantile regression. Journal of the American Statistical Association, 93, 228-238
Zhu, L.-P., Zhu, L.-X., and Feng, Z.-H. (2010) Dimension reduction in regression through cumulative slicing estimation. Journal of the American Statistical Association, 105, 1455-1466