knitr::opts_chunk$set(echo = TRUE)
In a centered reduced rank regression model, multivariate vectors $Y \in R^N$ and $X \in R^M$ satisfy $\text{E}(X)=0$, $\text{E}(Y) = 0$, $\text{E}(Y|X) = BX$. The $N \times M$ matrix $B$ of regression coefficients may not be of full rank. Possible values of the rank of $B$ range from $0$ ($X$ and $Y$ are independent) to $\min(M,N)$ (the full rank model). If the rank is assumed to be $i$ then model singularities occur when the true data-generating model has rank $j < i$.
This vignette uses simulated data, so we create a convenience function simRR()
to simulate n
observations from a reduced rank regression model of rank r
. The parameters N
and M
determine the length of $Y$ and $X$, respectively; n
is the number of observations; and r
is the rank of the coefficient matrix $B$. The output is a list of matrices X
and Y
such that each column represents an individual observation (i.e. X
is transposed with respect to the usual design matrix)
simRR <- function(N, M, n, r) { sing.vals = (r:1) / r + 1 / r B = rortho(N)[,1:r] %*% diag(sing.vals) %*% rortho(M)[1:r,] X = matrix(rnorm(M * n), M, n) Y = B %*% X + matrix(rnorm(N * n), N, n) list(X=X, Y=Y) }
The simRR()
function uses a convenience function rortho()
to generate a random orthonormal matrix.
rortho = function(n) { return(qr.Q(qr(array(runif(n), dim = c(n, n))))) }
Following Drton and Plummer (2017, Section 5.1) we set the dimensions $M$, $N$ for the simulated data and the true rank $r$ as follows:
M = 15 N = 10 r = 5
The learning coefficients and multiplicities for reduced rank regression were derived by Aoyagi and Watanabe (2005). See Drton and Plummer (2017, Section 2.2) for a discussion.
The ReducedRankRegressions()
function sets up a reduced rank regression model of given dimensions (numResponses
, numCovariates
), and with the rank of the coefficient matrix $B$ up to a maximum value (maxRank
). The return value is an object of class "ReducedRankRegressions" containing data on the learning coefficients and multiplicities.
set.seed(1234) library(sBIC) rreg = ReducedRankRegressions(numResponses=N, numCovariates=M, maxRank=min(M, N))
The output from the ReducedRankRegressions()
function is combined with the data in the sBIC()
function to produce the singular BIC as well as the Schwarz BIC. Here we use a simulated data set XY
with 100 observations.
XY = simRR(N, M, 100, r) results = sBIC(XY, rreg) results
The rank selected by choosing the maximum BIC or sBIC is given below. Note that since the smallest model has rank 0, we must subtract 1 from the index to get the rank of the optimal model.
which.max(results$BIC) - 1 which.max(results$sBIC) -1
The asymptotic properties of BIC and sBIC can be further studied by simulating multiple data sets from the reduced rank regression model and comparing the results of rank selection (i.e. choosing the optimal model according to BIC or sBIC).
Here we set up the simulation parameters. We consider sample sizes ranging from 50 up to 1000 and simulate sims=200
data sets for each sample size.
set.seed(1234) n = c(50, 100, 200, 300, 500, 1000) sims = 200
The code below carries out the rank selection and produces a list of frequency tables tt
for the optimal model chosen by BIC and sBIC.
tt <- vector("list", length(n)) for (i in seq_along(n)) { rank.bic = rank.sbic = factor(rep(0, sims), levels = 0:min(M, N)) for (j in 1:sims) { rreg = ReducedRankRegressions(N, M, min(M, N)) XY = simRR(N, M, n[i], r) results = sBIC(XY, rreg) rank.bic[j] = which.max(results$BIC) - 1 rank.sbic[j] = which.max(results$sBIC) - 1 } tt[[i]] = rbind("BIC"=table(rank.bic),"sBIC"=table(rank.sbic)) }
The code below visualizes the frequency tables in tt
as a series of bar plots. The output is shown in Figure 1.
par(mfrow=c(2,3), mar=c(2,1,3,1)+0.1, oma=c(0,2,1,0)) for (i in 1:6) { barplot(tt[[i]], beside=TRUE, ylim=c(0, sims), col=c("white","black"), legend=c("BIC", "sBIC"), main=paste("n=",n[i])) }
Figure 1 partly reproduces Figure 2 of Drton and Plummer (2017). The latter also includes a comparison with the Widely Applicable Bayesian Information Criterion (WBIC) of Watanabe (2013).
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.