```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
tsmvr, or Truly Sparse Multivariate Regression, is an R package for solving sparse multivariate regression problems with error covariance estimation. The workhorse algorithm in tsmvr is adapted from the algorithm described by J. Chen and Q. Gu in their 2016 paper "High Dimensional Multivariate Regression and Precision Matrix Estimation via Nonconvex Optimization".
A multivariate regression problem is a regression problem with multiple responses. Formally,
$$Y = XB + E$$
Here, $X$ is the $n \times p$ design matrix of $n$ observations of $p$ features, $Y$ is the $n \times q$ matrix of $n$ observations of $q$ responses, $B$ is the $p \times q$ regression matrix, and $E$ is the error term. Given $X$ and $Y$, tsmvr solves this problem for $B$ under the constraint that $B$ is sparse and the condition that the errors may be correlated. Under the hood, the error correlations are encoded in the precision matrix $\mathbf{\Omega}$, which has its own sparsity constraint.
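For orientation, the underlying estimation problem has, up to constants, the form of a sparsity-constrained Gaussian negative log-likelihood. The display below is a sketch of that standard formulation; see Chen and Gu (2016) for the precise statement:

$$
\min_{B,\,\mathbf{\Omega}}\;\frac{1}{n}\,\operatorname{tr}\!\left[(Y - XB)\,\mathbf{\Omega}\,(Y - XB)^{\top}\right] - \log\det(\mathbf{\Omega})
\quad \text{subject to} \quad \lVert B \rVert_0 \le s_1,\;\; \lVert \mathbf{\Omega} \rVert_0 \le s_2
$$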
For a first example, define some problem parameters.
```r
n <- 1000        # number of observations
p <- 100         # number of predictors
q <- 10          # number of responses
sparsity <- 0.1  # sparsity of the true regression matrix
s1 <- round(p * q * sparsity * 1.1)  # fitted sparsity is set a little larger than the true sparsity
s2 <- 3 * q - 4  # constrains the precision matrix to roughly the number of nonzero entries of a tri-diagonal matrix
```
The following code generates a synthetic dataset with a true regression matrix of sparsity 0.1.
```r
set.seed(1729)
data <- tsmvr::make_data(
  n = n,
  p = p,
  q = q,
  b1 = sqrt(sparsity),
  b2 = sqrt(sparsity)
)[[1]]
```
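A quick sanity check on the returned object can be useful. Only the \code{X} and \code{Y} components are used in this vignette; \code{str} shows whatever else \code{make_data} returns:

```r
# Inspect the synthetic dataset.
str(data, max.level = 1)
dim(data$X)  # n x p = 1000 x 100
dim(data$Y)  # n x q = 1000 x 10
```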
Important: the data in the design matrix generated in the code chunk above have mean zero (and standard deviation one). tsmvr assumes all data have zero mean, so it is important to center your data before running the algorithm.
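For user-supplied data, base R's \code{scale} handles the centering. A minimal sketch, where \code{X_raw} and \code{Y_raw} are placeholder names standing in for your own matrices:

```r
# Center each column to mean zero (and scale to unit standard deviation).
# X_raw and Y_raw are placeholders for user-supplied data matrices.
X <- scale(X_raw, center = TRUE, scale = TRUE)
Y <- scale(Y_raw, center = TRUE, scale = TRUE)
```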
The function \code{tsmvr_solve} solves the regression problem using hard-thresholded block-wise alternating gradient descent with fixed learning rates.
```r
library(tsmvr)
gd_gd_solution <- tsmvr_solve(
  X = data$X,
  Y = data$Y,
  s1 = s1,
  s2 = s2,
  B_type = "gd",
  Omega_type = "gd",
  eta1 = 0.05,
  eta2 = 0.1,
  skip = 50,
  max_iter = 10
)
```
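To make the hard-thresholding idea concrete, the toy function below keeps the $s$ largest-magnitude entries of a matrix and zeroes the rest. This is an illustrative sketch of the projection step, not the package's internal routine:

```r
# Keep the s largest-magnitude entries of M; zero everything else.
# (Ties at the cutoff may keep slightly more than s entries.)
hard_threshold <- function(M, s) {
  cutoff <- sort(abs(M), decreasing = TRUE)[s]
  M * (abs(M) >= cutoff)
}
hard_threshold(matrix(c(3, -1, 0.5, 2), nrow = 2), s = 2)
```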
Sometimes the fixed learning rate method is cumbersome because the best learning rates must be found by trial and error. In that case, tsmvr can find the learning rates for the user with a generalized line-search procedure. The parameters \code{eta1} and \code{eta2} then become the initial learning rates for the line search. The cost of not having to choose the learning rates is that the algorithm runs more slowly.
```r
library(tsmvr)
ls_ls_solution <- tsmvr_solve(
  X = data$X,
  Y = data$Y,
  s1 = s1,
  s2 = s2,
  B_type = "ls",
  Omega_type = "ls",
  eta1 = 0.05,
  eta2 = 0.1,
  skip = 50
)
```
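For intuition, a backtracking line search works roughly as below: start from an initial step size and shrink it until the objective decreases. This is a generic sketch, not tsmvr's internal procedure:

```r
# Shrink the step size eta until f decreases along the descent direction.
backtrack <- function(f, x, grad, eta0 = 0.1, beta = 0.5, max_tries = 30) {
  eta <- eta0
  for (i in seq_len(max_tries)) {
    if (f(x - eta * grad) < f(x)) return(eta)
    eta <- eta * beta
  }
  eta
}
backtrack(function(x) x^2, x = 1, grad = 2)  # step size for a 1-D quadratic
```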
k-fold cross-validation may be performed using the function \code{tsmvr_cv}.
```r
set.seed(1)
validated <- tsmvr::tsmvr_cv(
  X = data$X,
  Y = data$Y,
  s1 = s1,
  s2 = s2,
  k = 3,
  B_type = "ls",
  Omega_type = "ls",
  eta1 = 0.05,
  eta2 = 0.2
)
```
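For readers new to k-fold cross-validation, the splitting logic typically looks like the sketch below. It is illustrative only and not necessarily how \code{tsmvr_cv} partitions the data internally:

```r
# Randomly assign each of the n observations to one of k folds,
# then hold out one fold at a time.
k <- 3
folds <- sample(rep(seq_len(k), length.out = n))
for (fold in seq_len(k)) {
  train <- folds != fold
  # fit on data$X[train, ] and data$Y[train, ];
  # evaluate the fit on the held-out rows data$X[!train, ]
}
```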
Similarly, replicated k-fold cross-validation may be performed using the function \code{tsmvr_replicate}. Be warned: the code chunk below will take some time to run.
```r
set.seed(3)
replicated <- tsmvr::tsmvr_replicate(
  X = data$X,
  Y = data$Y,
  s1 = s1,
  s2 = s2,
  k = 2,
  rep = 2,
  B_type = "ls",
  Omega_type = "ls",
  eta1 = 0.05,
  eta2 = 0.1
)
```
Finally, replicated k-fold cross-validation may be used to search a grid of \code{s1} and \code{s2} values for the pair that minimizes the cross-validation error. The code chunk below also takes some time to run.
```r
s1_grid <- c(80, 100, 120, 140)
s2_grid <- c(25, 26, 31, 35)
set.seed(5)
grid <- tsmvr_gridsearch(
  X = data$X,
  Y = data$Y,
  s1_grid = s1_grid,
  s2_grid = s2_grid,
  k = 2,
  reps = 3,
  B_type = "ls",
  Omega_type = "ls",
  eta1 = 0.1,
  eta2 = 0.2,
  quiet = FALSE
)
```
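Once replicated cross-validation errors are in hand for every (\code{s1}, \code{s2}) pair, model selection is an argmin over the grid. The sketch below uses a made-up error vector, since the exact accessors on the object returned by \code{tsmvr_gridsearch} may differ:

```r
# Enumerate the grid and pick the pair with the smallest CV error.
pairs <- expand.grid(s1 = s1_grid, s2 = s2_grid)
toy_errors <- runif(nrow(pairs))  # stand-in for the real replicated CV errors
pairs[which.min(toy_errors), ]    # the selected (s1, s2) pair
```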