```r
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(mSIM)
```
get_B_ADMM
is used to fit the coefficient matrix $B$ in the mSIM algorithm. We use a small dataset included in the package to demonstrate the usage of this function.
```r
X_train = scale(X_train)
Y_train = scale(Y_train)
result = get_B_ADMM(Y = Y_train, X = X_train, lambda = 0.03877, rank = 3)
```
Y: The index matrix for mSIM in the model fitting process.
X: The covariate matrix for mSIM in the model fitting process.
B: The initial coefficient matrix B. The default value is NULL, and it will be generated by ridge regression. Users can also specify the B matrix.
lambda: The coefficient of the regularization term in the optimization process. It should be selected using generalized cross-validation.
rank: The rank of the final B matrix.
alpha: The step size in the optimization process.
control1: A list of control parameters for the optimization process. It contains 2 arguments: max.iter and tol. max.iter is the maximum number of iterations and tol is the error tolerance. The optimization stops when the number of iterations exceeds max.iter or the error falls below tol.
control2: A list of controls for the regularization. It contains 3 arguments: ele.sparse, row.sparse and low.rank. If ele.sparse=T, an element-wise penalty is applied to the B matrix. If row.sparse=T, a penalty is applied to each row of the B matrix. If low.rank=T, a rank penalty is applied to the B matrix. ele.sparse and row.sparse cannot both be TRUE (T).
select.method: The selection method for link function fitting. The value can be 'linear' or 'nonlinear'. The default value is 'linear'. For more details, please read the paper "Sparse Single Index Models for Multivariate Responses".
descend.method: The optimization algorithm. The package supports gradient descent and L-BFGS. The value can be either 'gd' (gradient descent) or 'bfgs' (L-BFGS). The default value is 'bfgs'.
plot: An argument that decides whether to plot the fitting result. The value should be T or F. The default value is F.
B.final: The output B matrix after the optimization process.
B.sparse: The sparse representation of matrix B.
B.lowrank: The low rank representation of matrix B.
converge: Whether the optimization process has converged. A value of 1 indicates that the process has converged.
iteration: Number of iterations during the optimization.
pri.err: The primal error in the ADMM method.
dual.err: The dual error in the ADMM method.
noPenIndex: The indices of the rows that are not penalized.
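As a quick sketch of how the fitted object might be inspected, using the components documented above and the `result` object from the example:

```r
# Inspect the components returned by get_B_ADMM (names as documented above).
dim(result$B.final)                  # estimated coefficient matrix after optimization
dim(result$B.sparse)                 # sparse representation of B
result$converge                      # 1 indicates the ADMM iterations converged
result$iteration                     # number of iterations performed
c(result$pri.err, result$dual.err)   # primal and dual errors of the ADMM method
```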
B_BIC
applies the Bayesian information criterion to the coefficient matrix $B$ to obtain the degrees of freedom of $B$.
```r
# set up tuning parameters
range = c(0.25, 0.75)
rank = c(3, 10)
tuning = list()
# all combinations of tuning parameters
len.tuning = 0
for(i in rank){
  for(j in range){
    len.tuning = len.tuning + 1
    tuning[[len.tuning]] = c(j, i)
  }
}
# mSIM model.
temp = list()
for(i in 1:length(tuning)){
  temp[[i]] = get_B_ADMM(Y=Y_train, X=X_train, lambda=tuning[[i]][1], rank=tuning[[i]][2],
                         alpha=1, control1=list(max.iter=5e1, tol=1e-3),
                         select.method='linear', plot=F)
}
# B for each tuning parameter.
B = list()
for(i in 1:length(tuning)){
  B[[i]] = temp[[i]]$B.sparse
}
BIC.msim = NULL
# BIC for each tuning parameter.
for(i in 1:length(tuning)){
  BIC.msim = rbind(BIC.msim, B_BIC(Y=Y_train, X=X_train, B=B[[i]], tuning=tuning[[i]], linear=F))
}
```
Y: The index matrix for mSIM in the model fitting process.
X: The covariate matrix for mSIM in the model fitting process.
B: The coefficient matrix B, which can be supplied in its raw, sparse, or low-rank representation.
tuning: All combinations of the tuning parameters.
linear: The type of link function in the model. The default value is F (FALSE), which fits nonlinear link functions. If the value is T (TRUE), linear link functions are fitted.
logMSE: The value of the log mean squared loss.
BIC.nai: The value of Bayesian information criterion.
df.nai: The degrees of freedom.
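With all candidate fits evaluated, one natural follow-up is to keep the tuning pair with the smallest BIC. A minimal sketch, assuming each row of `BIC.msim` stores `c(logMSE, BIC.nai, df.nai)` for one tuning combination as returned by `B_BIC`:

```r
# Hypothetical selection step: pick the tuning pair with the smallest BIC.
best = which.min(BIC.msim[, 2])   # column 2 assumed to hold BIC.nai
best.tuning = tuning[[best]]      # c(lambda, rank) of the selected model
best.B = B[[best]]                # corresponding sparse coefficient matrix
```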
model_pred
is used to predict the index matrix for a given set of covariates. We again demonstrate the usage with the small dataset included in the package.
```r
X_train = scale(X_train)
Y_train = scale(Y_train)
result = get_B_ADMM(Y = Y_train, X = X_train, lambda = 0.03877, rank = 3)
test_result = model_pred(Y = Y_train, X = X_train, B = result$B.sparse, Y.true = Y_test, X.pred = X_test)
```
Y: The index matrix used for fitting the model.
X: The covariate matrix used for fitting the model.
B: The coefficient matrix.
Y.true: The ground truth for the data to be predicted.
X.pred: The covariate matrix for prediction.
Y.pred: The predictions for X.pred.
MSE: The mean squared loss of the prediction.
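A short sketch of how the returned components might be inspected, assuming `test_result` comes from the call above:

```r
# Minimal sketch: look at the predictions and the prediction error.
dim(test_result$Y.pred)   # predictions, one row per row of X.pred
test_result$MSE           # mean squared loss against Y.true
```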
We use $\mathbf{Y}$ to denote the response matrix and $\mathbf{y}_i$ the $i$th response vector. The response vector $\mathbf{y}_i$ is linked to the covariates $\mathbf{x}_i$ and the coefficient matrix $\mathbf{B}$ through $q$ unknown functions $\mathcal{F}=\{f_1, \ldots, f_q\}$. It is formulated as follows: $\mathbf{y}_{i}=\left\{f_{1}\left(\mathbf{x}_{i}^{\top} \mathbf{B}_{\cdot 1}\right), \cdots, f_{q}\left(\mathbf{x}_{i}^{\top} \mathbf{B}_{\cdot q}\right)\right\}^{\top}+\boldsymbol{\epsilon}_{i}$
$\mathbf{B}_{\cdot j} \in \mathbb{R}^{p}$ denotes the $j$th column of $\mathbf{B}$ and is the coefficient vector corresponding to the $j$th response $y_{ij}$ in $\mathbf{y}_{i}=\left(y_{i 1}, \cdots, y_{i q}\right)^{\top}$ for $i = 1, 2, 3, \ldots, n$.
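To make the model concrete, the following sketch generates a tiny synthetic data set from this formulation; the sample size, dimensions, and the two link functions chosen here are illustrative assumptions, not part of the package:

```r
# Illustrative data-generating sketch for the multivariate single index model:
# y_i = {f_1(x_i' B_.1), ..., f_q(x_i' B_.q)}' + eps_i
set.seed(1)
n = 100; p = 5; q = 2
X = matrix(rnorm(n * p), n, p)
B = matrix(rnorm(p * q), p, q)
B = sweep(B, 2, sqrt(colSums(B^2)), "/")   # each column scaled to unit L2 norm
U = X %*% B                                # the indices x_i' B_.j
# Two arbitrary smooth link functions, one per response column.
Y = cbind(sin(U[, 1]), U[, 2]^2) + matrix(rnorm(n * q, sd = 0.1), n, q)
```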
To fit the mSIM algorithm, we need to solve the following optimization problem: $\underset{\mathcal{F}, \mathbf{B}}{\operatorname{minimize}}\left\{\mathcal{L}_{\mathcal{F}}(\mathbf{B})+\sum_{l=1}^{h} \mathcal{R}_{l, \lambda_{l}}(\mathbf{B})\right\}$ subject to $\left\|\mathbf{B}_{\cdot j}\right\|_{2}^{2}=1$ and $b_{1 j} \geq 0$ for all $j = 1, 2, \ldots, q$.
We use the least-squares loss in the proposed algorithm, which is $\mathcal{L}_{\mathcal{F}}(\mathbf{B})=\frac{1}{2 n} \sum_{j=1}^{q}\left\|\mathbf{Y}_{\cdot j}-f_{j}\left(\mathbf{X} \mathbf{B}_{\cdot j}\right)\right\|_{2}^{2}$.
However, it is computationally challenging to simultaneously estimate the $q$ unknown link functions and the index coefficient matrix $B$, so a two-step iterative approach is used for optimization.
First, we estimate each link function $f_{j} \in \mathcal{F}$ via univariate smoothing given the index coefficient matrix $B$.
Then we estimate $B$ using the proposed optimization method.
We use penalized splines to fit the link functions $\mathcal{F}$. Let $u_{i j}=\mathbf{x}_{i}^{\top} \mathbf{B}_{\cdot j}$ be the index of the $j$th response. The penalized splines approximate $f_j(t)$ by a linear combination of B-splines, i.e. $f_{j}(t)=\sum_{k=1}^{m} \theta_{k j} \phi_{k j}(t)$, where $\left\{\phi_{1 j}(t), \ldots, \phi_{m j}(t)\right\}$ is a set of $m$ cubic B-spline basis functions constructed from equally spaced knots defined over the range of $u_{ij}$, and $\theta_{kj}$ are the associated coefficients to be estimated. By minimizing $\mathcal{Q}_{\gamma_{j}}\left(\Theta_{\cdot j}\right)=\frac{1}{2 n} \sum_{i=1}^{n}\left\{y_{i j}-\sum_{k=1}^{m} \theta_{k j} \phi_{k}\left(u_{i j}\right)\right\}^{2}+\gamma_{j} \Theta_{\cdot j}^{\top} \mathbf{D} \Theta_{\cdot j}$, where $\mathbf{D}$ is a second-order difference penalty matrix and $\gamma_{j}$ is a smoothing parameter, we can estimate the coefficient vector $\Theta_{\cdot j}=\left(\theta_{1 j}, \ldots, \theta_{m j}\right)^{\top}$.
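The sketch below illustrates this penalized-spline step for a single index using base R and the `splines` package; the basis size, smoothing parameter, and data are illustrative assumptions (and `bs()` places knots at quantiles rather than equally spaced positions), so this is not the package's internal implementation:

```r
library(splines)

# Illustrative penalized-spline fit of one link function f_j given an index u.
set.seed(2)
u = sort(runif(200, -2, 2))                         # indices u_ij = x_i' B_.j
y = sin(2 * u) + rnorm(200, sd = 0.2)               # j-th response column
m = 15                                              # number of cubic B-spline basis functions
Phi = bs(u, df = m, degree = 3, intercept = TRUE)   # basis matrix phi_k(u_ij)
D2 = diff(diag(m), differences = 2)                 # second-order difference operator
D = t(D2) %*% D2                                    # penalty matrix D
gamma = 0.1                                         # smoothing parameter gamma_j
# Normal equations of (1/(2n)) * ||y - Phi Theta||^2 + gamma * Theta' D Theta.
Theta = solve(crossprod(Phi) + 2 * length(u) * gamma * D, crossprod(Phi, y))
f.hat = Phi %*% Theta                               # fitted values of the link function
```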
Using dummy variables $\mathcal{C}=\left\{\mathbf{C}_{0}, \mathbf{C}_{1}, \cdots, \mathbf{C}_{h}\right\}$, the optimization problem is equivalent to the problem below:
$\operatorname{minimize}\ \mathcal{L}_{\mathcal{F}}(\mathbf{B})+\sum_{l=1}^{h} \mathcal{R}_{l, \lambda_{l}}\left(\mathbf{C}_{l}\right)+\sum_{j=1}^{q} \iota_{S}\left(\mathbf{C}_{0, j}\right)$ subject to $\mathbf{B}=\mathbf{C}_{0}=\mathbf{C}_{1}=\cdots=\mathbf{C}_{h}$
where $\iota_{S}$ is the indicator function of the identifiability constraint set $S$ and is given by
$\iota_{S}(\mathbf{c})=\left\{\begin{array}{ll}0 & \text { if }\|\mathbf{c}\|_{2}=1 \text { and } c_{1} \geq 0 \\ +\infty & \text { otherwise }\end{array}\right.$
The augmented Lagrangian function of the problem is
$\mathcal{L}_{\rho}(\mathbf{B}, \mathcal{C}, \mathcal{M})=\mathcal{L}_{\mathcal{F}}(\mathbf{B})+\sum_{l=1}^{h} \mathcal{R}_{l, \lambda_{l}}\left(\mathbf{C}_{l}\right)+\sum_{j=1}^{q} \iota_{S}\left(\mathbf{C}_{0, j}\right)+\mathcal{Q}(\mathbf{B}, \mathcal{C}, \mathcal{M})$
where
$\mathcal{Q}(\mathbf{B}, \mathcal{C}, \mathcal{M})=\sum_{l=0}^{h} \mathcal{Q}_{l}\left(\mathbf{B}, \mathbf{C}_{l}, \mathbf{M}_{l}\right)=\sum_{l=0}^{h}\left(\left\langle\mathbf{M}_{l}, \mathbf{C}_{l}-\mathbf{B}\right\rangle+\frac{\rho}{2}\left\|\mathbf{C}_{l}-\mathbf{B}\right\|_{\mathbf{F}}^{2}\right)$
$\mathcal{M}=\left\{\mathbf{M}_{0}, \cdots, \mathbf{M}_{h}\right\}$ are the Lagrange multiplier matrices and $\rho$ is the tuning parameter.
The following steps are applied to minimize the augmented Lagrangian in each iteration:
Update $1: \mathbf{B}^{(k+1)}=\underset{\mathbf{B}}{\arg \min } \mathcal{L}_{\mathcal{F}}(\mathbf{B})+\mathcal{Q}\left(\mathbf{B}, \mathcal{C}^{(k)}, \mathcal{M}^{(k)}\right)$
Update $2: \mathbf{C}_{l}^{(k+1)}=\left\{\begin{array}{ll}\underset{\mathbf{C}_{0}}{\arg \min }\ \mathcal{Q}_{0}\left(\mathbf{B}^{(k+1)}, \mathbf{C}_{0}, \mathbf{M}_{0}^{(k)}\right)+\sum_{j=1}^{q} \iota_{S}\left(\mathbf{C}_{0, j}\right) & \text { if } l=0 \\ \underset{\mathbf{C}_{l}}{\arg \min }\ \mathcal{Q}_{l}\left(\mathbf{B}^{(k+1)}, \mathbf{C}_{l}, \mathbf{M}_{l}^{(k)}\right)+\mathcal{R}_{l, \lambda_{l}}\left(\mathbf{C}_{l}\right) & \text { if } l>0\end{array}\right.$
Update $3: \mathbf{M}_{l}^{(k+1)}=\mathbf{M}_{l}^{(k)}+\rho\left(\mathbf{C}_{l}^{(k+1)}-\mathbf{B}^{(k+1)}\right)$
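To show how the three updates fit together, here is a schematic R sketch of the outer loop; `update_B` and `update_C` are hypothetical placeholders for the subproblem solvers described in the surrounding text, not the package's internal code:

```r
# Schematic sketch of the ADMM loop over Updates 1-3.
admm_sketch = function(B0, C0, M0, rho, update_B, update_C,
                       max.iter = 50, tol = 1e-3) {
  B = B0; C = C0; M = M0
  for (k in seq_len(max.iter)) {
    # Update 1: B-step, minimize L_F(B) + Q(B, C, M).
    B = update_B(C, M, rho)
    # Update 2: C-step, projection for C_0 and proximal maps for C_l, l > 0.
    C = lapply(seq_along(C), function(l) update_C(l, B, M[[l]], rho))
    # Update 3: dual update of the Lagrange multiplier matrices.
    M = lapply(seq_along(M), function(l) M[[l]] + rho * (C[[l]] - B))
    # Stop when the primal residual is small enough.
    pri.err = max(vapply(C, function(Cl) norm(Cl - B, "F"), numeric(1)))
    if (pri.err < tol) break
  }
  list(B.final = B, converge = as.integer(pri.err < tol), iteration = k)
}
```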
We use the ADMM algorithm to update $B$. There is an iterative optimization algorithm nested in the $(k + 1)$th iteration of the ADMM algorithm, and we set $\mathbf{B}_{\cdot j}^{(k+1,1)}=\mathbf{B}_{\cdot j}^{(k)}$ as the initial value of $B$ in the $(k + 1)$th iteration. The value of $B$ is updated using the formula $\mathbf{B}_{\cdot j}^{(k+1, m+1)}=\mathbf{B}_{\cdot j}^{(k+1, m)}-t_{m} \nabla g_{j}^{(k)}\left(\mathbf{B}_{\cdot j}^{(k+1, m)}\right)$, where $t_m$ is the step size of the $m$th inner iteration.
The solution for updating $C_0$ is $\mathbf{C}_{0, j}^{(k+1)}=\mathcal{P}_{S}\left(\mathbf{B}_{\cdot j}^{(k+1)}-\rho^{-1} \mathbf{M}_{0, j}^{(k)}\right)$, where $\mathcal{P}_{S}(\cdot)$ is the Euclidean projection of a vector onto the surface of the unit ball with the first element being nonnegative.
For $C_l$, the subproblem can be written as $\mathbf{C}^{(k+1)}=\underset{\mathbf{C}}{\arg \min }\ \mathcal{R}_{\lambda}(\mathbf{C})+\frac{\rho}{2}\left\|\mathbf{C}-\mathbf{B}^{(k+1)}+\rho^{-1} \mathbf{M}^{(k)}\right\|_{\mathbf{F}}^{2}$. This is known as the proximal map and has an explicit solution for different regularization penalties. Defining $\mathbf{Z}^{(k)}=\mathbf{B}^{(k+1)}-\rho^{-1} \mathbf{M}^{(k)}$, the solution obtained by updating $C_l$ row by row for the $L_{2,1}$ penalty is $\mathbf{C}_{r \cdot}^{(k+1)}=\left[1-\lambda /\left(\rho\left\|\mathbf{Z}_{r \cdot}^{(k)}\right\|_{2}\right)\right]_{+} \mathbf{Z}_{r \cdot}^{(k)}$, where $r=1, \cdots, p$ and $[u]_{+}=\max \{0, u\}$ denotes the positive part of the scalar $u$.
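Both closed-form steps are straightforward to write down directly; the following is a small sketch of each (the helper names are mine and are not exported by mSIM):

```r
# Euclidean projection P_S onto the unit sphere with nonnegative first element.
proj_S = function(v) {
  if (v[1] < 0) v[1] = 0                   # clip the first element at zero
  nv = sqrt(sum(v^2))
  if (nv > 0) v / nv else c(1, rep(0, length(v) - 1))
}

# Row-wise proximal map (group soft-thresholding) for the L_{2,1} penalty,
# applied to Z = B - M / rho.
prox_L21 = function(Z, lambda, rho) {
  t(apply(Z, 1, function(z) {
    shrink = max(0, 1 - lambda / (rho * sqrt(sum(z^2))))  # [1 - lambda/(rho*||z||_2)]_+
    shrink * z
  }))
}
```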
Table 1 illustrates different common matrix penalties and their proximal maps.