# SIR_threshold_opt: SIR optimally thresholded (SIRthresholded: Sliced Inverse Regression with Thresholding)


## SIR optimally thresholded

### Description

Apply a single-index SIR on (X,Y) with H slices, with soft or hard thresholding of the interest matrix \widehat{\Sigma}_n^{-1}\widehat{\Gamma}_n by an optimal parameter \lambda_{opt}. The value \lambda_{opt} is found automatically among a vector of n_lambda candidate values of \lambda, ranging from 0 to the maximum value of \widehat{\Sigma}_n^{-1}\widehat{\Gamma}_n.

For each feature of X, the number of \lambda values for which this feature is selected is stored in a vector of size p. This vector is sorted in increasing order, and strucchange::breakpoints is then used to find a breakpoint in the sorted vector. The coefficients of the variables to the left of the breakpoint tend to be set to 0 by the thresholding operation based on \lambda_{opt}, so these variables are considered useless and should be removed. Finally, \lambda_{opt} is the first \lambda for which the associated \hat{b} has as many zeros as the breakpoint's value.

For example, for X \in \mathbb{R}^{10} and n_lambda = 100, the sorted vector can look like this:

| X10 | X3 | X8 | X5 | X7 | X9 | X4 | X6 | X2 | X1 |
|-----|----|----|----|----|----|----|----|----|-----|
| 2 | 3 | 3 | 4 | 4 | 4 | 6 | 10 | 95 | 100 |

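Here, the breakpoint would be 8: the selection counts jump from 10 (X6) to 95 (X2), so the eight variables from X10 to X6 would be treated as useless, and X2 and X1 would be kept.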
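The breakpoint in this example can be reproduced with a small base-R sketch. This is only an illustrative stand-in, not the package's code: the function relies on strucchange::breakpoints, whereas the sketch below simply searches for the single split that minimizes the within-segment sum of squares. The names counts, rss, split_rss and bp are hypothetical.

```r
# Selection counts from the example above, sorted in increasing order
counts <- c(X10 = 2, X3 = 3, X8 = 3, X5 = 4, X7 = 4,
            X9 = 4, X4 = 6, X6 = 10, X2 = 95, X1 = 100)

# Residual sum of squares of a segment around its own mean
rss <- function(v) sum((v - mean(v))^2)

# Try every split position k: first segment is 1..k, second is (k+1)..p
p <- length(counts)
split_rss <- sapply(1:(p - 1), function(k) {
  rss(counts[1:k]) + rss(counts[(k + 1):p])
})

bp <- which.min(split_rss)
bp                          # 8 for this vector
names(counts)[1:bp]         # variables to the left of the breakpoint
names(counts)[(bp + 1):p]   # variables kept: "X2" "X1"
```

A single-split least-squares search is enough here because the counts separate into two clearly different levels; strucchange::breakpoints handles the general case.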

### Usage

SIR_threshold_opt(
Y,
X,
H = 10,
n_lambda = 100,
thresholding = "hard",
graph = TRUE,
output = TRUE,
choice = ""
)


### Arguments

- Y: A numeric vector representing the dependent variable (a response vector).
- X: A matrix representing the quantitative explanatory variables (bound by column).
- H: The chosen number of slices (default is 10).
- n_lambda: The number of lambdas to test. The n_lambda tested lambdas are uniformly spaced between 0 and the maximum value of the interest matrix (default is 100).
- thresholding: The thresholding method, either "hard" or "soft" (default is "hard").
- graph: A boolean, set to TRUE to plot graphs (default is TRUE).
- output: A boolean, set to TRUE to print information (default is TRUE).
- choice: The graph to plot:
  - "estim_ind": Plot the index estimated by the SIR model versus Y.
  - "opt_lambda": Plot the choice of the optimal lambda.
  - "cos2_selec": Plot the evolution of cos^2 and of the variable selection according to lambda.
  - "regul_path": Plot the regularization path of b.
  - "": Plot every graph (default).
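To make the n_lambda and thresholding arguments more concrete, here is a minimal base-R sketch of a lambda grid and of the two thresholding rules in their usual elementwise form. It is not taken from the package source. The helper names hard_threshold and soft_threshold are hypothetical, the matrix M is a made-up stand-in for the interest matrix, and reading "maximum value" as the largest absolute entry is an assumption.

```r
# Toy stand-in for the interest matrix (values made up for illustration)
M <- matrix(c(0.9, -0.05, 0.2, -0.6), nrow = 2)

# Grid of n_lambda evenly spaced candidates
# (assumption: the upper bound is the largest absolute entry of M)
n_lambda <- 5
lambdas <- seq(0, max(abs(M)), length.out = n_lambda)

# Hypothetical helper: hard thresholding keeps entries whose magnitude
# exceeds lambda and sets the others to zero
hard_threshold <- function(M, lambda) {
  M * (abs(M) > lambda)
}

# Hypothetical helper: soft thresholding shrinks every entry toward zero
# by lambda and zeroes the entries whose magnitude is at most lambda
soft_threshold <- function(M, lambda) {
  sign(M) * pmax(abs(M) - lambda, 0)
}

M_hard <- hard_threshold(M, lambdas[2])
M_soft <- soft_threshold(M, lambdas[2])
M_hard
M_soft
```

Both rules zero the same entries for a given lambda; soft thresholding additionally shrinks the surviving entries, which is why the two methods can lead to different estimated directions.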

### Value

An object of class SIR_threshold_opt, with attributes:

- b: The optimal estimated EDR direction, which is the principal eigenvector of the interest matrix.
- lambdas: A vector that contains the tested lambdas.
- lambda_opt: The optimal lambda.
- mat_b: A matrix of size p x n_lambda whose columns contain the estimate of beta for each lambda.
- n_lambda: The number of lambdas tested.
- vect_nb_zeros: The number of zeros in b for each lambda.
- list_relevant_variables: A list that contains the variables selected by the model.
- fit_bp: An object of class breakpoints from the strucchange package, containing information about the breakpoint from which the optimal lambda is deduced.
- indices_useless_var: A vector that contains p items: each variable is associated with the number of lambdas that select this variable.
- vect_cos_squared: A vector that contains, for each lambda, the squared cosine between vanilla SIR and thresholded SIR.
- Y: The response vector.
- n: Sample size.
- p: The number of variables in X.
- H: The chosen number of slices.
- M1: The interest matrix thresholded with the optimal lambda.
- thresholding: The thresholding method used.
- call: Unevaluated call to the function.
- X_reduced: The X data restricted to the variables selected by the model. It can be used to estimate a new SIR model on the relevant variables to improve the estimation of b.
- index_pred: The index Xb' estimated by SIR.
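The squared cosine stored in vect_cos_squared compares two direction estimates. As a rough illustration, and not the package's own code, it can be computed in base R as follows; the function name cos_squared and the two example vectors are hypothetical.

```r
# Squared cosine of the angle between two direction vectors
cos_squared <- function(u, v) {
  sum(u * v)^2 / (sum(u^2) * sum(v^2))
}

# Made-up directions: a dense estimate and a sparse, thresholded one
b_sir <- c(0.70, 0.69, 0.10, -0.05)
b_thr <- c(0.71, 0.70, 0, 0)

cos_squared(b_sir, b_thr)    # close to 1 when the directions agree
cos_squared(b_sir, -b_thr)   # same value: the measure ignores the sign
```

Because an EDR direction is only identified up to sign and scale, a sign- and scale-invariant measure such as the squared cosine is a natural way to compare two estimates.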

### Examples

# Generate data (requires the mvtnorm package)
set.seed(2)
n <- 200
beta <- c(1, 1, rep(0, 8))
X <- mvtnorm::rmvnorm(n, sigma = diag(1, 10))
eps <- rnorm(n)
Y <- (X %*% beta)^3 + eps

# Apply SIR with soft thresholding
SIR_threshold_opt(Y, X, H = 10, n_lambda = 300, thresholding = "soft")
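For readers who want to see what the interest matrix looks like before any thresholding, the following base-R sketch computes a vanilla SIR estimate on similar simulated data. It is a sketch under stated assumptions, not the package's implementation: X is drawn with rnorm instead of mvtnorm::rmvnorm, slices are formed from the ranks of Y, the covariance uses divisor n, and all object names (slice, Sigma_hat, Gamma_hat, M, b) are illustrative.

```r
set.seed(2)
n <- 200; p <- 10; H <- 10
beta <- c(1, 1, rep(0, 8))
X <- matrix(rnorm(n * p), n, p)   # stand-in for rmvnorm with identity covariance
Y <- as.numeric((X %*% beta)^3 + rnorm(n))

# Split the observations into H slices of equal size by the rank of Y
slice <- cut(rank(Y, ties.method = "first"), breaks = H, labels = FALSE)

x_bar <- colMeans(X)
Sigma_hat <- cov(X) * (n - 1) / n   # empirical covariance (assumption: divisor n)

# Gamma_hat: covariance of the slice means, weighted by slice proportions
Gamma_hat <- matrix(0, p, p)
for (h in seq_len(H)) {
  idx <- slice == h
  d <- colMeans(X[idx, , drop = FALSE]) - x_bar
  Gamma_hat <- Gamma_hat + mean(idx) * tcrossprod(d)
}

# Interest matrix and its principal eigenvector (unthresholded direction)
M <- solve(Sigma_hat, Gamma_hat)
b <- Re(eigen(M)$vectors[, 1])
b <- b / sqrt(sum(b^2))
round(b, 2)
```

In this unthresholded estimate the coefficients of the eight irrelevant variables are typically small but not exactly zero, which is the situation the thresholding in SIR_threshold_opt is designed to address.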


SIRthresholded documentation built on July 10, 2023, 2:03 a.m.