# sparseSIR: sparse SIR (SISIR: Sparse Interval Sliced Inverse Regression)

## Description

`sparseSIR` performs the second step of the method (shrinkage of ridge SIR results).

## Usage

```r
sparseSIR(object, inter_len, adaptive = FALSE, sel_prop = 0.05,
          parallel = FALSE, ncores = NULL)
```

## Arguments

• `object` an object of class `ridgeRes`, as obtained from the function `ridgeSIR`

• `inter_len` (numeric) vector of interval lengths

• `adaptive` (logical) whether the function should return the list of strong zeros and strong non zeros. Defaults to `FALSE`

• `sel_prop` used only when `adaptive = TRUE`. Fraction of the coefficients that will be considered as strong zeros and strong non zeros. Defaults to 0.05

• `parallel` (logical) whether the computation should be performed in parallel. Defaults to `FALSE`

• `ncores` number of cores to use if `parallel = TRUE`. If left to `NULL`, all available cores minus one are used
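To make the role of `inter_len` concrete, here is a minimal base-R sketch of how a vector of interval lengths can be mapped to groups of consecutive variables. It assumes that the intervals are contiguous and that their lengths sum to the number of variables (as with `rep(10, 20)` for 200 time steps). The helper `interval_index` is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of SISIR): label each variable with the
# index of the interval it belongs to, given a vector of interval lengths.
interval_index <- function(inter_len) {
  rep(seq_along(inter_len), times = inter_len)
}

inter_len <- rep(10, 20)            # 20 intervals of 10 consecutive variables
groups <- interval_index(inter_len)

# Total number of variables covered by the intervals
n_vars <- length(groups)

# Variable indices falling in the second interval (here 11 to 20)
second_interval <- which(groups == 2)
```

With unequal lengths, for example `inter_len <- c(50, 100, 50)`, the same helper yields three groups of 50, 100 and 50 variables.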

## Value

S3 object of class `sparseRes`: a list consisting of

• `sEDR` the estimated EDR space (a p x d matrix)

• `alpha` the estimated shrinkage coefficients (a vector of the same length as `inter_len`)

• `quality` a vector with various qualities for the model (see Details)

• `adapt_res` if `adaptive = TRUE`, a list of two vectors:

  ◦ `nonzeros` indexes of variables that are strong non zeros

  ◦ `zeros` indexes of variables that are strong zeros

• `parameters` a list of hyper-parameters for the method:

  ◦ `inter_len` lengths of intervals

  ◦ `sel_prop` if `adaptive = TRUE`, fraction of the coefficients which are considered as strong zeros or strong non zeros

• `rSIR` same as the input `object`

• `fit` a list for the LASSO fit with:

  ◦ `glmnet` result of the `glmnet` function

  ◦ `lambda` value of the best LASSO parameter, chosen by cross-validation

  ◦ `x` explanatory variable values as passed to fit the model

## Details

Different quality criteria are used to select the best model among a list of models with different interval definitions. The quality criteria are: log-likelihood (`loglik`), cross-validation error as provided by the function `glmnet`, and two versions each of the AIC (`AIC` and `AIC2`) and of the BIC (`BIC` and `BIC2`), in which the number of parameters is either the number of non null intervals or the number of non null parameters with respect to the original variables.
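As an illustration of the two ways of counting parameters, the following base-R sketch computes AIC and BIC values from made-up inputs. It assumes the textbook definitions AIC = -2 logL + 2k and BIC = -2 logL + log(n) k; the exact formulas used by the package, and which count corresponds to `AIC` versus `AIC2` (or `BIC` versus `BIC2`), are not stated here, so the result names below are deliberately neutral. All values are hypothetical.

```r
# Illustrative values only (not output from the package)
inter_len <- c(3, 2, 4)        # three intervals covering 9 variables
alpha     <- c(0.8, 0, 0.3)    # hypothetical shrinkage coefficient per interval
loglik    <- -120.5            # hypothetical log-likelihood
n         <- 100               # hypothetical sample size

# Two ways of counting parameters
k_intervals <- sum(alpha != 0)                  # non null intervals
k_variables <- sum(rep(alpha, inter_len) != 0)  # non null original variables

# Textbook definitions (an assumption, see the text above)
aic <- function(k) -2 * loglik + 2 * k
bic <- function(k) -2 * loglik + log(n) * k

criteria <- c(AIC_intervals = aic(k_intervals),
              AIC_variables = aic(k_variables),
              BIC_intervals = bic(k_intervals),
              BIC_variables = bic(k_variables))
```

Because a non null interval contributes all of its variables to the second count, the variable-based criteria penalise wide intervals more heavily than the interval-based ones.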

## Author(s)

Victor Picheny

Remi Servien

Nathalie Villa-Vialaneix

## References

Picheny, V., Servien, R. and Villa-Vialaneix, N. (2016) Interpretable sparse SIR for digitized functional data. Preprint.

## See Also

`ridgeSIR`, `project.sparseRes`, `SISIR`

## Examples

```r
library(SISIR)

set.seed(1140)
tsteps <- seq(0, 1, length = 200)
nsim <- 100
# simulate Brownian motion paths observed at the time steps
simulate_bm <- function() return(c(0, cumsum(rnorm(length(tsteps)-1, sd=1))))
x <- t(replicate(nsim, simulate_bm()))
# two directions, each non null on a single interval
# (element-wise `|` is needed here: `||` does not accept vectors)
beta <- cbind(sin(tsteps*3*pi/2), sin(tsteps*5*pi/2))
beta[((tsteps < 0.2) | (tsteps > 0.5)), 1] <- 0
beta[((tsteps < 0.6) | (tsteps > 0.75)), 2] <- 0
y <- log(abs(x %*% beta[ ,1]) + 1) + sqrt(abs(x %*% beta[ ,2]))
y <- y + rnorm(nsim, sd = 0.1)
# step 1: ridge SIR; step 2: sparse SIR on 20 intervals of length 10
res_ridge <- ridgeSIR(x, y, H = 10, d = 2, mu2 = 10^8)
res_sparse <- sparseSIR(res_ridge, rep(10, 20))
```