View source: R/SIR_threshold_bootstrap.R
SIR_threshold_bootstrap  R Documentation 
Apply a single-index, optimally soft/hard thresholded SIR with H slices on 'n_replications' bootstrapped replications of (X,Y). The optimal number of selected variables is the number that occurs most often among the replications performed. From this, we can get the corresponding \hat{b} and \lambda_{opt} that produce the same number of selected variables in the result of 'SIR_threshold_opt'.
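The "most often" rule described above can be illustrated with a minimal base R sketch. The vector of model sizes below is made up for illustration and is not output from the package; how the package itself breaks ties is not stated here, so the tie-breaking in this sketch (first maximum wins) is an assumption.

```r
# Hypothetical number of selected variables in each of 10 replications
vec_nb_var_selec <- c(5, 5, 6, 4, 5, 7, 5, 6, 5, 4)

# Count how often each model size occurs
size_counts <- table(vec_nb_var_selec)

# Keep the most frequent size; which.max() returns the first maximum on ties
nb_var_selec_opt <- as.integer(names(size_counts)[which.max(size_counts)])
nb_var_selec_opt
```

In this toy vector the size 5 appears five times, more than any other size, so it is the one retained.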
SIR_threshold_bootstrap(
Y,
X,
H = 10,
thresholding = "hard",
n_replications = 50,
graph = TRUE,
output = TRUE,
n_lambda = 100,
k = 2,
choice = ""
)
Y 
A numeric vector representing the dependent variable (a response vector). 
X 
A matrix representing the quantitative explanatory variables (bind by column). 
H 
The chosen number of slices (default is 10). 
thresholding 
The thresholding method to choose between hard and soft (default is hard). 
n_replications 
The number of bootstrapped replications of (X,Y) done to estimate the model (default is 50). 
graph 
A boolean, set to TRUE to plot graphs (default is TRUE). 
output 
A boolean, set to TRUE to print information (default is TRUE). 
n_lambda 
The number of lambda values to test. The n_lambda tested values are uniformly spaced between 0 and the maximum value of the interest matrix (default is 100). 
k 
Multiplication factor of the bootstrapped sample size (default is 2; a value of 1 keeps the same size as the original data). 
choice 
the graph to plot:

An object of class SIR_threshold_bootstrap, with attributes:
b 
This is the optimal estimated EDR direction, which is the principal eigenvector of the interest matrix. 
lambda_opt 
The optimal lambda. 
vec_nb_var_selec 
Vector that contains the number of selected variables for each replication. 
occurrences_var 
Vector that contains, at index i, the number of times the i-th variable has been selected across the replications. 
call 
Unevaluated call to the function. 
nb_var_selec_opt 
Optimal number of selected variables which is the number of selected variables that came back most often among the replications performed. 
list_relevant_variables 
A list that contains the variables selected by the model. 
n 
Sample size. 
p 
The number of variables in X. 
H 
The chosen number of slices. 
n_replications 
The number of bootstrapped replications of (X,Y) done to estimate the model. 
thresholding 
The thresholding method used. 
X_reduced 
The X data restricted to the variables selected by the model. It can be used to estimate a new SIR model on the relevant variables to improve the estimation of b. 
mat_b 
Contains the estimate of b at each bootstrapped replication. 
lambdas_opt_boot 
Contains the optimal lambda found by SIR_threshold_opt at each replication. 
index_pred 
The index Xb' estimated by SIR. 
Y 
The response vector. 
M1 
The interest matrix thresholded with the optimal lambda. 
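To make the notions of hard and soft thresholding and of the lambda grid concrete, here is a minimal base R sketch using the standard textbook definitions. The helper names `hard_threshold` and `soft_threshold`, the toy matrix `M`, and the use of the largest absolute entry as the upper end of the grid are assumptions made for illustration; they are not taken from the package source.

```r
# Hard thresholding: set to zero every entry whose absolute value is at most lambda
hard_threshold <- function(M, lambda) {
  M * (abs(M) > lambda)
}

# Soft thresholding: shrink every entry toward zero by lambda
soft_threshold <- function(M, lambda) {
  sign(M) * pmax(abs(M) - lambda, 0)
}

# Toy symmetric matrix standing in for an interest matrix (hypothetical values)
M <- matrix(c( 2.0, 0.3, -0.1,
               0.3, 1.5,  0.6,
              -0.1, 0.6,  0.2), nrow = 3, byrow = TRUE)

# Evenly spaced grid of n_lambda candidate values between 0 and max(|M|)
n_lambda <- 5
lambdas <- seq(0, max(abs(M)), length.out = n_lambda)

M_hard <- hard_threshold(M, lambda = 0.5)
M_soft <- soft_threshold(M, lambda = 0.5)
```

Hard thresholding keeps the surviving entries unchanged, while soft thresholding also shrinks them, so for the same lambda the two methods select the same entries but yield different values.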
# Generate Data
set.seed(8)
n <- 170
beta <- c(1,1,1,1,1,rep(0,15))
X <- mvtnorm::rmvnorm(n,sigma=diag(1,20))
eps <- rnorm(n,sd=8)
Y <- (X%*%beta)^3+eps
# Apply SIR with hard thresholding
SIR_threshold_bootstrap(Y,X,H=10,n_lambda=300,thresholding="hard", n_replications=30,k=2)
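As a rough illustration of what one bootstrapped replication of (X,Y) with multiplication factor k looks like, the following base R sketch draws k * n rows with replacement. It is an assumption about the general resampling idea, not a copy of the package's internal code; the names `idx`, `X_boot` and `Y_boot` are hypothetical, and `rnorm` replaces `mvtnorm::rmvnorm` only to keep the sketch free of dependencies.

```r
set.seed(8)
n <- 170
p <- 20

# Independent standard normal columns (identity covariance), base R only
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
beta <- c(rep(1, 5), rep(0, 15))
Y <- as.vector((X %*% beta)^3 + rnorm(n, sd = 8))

# One bootstrap replication of size k * n, drawn with replacement
k <- 2
idx <- sample(n, size = k * n, replace = TRUE)

# Use the same indices for X and Y so each row stays paired with its response
X_boot <- X[idx, , drop = FALSE]
Y_boot <- Y[idx]

dim(X_boot)
length(Y_boot)
```

With k = 2 the resampled data has twice as many rows as the original sample, while k = 1 would keep the original size.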