This function estimates the mean squared error (MSE) of parametric,
semi-parametric or nonparametric regression models (possibly including
covariate selection) using a repeated learning/test samples approach.
The models are estimated with different methods (chosen by the user) for
comparison purposes. The following methods (with and without variable
selection) are available: multiple linear regression (linreg),
sliced inverse regression associated with kernel regression (sir),
random forests regression (rf), principal components regression (pcr),
partial least squares regression (plsr), ridge regression (ridge).
The procedure for covariate selection is the same for all the
estimation methods and is based on variable importance (VI) obtained via
repeated random perturbations of the covariates.
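The repeated learning/test samples approach described above can be sketched in base R as follows. This is only an illustration for the linear regression case, not the package's implementation: the helper name repeated_mse and the simulated data are hypothetical, while N and prop_train are named after the arguments documented below.

```r
# Hypothetical helper (not part of the package): repeated learning/test
# estimation of the MSE for ordinary multiple linear regression.
repeated_mse <- function(X, Y, N = 20, prop_train = 0.8) {
  n <- nrow(X)
  n_train <- floor(prop_train * n)
  dat <- data.frame(Y = Y, X)
  mse <- numeric(N)
  for (i in seq_len(N)) {
    train <- sample(n, n_train)                  # random learning sample
    fit <- lm(Y ~ ., data = dat[train, , drop = FALSE])
    pred <- predict(fit, newdata = dat[-train, , drop = FALSE])
    mse[i] <- mean((dat$Y[-train] - pred)^2)     # MSE on the test sample
  }
  mse
}

# Simulated data, for illustration only
set.seed(1)
n <- 100
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("x1", "x2", "x3")))
Y <- 2 * X[, 1] - X[, 2] + rnorm(n, sd = 0.5)

mse_values <- repeated_mse(X, Y, N = 20, prop_train = 0.8)
mean(mse_values)   # average test MSE over the N replications
```

Each replication yields one test MSE, so the spread of mse_values gives an idea of how stable the estimate is across random splits.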
X |
a numerical matrix containing the |
Y |
a numerical response vector. |
method |
a vector with the names of the chosen regression methods
( |
N |
the number of replications (the number of random learning/test samples) used to estimate the MSE values. |
prop_train |
a value between 0 and 1 with the proportion of observations in the training samples. |
nperm |
the number of random permutations used to compute the importance of the covariates (VI). |
cutoff |
if TRUE the covariates are selected automatically and the number
of selected variables is unknown. If |
nbsel |
the number of selected covariates. Active only if
|
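To make the role of nperm more concrete, here is a minimal base R sketch of a permutation-based variable importance. It assumes VI is measured as the average increase in test MSE when one covariate is randomly permuted; the exact definition used by the package may differ, and the helper name perm_importance and the simulated data are hypothetical.

```r
# Hypothetical sketch of a permutation-based variable importance (VI).
# Assumption: VI = mean increase in test MSE after permuting one covariate.
perm_importance <- function(fit, X_test, Y_test, nperm = 50) {
  base_mse <- mean((Y_test - predict(fit, newdata = X_test))^2)
  vi <- sapply(names(X_test), function(v) {
    increase <- replicate(nperm, {
      Xp <- X_test
      Xp[[v]] <- sample(Xp[[v]])   # break the link between covariate v and Y
      mean((Y_test - predict(fit, newdata = Xp))^2) - base_mse
    })
    mean(increase)
  })
  sort(vi, decreasing = TRUE)
}

# Simulated data, for illustration only: x1 is the only informative covariate
set.seed(2)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
Y <- 3 * X$x1 + rnorm(n, sd = 0.5)

train <- sample(n, 150)
fit <- lm(Y ~ ., data = cbind(Y = Y[train], X[train, ]))
vi <- perm_importance(fit, X[-train, ], Y[-train], nperm = 50)
vi   # covariates ordered from most to least important
```

Covariates whose permutation barely changes the test MSE are natural candidates for removal, which is the idea behind selecting covariates from their VI.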
The only method with no parameter to tune is "linreg".
The parameters of the methods sir, pcr, plsr and ridge are tuned on the
training samples. The bandwidth of the kernel regression smoother is
tuned by leave-one-out cross-validation. The number of components for pcr
and plsr is tuned as follows: for each possible number of components, the
root mean square error (RMSE) is calculated via 5-fold cross-validation,
and the number of components is selected by detecting a change point
position (in mean and variance). The parameter mtry for random forests
regression is not tuned and is fixed to p/3. The number of trees is not
tuned and is fixed to ntree=300.
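The first step of the tuning of the number of components (one cross-validated RMSE per candidate number of components) can be sketched in base R for principal components regression. This is a rough illustration under stated assumptions, not the package's code: the helper name cv_rmse_pcr and the simulated data are hypothetical, the covariates are centred and scaled before the PCA, and the change point detection applied to the resulting RMSE curve is not reproduced here.

```r
# Hypothetical sketch: K-fold cross-validated RMSE of principal components
# regression, one value per candidate number of components.
cv_rmse_pcr <- function(X, Y, max_comp = ncol(X), K = 5) {
  n <- nrow(X)
  fold <- sample(rep(seq_len(K), length.out = n))   # random fold labels
  sq_err <- matrix(NA_real_, n, max_comp)
  for (k in seq_len(K)) {
    test <- fold == k
    # PCA fitted on the training folds only (assumption: centred and scaled)
    pca <- prcomp(X[!test, , drop = FALSE], center = TRUE, scale. = TRUE)
    S_train <- pca$x
    S_test <- predict(pca, newdata = X[test, , drop = FALSE])
    for (m in seq_len(max_comp)) {
      # Least squares on the first m component scores, with an intercept
      fit <- lm.fit(cbind(1, S_train[, 1:m, drop = FALSE]), Y[!test])
      pred <- drop(cbind(1, S_test[, 1:m, drop = FALSE]) %*% fit$coefficients)
      sq_err[test, m] <- (Y[test] - pred)^2
    }
  }
  sqrt(colMeans(sq_err))   # RMSE for 1, 2, ..., max_comp components
}

# Simulated data, for illustration only
set.seed(3)
n <- 150
p <- 6
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
Y <- drop(X %*% c(2, -1, 0.5, 0, 0, 0)) + rnorm(n, sd = 0.5)

rmse <- cv_rmse_pcr(X, Y)
rmse
```

The selection rule described above would then look for the position at which this RMSE curve changes in mean and variance, rather than simply taking its minimum.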
An object with S3 class "choicemod" and the following components:
mse |
a matrix of dimension |
mse_all |
a matrix of dimension |
sizemod |
a matrix of dimension |
pvarsel |
a matrix of dimension |
boxplot.choicemod, barplot.choicemod, varimportance