Implements the blockwise coordinate descent algorithm or the CoCoLasso algorithm.
generalcoco(
Z,
y,
n,
p,
p1 = NULL,
p2 = NULL,
p3 = NULL,
center.Z = TRUE,
scale.Z = TRUE,
center.y = TRUE,
scale.y = TRUE,
lambda.factor = ifelse(dim(Z)[1] < dim(Z)[2], 0.01, 0.001),
step = 100,
K = 4,
mu = 10,
tau = NULL,
etol = 1e-04,
optTol = 1e-05,
earlyStopping_max = 10,
penalty = c("lasso", "SCAD"),
mode = "ADMM"
)
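A minimal usage sketch for the additive error setting follows. The simulated data, the true coefficients and the value of tau are illustrative assumptions, not taken from the package. The call is guarded so that the snippet still runs when generalcoco is not loaded in the session.

```r
# Simulated data: the true design X is observed with additive noise as Z.
set.seed(1)
n <- 100
p <- 20
tau <- 0.5                                  # assumed sd of the additive error
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
beta <- c(3, 2, 0, 0, 1.5, rep(0, p - 5))   # sparse true coefficients (illustrative)
y <- as.vector(X %*% beta + rnorm(n))
Z <- X + matrix(rnorm(n * p, sd = tau), nrow = n, ncol = p)

# Call generalcoco only if the function is available in the session.
if (exists("generalcoco", mode = "function")) {
  fit <- generalcoco(
    Z = Z, y = y, n = n, p = p,
    tau = tau,                 # additive error setting, so tau is supplied
    step = 100, K = 4, mu = 10,
    penalty = "lasso", mode = "ADMM"
  )
  print(fit$lambda.opt)
  print(head(fit$beta.opt))
}
```

In the missing data setting, Z would instead contain NA entries and tau would be left as NULL.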
Z
Corrupted design matrix (with additive error or missing data)
y
Response vector
n
Number of samples in the design matrix
p
Number of features in the design matrix
p1
Number of uncorrupted predictors (only used for blockwise coordinate descent)
p2
Number of corrupted predictors containing additive error (only used for blockwise coordinate descent)
p3
Number of corrupted predictors containing missing values (only used for blockwise coordinate descent)
center.Z
If TRUE, centers Z while ignoring NA values, then replaces the NAs with 0 (in the missing data setting)
scale.Z
If TRUE, divides the columns of Z by their standard deviation
center.y
If TRUE, centers y
scale.y
If TRUE, divides y by its standard deviation
lambda.factor
Factor setting the range of the lambda interval to explore
step
Number of lambda values tested within that interval
K
Number of folds for the cross-validation
mu
Penalty parameter for the ADMM algorithm
tau
Standard deviation of the additive error matrix in the additive error setting (NULL in the missing data setting)
etol
Tolerance parameter for the ADMM algorithm. This parameter affects computing speed, since it controls the number of iterations of the ADMM algorithm, which can be quite slow when the number of features increases.
optTol
Tolerance parameter for the convergence of the error in the pathwise coordinate descent. This parameter affects computing speed, since it controls when the algorithm stops after error convergence. It should be adapted to the magnitude of the error values. For centered and scaled matrix, a value of
earlyStopping_max
Number of iterations allowed once the cross-validation error starts increasing. This parameter affects computing speed, since iterations corresponding to increasing error are usually quite slow.
penalty
Type of penalty used: either the lasso penalty ("lasso") or the SCAD penalty ("SCAD")
mode
Either "ADMM" or "HM"
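The roles of lambda.factor and step can be illustrated with a short sketch. It assumes the common convention (used by glmnet, for example) in which the grid is log-spaced from a largest value lambda.max down to lambda.factor * lambda.max; whether generalcoco builds its grid exactly this way is an assumption, and lambda.max, lambda.min and lambda.grid are hypothetical names.

```r
set.seed(2)
n <- 50
p <- 80
Z <- scale(matrix(rnorm(n * p), nrow = n, ncol = p))
y <- as.vector(scale(rnorm(n)))

# Default from the usage section: 0.01 when n < p, 0.001 otherwise.
lambda.factor <- ifelse(dim(Z)[1] < dim(Z)[2], 0.01, 0.001)
step <- 100

# Assumed convention: largest lambda from the absolute feature/response
# cross-products, then a log-spaced grid of `step` decreasing values.
lambda.max <- max(abs(crossprod(Z, y))) / n
lambda.min <- lambda.factor * lambda.max
lambda.grid <- exp(seq(log(lambda.max), log(lambda.min), length.out = step))
```

Under this convention, a smaller lambda.factor extends the grid toward weaker penalization, and a larger step makes the grid finer at the cost of more model fits.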
It is highly recommended to use center.Z = TRUE for the algorithm to work in the case of missing data.
It is recommended to use center.Z = TRUE, scale.Z = TRUE, center.y = TRUE and scale.y = TRUE for both convergence
and interpretability reasons. The use of center.Z = TRUE in the additive error setting can be subject to discussion,
as it may introduce bias in the algorithm.
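The preprocessing described for center.Z and scale.Z in the missing data setting can be sketched in base R as follows. This is an illustration of the idea, not the package's internal code; the names mean.Z and sd.Z mirror the returned values, while Z.std is a hypothetical name.

```r
set.seed(3)
Z <- matrix(rnorm(40), nrow = 10, ncol = 4)
Z[sample(length(Z), 6)] <- NA          # introduce missing entries

# Column means and standard deviations computed on observed values only.
mean.Z <- colMeans(Z, na.rm = TRUE)
sd.Z <- apply(Z, 2, sd, na.rm = TRUE)

# Center and scale each column, then replace the remaining NAs with 0.
Z.std <- sweep(Z, 2, mean.Z, "-")
Z.std <- sweep(Z.std, 2, sd.Z, "/")
Z.std[is.na(Z.std)] <- 0
```

After centering, a value of 0 corresponds to the column mean, which is why replacing NAs with 0 only makes sense when center.Z = TRUE.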
For computing speed reasons, if the model is not converging or is running slowly, consider changing mu, decreasing etol or optTol, or decreasing earlyStopping_max.
A list containing:
lambda.opt
Optimal value of lambda, corresponding to the minimum error
lambda.sd
Value of lambda whose error is one standard deviation above the minimum error
beta.opt
Value of beta corresponding to lambda.opt
beta.sd
Value of beta corresponding to lambda.sd
data_error
Dataframe containing errors and their standard deviation for each iteration of the algorithm
data_beta
Dataframe containing the values of beta for each iteration of the algorithm
earlyStopping
Integer containing the value of iteration when early stopping happens
vnames
Names of the features
mean.Z
Mean of Z, computed without the NA values
sd.Z
Standard deviation of Z, computed without the NA values
mean.y
Mean of y
sd.y
Standard deviation of y
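The relationship between lambda.opt and lambda.sd can be illustrated on a toy error table. The sketch assumes the usual one-standard-error rule (the largest lambda whose error stays within one standard deviation of the minimum); the column names lambda, error and error.sd are hypothetical and may differ from those of the data_error element actually returned.

```r
# Toy cross-validation results, one row per tested lambda (illustrative values).
data_error <- data.frame(
  lambda   = c(1.00, 0.50, 0.25, 0.12, 0.06),
  error    = c(2.00, 1.20, 1.00, 0.95, 1.10),
  error.sd = c(0.20, 0.15, 0.10, 0.10, 0.12)
)

# lambda.opt: the lambda with the smallest error.
i.opt <- which.min(data_error$error)
lambda.opt <- data_error$lambda[i.opt]

# lambda.sd (assumed rule): the largest lambda whose error is within one
# standard deviation of the minimum error.
threshold <- data_error$error[i.opt] + data_error$error.sd[i.opt]
lambda.sd <- max(data_error$lambda[data_error$error <= threshold])
```

Choosing beta.sd over beta.opt trades a slightly higher error for a more strongly penalized, typically sparser, coefficient vector.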
https://arxiv.org/pdf/1510.07123.pdf, blockwise_coordinate_descent, pathwise_coordinate_descent