coco: Coco


View source: R/coco.R

Description

Implements the blockwise coordinate descent algorithm or the CoCoLasso algorithm.

Usage

coco(
  Z,
  y,
  n,
  p,
  p1 = NULL,
  p2 = NULL,
  center.Z = TRUE,
  scale.Z = TRUE,
  center.y = TRUE,
  scale.y = TRUE,
  lambda.factor = ifelse(dim(Z)[1] < dim(Z)[2], 0.01, 0.001),
  step = 100,
  K = 4,
  mu = 10,
  tau = NULL,
  etol = 1e-04,
  optTol = 1e-05,
  earlyStopping_max = 10,
  noise = c("additive", "missing"),
  block = TRUE,
  penalty = c("lasso", "SCAD"),
  mode = "ADMM"
)
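A minimal sketch of a call in the additive error setting, using only arguments from the signature above. The simulated data (X, beta, the noise level tau = 0.3) are illustrative assumptions, not taken from the package, and the structure of the returned object is not documented here, so the sketch only inspects it with str(). The call is guarded so that the simulation part runs even when BDcocolasso is not installed.

```r
set.seed(1)

n <- 100    # number of samples
p <- 20     # number of features
tau <- 0.3  # std. dev. of the additive measurement error (illustrative value)

# Clean design matrix and a sparse coefficient vector (both illustrative).
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
beta <- c(3, -2, 1.5, rep(0, p - 3))
y <- as.vector(X %*% beta + rnorm(n))

# Corrupted design matrix: every entry is observed with additive Gaussian error.
Z <- X + matrix(rnorm(n * p, sd = tau), nrow = n, ncol = p)

# Call coco() only if the package is available.
if (requireNamespace("BDcocolasso", quietly = TRUE)) {
  fit <- BDcocolasso::coco(
    Z = Z, y = y, n = n, p = p,
    tau = tau,
    noise = "additive",
    block = FALSE,     # simple CoCoLasso, so p1 and p2 stay NULL
    penalty = "lasso"
  )
  str(fit, max.level = 1)
}
```

With block = TRUE, p1 and p2 would also have to be supplied; how the uncorrupted and corrupted columns must be ordered in Z is not stated on this page, so that case is left out of the sketch.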

Arguments

Z

Corrupted design matrix (with additive error or missing data)

y

Response vector

n

Number of samples of the design matrix

p

Number of features of the matrix

p1

Number of uncorrupted predictors (if dealing with block descent)

p2

Number of corrupted predictors (if dealing with block descent)

center.Z

If TRUE, centers the Z matrix, ignoring NA values when computing the column means, and then replaces the NAs with 0 (in the missing data setting).

scale.Z

If TRUE, divides Z columns by their standard deviation

center.y

If TRUE, centers y

scale.y

If TRUE, divides y by its standard deviation

lambda.factor

Factor determining the range of the lambda interval to explore

step

Number of lambda values to test within the interval

K

Number of folds for the cross-validation

mu

Penalty parameter for the ADMM algorithm

tau

Standard deviation for the additive error matrix in the additive error setting (NULL in the missing data setting)

etol

Tolerance parameter for the ADMM algorithm. This parameter has an impact on computing speed, since it controls the number of iterations of the ADMM algorithm which can be quite slow when the number of features increases.

optTol

Tolerance parameter for the convergence of the error in the pathwise coordinate descent. This parameter has an impact on computing speed, since it controls when the algorithm stops after error convergence. It should be adapted to the magnitude of the error values. For a centered and scaled matrix, an optTol value of 1e-5 usually works fine.

earlyStopping_max

Number of iterations allowed when the cross-validation error starts increasing. This parameter has an impact on computing speed, since iterations corresponding to increasing error are usually quite slow.

noise

Type of noise (additive or missing)

block

If TRUE, implements block descent CoCoLasso. If FALSE, implements simple CoCoLasso.

penalty

Type of penalty used: either the lasso penalty or the SCAD penalty

mode

ADMM or HM
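The default for lambda.factor depends on the shape of Z. The sketch below evaluates the same expression as the signature for a wide and a tall matrix. The grid construction that follows is an assumption: it uses the log-spaced convention common in lasso path software, with a hypothetical lambda_max, and is not confirmed to be what coco does internally.

```r
# Only the shapes matter here; the entries are placeholders.
Z_wide <- matrix(0, nrow = 50, ncol = 200)  # fewer samples than features
Z_tall <- matrix(0, nrow = 200, ncol = 50)  # more samples than features

# Same expression as the lambda.factor default in the signature.
default_lambda_factor <- function(Z) {
  ifelse(dim(Z)[1] < dim(Z)[2], 0.01, 0.001)
}

default_lambda_factor(Z_wide)  # 0.01
default_lambda_factor(Z_tall)  # 0.001

# ASSUMPTION: a log-spaced grid running from lambda_max down to
# lambda.factor * lambda_max, with `step` values. lambda_max is hypothetical.
lambda_max <- 1
step <- 100
lambda_grid <- exp(seq(
  log(lambda_max),
  log(lambda_max * default_lambda_factor(Z_wide)),
  length.out = step
))
range(lambda_grid)
```

Under this assumed convention, a smaller lambda.factor stretches the grid toward weaker penalties, and a larger step makes it finer at the cost of more fits.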

Details

It is highly recommended to use center.Z = TRUE for the algorithm to work in the missing data setting. It is recommended to use center.Z = TRUE, scale.Z = TRUE, center.y = TRUE and scale.y = TRUE, for both convergence and interpretability reasons. The use of center.Z = TRUE in the additive error setting is open to discussion, as it may introduce bias in the algorithm. If the model does not converge or runs slowly, consider changing mu, decreasing etol or optTol, or decreasing earlyStopping_max.
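The centering step described for center.Z in the missing data setting can be sketched in base R as follows. This is an illustration of the documented behaviour on a toy matrix, not the package's internal code. Because the columns are centered first, replacing the NAs with 0 amounts to imputing each missing entry with its column mean.

```r
# Toy design matrix with missing entries.
Z <- matrix(c(1, NA, 3,
              2,  4, NA,
              3,  6,  9),
            nrow = 3, byrow = TRUE)

# Column means computed from the observed entries only.
col_means <- colMeans(Z, na.rm = TRUE)

# Subtract each column's mean; NA entries stay NA at this point.
Z_centered <- sweep(Z, 2, col_means, "-")

# Replace the missing entries with 0, i.e. the mean of a centered column.
Z_centered[is.na(Z_centered)] <- 0
Z_centered
```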

Value

list containing

See Also

https://arxiv.org/pdf/1510.07123.pdf, blockwise_coordinate_descent, pathwise_coordinate_descent


celiaescribe/BDcocolasso documentation built on Feb. 11, 2020, 11:41 p.m.