cv.compCL: Cross-validation for compCL
In Zhe-Research/compReg: Compositional Data Regression


Performs `nfolds`-fold cross-validation for `compCL` and returns a value of `lam`. The function is adapted from the cv function in the glmnet package.

cv.compCL(y, Z, Zc = NULL, intercept = FALSE, lam = NULL, nfolds = 10, foldid, trim = 0.1, ...)

`y`	a response vector of length n.
`Z`	an n-by-p matrix obtained by taking the log transformation of the compositional data.
`Zc`	a design matrix of other covariates considered. Default is `NULL`.
`intercept`	whether to include an intercept in the model. Default is `FALSE`, as in the usage above.
`lam`	a user-supplied lambda sequence. Typically this option is left unspecified, so that the program computes its own `lam` sequence based on `nlam` and `lambda.factor`; supplying `lam` overrides this. If `lam` is supplied as a scalar, a `lam` sequence starting from that value is created. It is better to supply a decreasing sequence of lambda values; if the sequence is not decreasing, the program sorts the user-defined `lam` sequence into decreasing order automatically.
`nfolds`	number of folds - default is 10. Smallest value allowable is nfolds=3.
`foldid`	an optional vector of values between 1 and `nfolds` identifying which fold each observation is in. If supplied, `nfolds` can be missing.
`trim`	a scalar specifying the proportion of prediction errors to be trimmed off - default is 0.1, as in the usage above.
`...`	other arguments that can be passed to compCL.
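
A `foldid` vector can be built by hand when the fold assignment needs to be reproducible across runs. The following is a minimal base-R sketch, assuming balanced random folds; it is not taken from the compReg source, and the sizes `n` and `nfolds` are illustrative values chosen to match the example further down.

```r
# Sketch: build a balanced, randomly shuffled fold assignment in base R.
# n and nfolds are illustrative values, not package defaults being queried.
set.seed(1)
n <- 50
nfolds <- 10

# Repeat the labels 1..nfolds until n labels exist, then shuffle them
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Each fold should hold n / nfolds = 5 observations
table(foldid)
```

The resulting vector could then be passed as `foldid = foldid`, in which case `nfolds` does not need to be given.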

An object of class `cv.compCL` is returned, with the following components.

`compCL.fit`	a fitted `compCL` object for the full data
`lam`	the values of `lam` used in the fits
`Ftrim`	a list of cross-validation results without trimming:
	`cvm`	the mean cross-validated error without trimming - a vector of `length(lam)`.
	`cvsd`	estimate of the standard error of `cvm` without trimming - a vector of `length(lam)`.
	`cvupper`	upper curve = `cvm+cvsd`.
	`cvlo`	lower curve = `cvm-cvsd`.
	`lam.min`	the optimal value of `lam`, which gives the minimum cross-validation error `cvm`.
	`lam.1se`	the largest value of `lam` such that the error is within 1 standard error of the minimum `cvm`.
`Ttrim`	a list of cross-validation results with `trim*100%`, if provided, of the tails trimmed off the cross-validation error:
	`cvm`	the mean cross-validated error with `trim*100%` trimmed - a vector of `length(lam)`.
	`cvsd`	estimate of the standard error of `cvm` with `trim*100%` trimmed - a vector of `length(lam)`.
	`cvupper`	upper curve = `cvm+cvsd`.
	`cvlo`	lower curve = `cvm-cvsd`.
	`lam.min`	the optimal value of `lam`, which gives the minimum cross-validation error `cvm` with `trim*100%` trimmed.
	`lam.1se`	the largest value of `lam` such that the error is within 1 standard error of the minimum `cvm` with `trim*100%` trimmed.
`foldid`	the values of `foldid` used in the fits.
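
The relationship between `cvm`, `cvsd`, `lam.min` and `lam.1se` described above can be illustrated with a few lines of base R. This is a sketch on made-up numbers, assuming the usual one-standard-error rule as worded in the component descriptions; it does not call the package and the values are not real output.

```r
# Sketch: derive lam.min and lam.1se from toy cross-validation curves.
# All numbers below are invented for illustration.
lam  <- c(1.0, 0.5, 0.25, 0.1, 0.05)     # decreasing lambda sequence
cvm  <- c(2.00, 1.20, 0.90, 0.95, 1.10)  # mean cross-validated error
cvsd <- c(0.30, 0.25, 0.35, 0.20, 0.20)  # standard error of cvm

cvupper <- cvm + cvsd                    # upper curve
cvlo    <- cvm - cvsd                    # lower curve

# lam.min: the lambda with the smallest mean error
id.min  <- which.min(cvm)
lam.min <- lam[id.min]

# lam.1se: the largest lambda whose error is within one
# standard error of the minimum
threshold <- cvm[id.min] + cvsd[id.min]
lam.1se   <- max(lam[cvm <= threshold])

lam.min
lam.1se
```

Because `lam.1se` is never smaller than `lam.min`, it corresponds to a fit that is at least as heavily penalized.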

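library(compReg)

# Problem size: p compositional components, n observations
p = 30
n = 50
# Sparse coefficient vector: six non-zero entries among the first eight, zeros elsewhere
beta = c(1, -0.8, 0.6, 0, 0, -1.5, -0.5, 1.2)
beta = c( beta, rep(0, times = p - length(beta)) )
# Simulate compositional data
Comp_data = comp_simulation(n = n, p = p,
                            rho = 0.2, sigma = 0.5,
                            gamma  = 0.5, add.on = 1:5,
                            beta = beta, intercept = FALSE)
# Inspect the matrix of other covariates
Comp_data$Zc
# 10-fold cross-validation with 5% trimming; extra arguments are passed on to compCL
cvm <- cv.compCL(y = Comp_data$y,
                 Z = Comp_data$X.comp, Zc = Comp_data$Zc,
                 intercept = Comp_data$intercept,
                 lam = NULL, nfolds = 10, trim = 0.05, lambda.factor = 0.0001,
                 dfmax = p, mu_ratio = 1, outer_eps = 1e-10, inner_eps = 1e-8, inner_maxiter = 1e4)

# Plot the cross-validation curve and extract the coefficients at lam.min
plot(cvm)
coef(cvm, s = "lam.min")
# Fitted compCL object for the full data
cvm$compCL.fit
#apply(cvm$compCL.fit$beta[1:p, ], 2, function(x) which(abs(x) > 0))
# Indices of the non-zero compositional coefficients at lam.min and lam.1se
which(abs(coef(cvm, s = "lam.min")$beta[1:p]) > 0)
which(abs(coef(cvm, s = "lam.1se")$beta[1:p]) > 0)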