cv.missoNet: Cross-validation for missoNet
In missoNet: Joint Sparse Regression & Network Learning with Missing Data

Description

Perform k-fold cross-validation to select the regularization pair (lambda.beta, lambda.theta) for missoNet. For each fold the model is trained on k-1 partitions and evaluated on the held-out partition over a grid of lambda pairs; the pair with minimum mean CV error is returned, with optional 1-SE models for more regularized solutions.
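The fold partitioning described above can be sketched in base R. This is a minimal illustration, not the package's internal code: `make_folds` is a hypothetical helper, and the balanced-then-shuffled assignment is an assumption about how folds might be formed.

```r
# Hypothetical helper (not part of missoNet): assign n rows to kfold folds.
make_folds <- function(n, kfold = 5, shuffle = TRUE, seed = NULL) {
  stopifnot(kfold >= 2, kfold <= n)
  # Balanced labels 1..kfold, recycled to length n
  fold <- rep(seq_len(kfold), length.out = n)
  if (shuffle) {
    if (!is.null(seed)) set.seed(seed)
    fold <- sample(fold)  # random permutation of the labels
  }
  fold
}

fold <- make_folds(n = 23, kfold = 5, seed = 486)

# For fold k, rows with fold == k are held out and the rest are used for training.
heldout.sizes <- vapply(1:5, function(k) sum(fold == k), integer(1))
train.sizes   <- vapply(1:5, function(k) sum(fold != k), integer(1))
```

Each lambda pair on the grid would then be fitted on the training rows and scored on the held-out rows, and the errors averaged over the folds.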

Usage

cv.missoNet(
  X,
  Y,
  kfold = 5,
  rho = NULL,
  lambda.beta = NULL,
  lambda.theta = NULL,
  lambda.beta.min.ratio = NULL,
  lambda.theta.min.ratio = NULL,
  n.lambda.beta = NULL,
  n.lambda.theta = NULL,
  beta.pen.factor = NULL,
  theta.pen.factor = NULL,
  penalize.diagonal = NULL,
  beta.max.iter = 10000,
  beta.tol = 1e-05,
  theta.max.iter = 10000,
  theta.tol = 1e-05,
  eta = 0.8,
  eps = 1e-08,
  standardize = TRUE,
  standardize.response = TRUE,
  compute.1se = TRUE,
  relax.net = FALSE,
  adaptive.search = FALSE,
  shuffle = TRUE,
  seed = NULL,
  parallel = FALSE,
  cl = NULL,
  verbose = 1
)

Arguments

`X`	Numeric matrix (`n \times p`). Predictors (no missing values).
`Y`	Numeric matrix (`n \times q`). Responses. Missing values should be coded as `NA`/`NaN`.
`kfold`	Integer `\ge 2`. Number of folds (default `5`).
`rho`	Optional numeric vector of length `q`. Working missingness probabilities (per response). If `NULL` (default), estimated from `Y`.
`lambda.beta`, `lambda.theta`	Optional numeric vectors. Candidate regularization paths for `\mathbf{B}` and `\Theta`. If `NULL`, sequences are generated automatically from the data. Avoid supplying a single value because warm starts along a path are used.
`lambda.beta.min.ratio`, `lambda.theta.min.ratio`	Optional numerics in `(0,1]`. Ratio of the smallest to the largest value when generating lambda sequences (ignored if the corresponding `lambda.*` is supplied).
`n.lambda.beta`, `n.lambda.theta`	Optional integers. Lengths of the automatically generated lambda paths (ignored if the corresponding `lambda.*` is supplied).
`beta.pen.factor`	Optional `p \times q` non-negative matrix of element-wise penalty multipliers for `\mathbf{B}`. `Inf` = maximum penalty; `0` = no penalty for the corresponding coefficient. Default: all 1s (equal penalty).
`theta.pen.factor`	Optional `q \times q` non-negative matrix of element-wise penalty multipliers for `\Theta`. Off-diagonal entries control edge penalties; diagonal treatment is governed by `penalize.diagonal`. `Inf` = maximum penalty; `0` = no penalty for that element. Default: all 1s (equal penalty).
`penalize.diagonal`	Logical or `NULL`. Whether to penalize diagonal entries of `\Theta`. If `NULL` (default) the choice is made automatically.
`beta.max.iter`, `theta.max.iter`	Integers. Max iterations for the `\mathbf{B}` update (FISTA) and `\Theta` update (graphical lasso). Defaults: `10000`.
`beta.tol`, `theta.tol`	Numerics `> 0`. Convergence tolerances for the `\mathbf{B}` and `\Theta` updates. Defaults: `1e-5`.
`eta`	Numeric in `(0,1)`. Backtracking line-search parameter for the `\mathbf{B}` update (default `0.8`).
`eps`	Numeric in `(0,1)`. Eigenvalue floor used to stabilize positive definiteness operations (default `1e-8`).
`standardize`	Logical. Standardize columns of `X` internally? Default `TRUE`.
`standardize.response`	Logical. Standardize columns of `Y` internally? Default `TRUE`.
`compute.1se`	Logical. Also compute 1-SE solutions? Default `TRUE`.
`relax.net`	(Experimental) Logical. If `TRUE`, refit active edges of `\Theta` without `\ell_1` penalty (de-biased network). Default `FALSE`.
`adaptive.search`	(Experimental) Logical. Use adaptive two-stage lambda search? Default `FALSE`.
`shuffle`	Logical. Randomly shuffle fold assignments? Default `TRUE`.
`seed`	Optional integer seed (used when `shuffle = TRUE`).
`parallel`	Logical. Evaluate folds in parallel using a provided cluster? Default `FALSE`.
`cl`	Optional cluster from `parallel::makeCluster()` (required if `parallel = TRUE`).
`verbose`	Integer in `\{0, 1, 2\}`. `0` = silent, `1` = progress (default), `2` = detailed tracing (not supported in parallel mode).
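To illustrate how a `min.ratio` and an `n.lambda` value can define a regularization path, here is a small base R sketch. It assumes log-spaced values running from a largest lambda down to `min.ratio` times that value, which is a common convention; how missoNet chooses the largest value from the data is not stated here, so `lambda.max` is supplied by hand and `lambda_path` is a hypothetical helper.

```r
# Hypothetical helper (not part of missoNet): log-spaced lambda path.
# Assumption: the path is decreasing and geometric, from lambda.max
# down to lambda.max * min.ratio.
lambda_path <- function(lambda.max, min.ratio = 0.01, n.lambda = 20) {
  stopifnot(lambda.max > 0, min.ratio > 0, min.ratio <= 1, n.lambda >= 2)
  exp(seq(log(lambda.max), log(lambda.max * min.ratio), length.out = n.lambda))
}

lam <- lambda_path(lambda.max = 1, min.ratio = 0.01, n.lambda = 20)
```

With these inputs the path coincides with `10^seq(0, -2, length.out = 20)`. A decreasing path suits warm starts, because each fit can be initialized from the solution at the previous, slightly larger lambda.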

Details

Internally, predictors X and responses Y can be standardized for optimization; all reported estimates are rescaled back to the original data scale. Missingness in Y is handled via unbiased estimating equations that use column-wise observation probabilities, i.e. one minus the missingness probabilities estimated from Y (or supplied via rho). This is appropriate when the missingness of each response is independent of its unobserved value (e.g., MCAR).
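The idea behind unbiased moments under MCAR can be sketched in base R. The construction below (zero-fill the missing entries, then rescale cross-products by the observation probabilities) is one standard way to obtain such moments; whether missoNet uses exactly this form is an assumption, and all variable names are illustrative.

```r
set.seed(1)
n <- 5000; q <- 3

# Simulate centered responses with a known covariance Sigma
Sigma <- matrix(0.5, q, q); diag(Sigma) <- 1
Y <- matrix(rnorm(n * q), n, q) %*% chol(Sigma)

# Delete entries completely at random, with a per-column probability rho
rho <- c(0.1, 0.2, 0.3)
Z <- Y
for (j in seq_len(q)) Z[runif(n) < rho[j], j] <- NA

# Working missingness probabilities estimated from the data
rho.hat <- colMeans(is.na(Z))
obs <- 1 - rho.hat                  # column-wise observation probabilities

# Zero-fill, then correct the second moments:
# off-diagonal (j, k) is scaled by obs[j] * obs[k], diagonal (j, j) by obs[j]
Z0 <- Z; Z0[is.na(Z0)] <- 0
W <- tcrossprod(obs); diag(W) <- obs
S.hat <- crossprod(Z0) / n / W      # approximately unbiased for Sigma
```

Without the rescaling, the zero-filled cross-products would be shrunk toward zero by the fraction of observed pairs. Dividing by `W` removes that bias when missingness does not depend on the unobserved values.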

If adaptive.search = TRUE, a fast two-stage pre-optimization narrows the lambda grid before computing fold errors on a focused neighborhood; this can be substantially faster on large grids but may occasionally miss the global optimum.

When compute.1se = TRUE, two additional solutions are reported: the largest lambda.beta and the largest lambda.theta whose CV error is within one standard error of the minimum (holding the other lambda fixed at its optimal value). At the end, three special lambda pairs are identified:

lambda.min: Parameters giving minimum CV error
lambda.1se.beta: Largest `\lambda_B` within 1 SE of minimum (with `\lambda_\Theta` fixed at optimum)
lambda.1se.theta: Largest `\lambda_\Theta` within 1 SE of minimum (with `\lambda_B` fixed at optimum)

The 1SE rules provide more regularized models that may generalize better.
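The selection rule above can be made concrete on a toy grid. The numbers below are made up for illustration, and the layout (rows indexed by lambda.beta, columns by lambda.theta) is an assumption, not the package's internal storage.

```r
lambda.beta  <- c(1, 0.5, 0.25, 0.125)
lambda.theta <- c(1, 0.5, 0.25)

# Toy mean CV errors: rows follow lambda.beta, columns follow lambda.theta
cv.mean <- matrix(c(2.00, 1.60, 1.80,
                    1.50, 1.05, 1.30,
                    1.07, 1.00, 1.20,
                    1.60, 1.10, 1.40), nrow = 4, byrow = TRUE)
cv.se <- matrix(0.08, nrow = 4, ncol = 3)   # toy standard errors

# Location of the minimum mean CV error
best  <- which(cv.mean == min(cv.mean), arr.ind = TRUE)[1, ]
i.min <- best[["row"]]; j.min <- best[["col"]]
lambda.min <- c(beta = lambda.beta[i.min], theta = lambda.theta[j.min])

# One-standard-error threshold at the minimum
threshold <- cv.mean[i.min, j.min] + cv.se[i.min, j.min]

# Largest lambda.beta within the threshold, lambda.theta fixed at its optimum
lambda.1se.beta <- max(lambda.beta[cv.mean[, j.min] <= threshold])

# Largest lambda.theta within the threshold, lambda.beta fixed at its optimum
lambda.1se.theta <- max(lambda.theta[cv.mean[i.min, ] <= threshold])
```

Here the minimum sits at (0.25, 0.5). The 1-SE rule moves lambda.beta up to 0.5 along the optimal lambda.theta column, and lambda.theta up to 1 along the optimal lambda.beta row.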

Value

A list of class "missoNet" with components:

est.min: List of estimates at the CV minimum: Beta (p \times q), Theta (q \times q), intercept mu (length q), lambda.beta, lambda.theta, lambda.beta.idx, lambda.theta.idx, and (if requested) relax.net.
est.1se.beta: List of estimates at the 1-SE lambda.beta (if compute.1se = TRUE); NULL otherwise.
est.1se.theta: List of estimates at the 1-SE lambda.theta (if compute.1se = TRUE); NULL otherwise.
rho: Length-q vector of working missingness probabilities.
kfold: Number of folds used.
fold.index: Integer vector of length n giving fold assignments (names are "fold-k").
lambda.beta.seq, lambda.theta.seq: Unique lambda values explored along the grid for \mathbf{B} and \Theta.
penalize.diagonal: Logical indicating whether the diagonal of \Theta was penalized.
beta.pen.factor, theta.pen.factor: Penalty factor matrices actually used.
param_set: List with CV diagnostics: n, p, q, standardize, standardize.response, mean errors cv.errors.mean, bounds cv.errors.upper/lower, and the evaluated grids cv.grid.beta, cv.grid.theta (length equals number of fitted models).
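As a rough guide to working with the CV diagnostics, the sketch below builds a mock `param_set` using the component names listed above and reshapes the flat error vector into a lambda.beta by lambda.theta table. The mock values are invented, and the assumption that `cv.errors.mean` is aligned element by element with `cv.grid.beta` and `cv.grid.theta` follows from the description but is not verified against the package.

```r
# Mock object mimicking the documented component names (values are made up)
grid <- expand.grid(lambda.beta = c(1, 0.1, 0.01), lambda.theta = c(1, 0.1))
param_set <- list(
  cv.grid.beta   = grid$lambda.beta,
  cv.grid.theta  = grid$lambda.theta,
  cv.errors.mean = c(3.0, 2.1, 2.4, 2.8, 1.9, 2.2)
)

# Lambda pair with the smallest mean CV error
idx <- which.min(param_set$cv.errors.mean)
best.pair <- c(lambda.beta  = param_set$cv.grid.beta[idx],
               lambda.theta = param_set$cv.grid.theta[idx])

# Reshape into a table: rows = lambda.beta, columns = lambda.theta
err.mat <- tapply(param_set$cv.errors.mean,
                  list(beta  = param_set$cv.grid.beta,
                       theta = param_set$cv.grid.theta),
                  mean)
```

With a real fit, `cvfit$param_set` would take the place of the mock list. The table form is convenient for inspecting the error surface alongside the heatmap plot.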

Author(s)

Yixiao Zeng <yixiao.zeng@mail.mcgill.ca>, Celia M. T. Greenwood

References

Zeng, Y., et al. (2025). Multivariate regression with missing response data for modelling regional DNA methylation QTLs. arXiv:2507.05990.

Examples

sim <- generateData(n = 120, p = 12, q = 6, rho = 0.1)
X <- sim$X; Y <- sim$Z


# Basic 5-fold cross-validation
cvfit <- cv.missoNet(X = X, Y = Y, kfold = 5, verbose = 0)

# Extract optimal estimates
Beta.min <- cvfit$est.min$Beta
Theta.min <- cvfit$est.min$Theta

# Extract 1SE estimates (if computed)
if (!is.null(cvfit$est.1se.beta)) {
  Beta.1se <- cvfit$est.1se.beta$Beta
}
if (!is.null(cvfit$est.1se.theta)) {
  Theta.1se <- cvfit$est.1se.theta$Theta
}

# Make predictions
newX <- matrix(rnorm(10 * 12), 10, 12)
pred.min <- predict(cvfit, newx = newX, s = "lambda.min")
pred.1se <- predict(cvfit, newx = newX, s = "lambda.1se.beta")

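# Parallel cross-validation
library(parallel)
# Use at most 2 workers, and at least 1 even if detectCores() returns 1 or NA
n.workers <- max(1, min(detectCores() - 1, 2), na.rm = TRUE)
cl <- makeCluster(n.workers)
cvfit2 <- cv.missoNet(X = X, Y = Y, kfold = 5,
                      parallel = TRUE, cl = cl)
stopCluster(cl)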

# Adaptive search for efficiency
cvfit3 <- cv.missoNet(X = X, Y = Y, kfold = 5,
                      adaptive.search = TRUE)

# Reproducible CV with specific lambdas
cvfit4 <- cv.missoNet(X = X, Y = Y, kfold = 5,
                      lambda.beta = 10^seq(0, -2, length.out = 20),
                      lambda.theta = 10^seq(0, -2, length.out = 20),
                      seed = 486)

# Plot CV results
plot(cvfit, type = "heatmap")
plot(cvfit, type = "scatter")

missoNet documentation built on Sept. 9, 2025, 5:55 p.m.