cv.missoNet: Cross-validation for missoNet


cv.missoNet    R Documentation

Cross-validation for missoNet

Description

Perform k-fold cross-validation to select the regularization pair (lambda.beta, lambda.theta) for missoNet. For each fold the model is trained on k-1 partitions and evaluated on the held-out partition over a grid of lambda pairs; the pair with minimum mean CV error is returned, with optional 1-SE models for more regularized solutions.
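The partitioning described above can be sketched in base R. This is only an illustration of k-fold assignment under the shuffle and seed semantics documented below; make_folds is a hypothetical helper, not a missoNet function, and the package's internal fold construction may differ.

```r
# Hypothetical helper (not part of missoNet): assign n rows to kfold folds.
make_folds <- function(n, kfold = 5, shuffle = TRUE, seed = NULL) {
  stopifnot(kfold >= 2, kfold <= n)
  fold <- rep(seq_len(kfold), length.out = n)  # near-equal fold sizes
  if (shuffle) {
    if (!is.null(seed)) set.seed(seed)         # reproducible shuffling
    fold <- sample(fold)                       # random permutation of labels
  }
  fold
}

fold <- make_folds(n = 120, kfold = 5, seed = 486)

# First split: train on the k-1 remaining folds, evaluate on the held-out fold
test.idx  <- which(fold == 1)
train.idx <- which(fold != 1)
```

Each row is held out exactly once across the k splits, so every observation contributes to the CV error of every lambda pair.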

Usage

cv.missoNet(
  X,
  Y,
  kfold = 5,
  rho = NULL,
  lambda.beta = NULL,
  lambda.theta = NULL,
  lambda.beta.min.ratio = NULL,
  lambda.theta.min.ratio = NULL,
  n.lambda.beta = NULL,
  n.lambda.theta = NULL,
  beta.pen.factor = NULL,
  theta.pen.factor = NULL,
  penalize.diagonal = NULL,
  beta.max.iter = 10000,
  beta.tol = 1e-05,
  theta.max.iter = 10000,
  theta.tol = 1e-05,
  eta = 0.8,
  eps = 1e-08,
  standardize = TRUE,
  standardize.response = TRUE,
  compute.1se = TRUE,
  relax.net = FALSE,
  adaptive.search = FALSE,
  shuffle = TRUE,
  seed = NULL,
  parallel = FALSE,
  cl = NULL,
  verbose = 1
)

Arguments

X

Numeric matrix (n \times p). Predictors (no missing values).

Y

Numeric matrix (n \times q). Responses. Missing values should be coded as NA/NaN.

kfold

Integer \ge 2. Number of folds (default 5).

rho

Optional numeric vector of length q. Working missingness probabilities (per response). If NULL (default), estimated from Y.

lambda.beta, lambda.theta

Optional numeric vectors. Candidate regularization paths for \mathbf{B} and \Theta. If NULL, sequences are generated automatically from the data. Avoid supplying a single value because warm starts along a path are used.

lambda.beta.min.ratio, lambda.theta.min.ratio

Optional numerics in (0,1]. Ratio of the smallest to the largest value when generating lambda sequences (ignored if the corresponding lambda.* is supplied).

n.lambda.beta, n.lambda.theta

Optional integers. Lengths of the automatically generated lambda paths (ignored if the corresponding lambda.* is supplied).

beta.pen.factor

Optional p \times q non-negative matrix of element-wise penalty multipliers for \mathbf{B}. Inf = maximum penalty; 0 = no penalty for the corresponding coefficient. Default: all 1s (equal penalty).

theta.pen.factor

Optional q \times q non-negative matrix of element-wise penalty multipliers for \Theta. Off-diagonal entries control edge penalties; diagonal treatment is governed by penalize.diagonal. Inf = maximum penalty; 0 = no penalty for that element. Default: all 1s (equal penalty).

penalize.diagonal

Logical or NULL. Whether to penalize diagonal entries of \Theta. If NULL (default) the choice is made automatically.

beta.max.iter, theta.max.iter

Integers. Max iterations for the \mathbf{B} update (FISTA) and \Theta update (graphical lasso). Defaults: 10000.

beta.tol, theta.tol

Numerics > 0. Convergence tolerances for the \mathbf{B} and \Theta updates. Defaults: 1e-5.

eta

Numeric in (0,1). Backtracking line-search parameter for the \mathbf{B} update (default 0.8).

eps

Numeric in (0,1). Eigenvalue floor used to stabilize positive definiteness operations (default 1e-8).

standardize

Logical. Standardize columns of X internally? Default TRUE.

standardize.response

Logical. Standardize columns of Y internally? Default TRUE.

compute.1se

Logical. Also compute 1-SE solutions? Default TRUE.

relax.net

(Experimental) Logical. If TRUE, refit active edges of \Theta without \ell_1 penalty (de-biased network). Default FALSE.

adaptive.search

(Experimental) Logical. Use adaptive two-stage lambda search? Default FALSE.

shuffle

Logical. Randomly shuffle fold assignments? Default TRUE.

seed

Optional integer seed (used when shuffle = TRUE).

parallel

Logical. Evaluate folds in parallel using a provided cluster? Default FALSE.

cl

Optional cluster from parallel::makeCluster() (required if parallel = TRUE).

verbose

Integer in {0, 1, 2}. 0 = silent, 1 = progress (default), 2 = detailed tracing (not supported in parallel mode).
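To make the roles of lambda.*.min.ratio and n.lambda.* concrete, here is a minimal sketch of a log-spaced path. It assumes the common convention of spacing values evenly on the log scale between a largest value and that value times the ratio; lambda_path and the value lambda.max = 1 are hypothetical, and how missoNet derives the largest lambda from the data is not covered on this page.

```r
# Hypothetical helper (not part of missoNet): decreasing, log-spaced path
# from lambda.max down to lambda.max * min.ratio.
lambda_path <- function(lambda.max, min.ratio = 0.01, n.lambda = 20) {
  stopifnot(lambda.max > 0, min.ratio > 0, min.ratio <= 1, n.lambda >= 1)
  exp(seq(log(lambda.max), log(lambda.max * min.ratio),
          length.out = n.lambda))
}

# Placeholder lambda.max; in practice this would come from the data
lam <- lambda_path(lambda.max = 1, min.ratio = 0.01, n.lambda = 20)
range(lam)
```

With these placeholder settings the path coincides with 10^seq(0, -2, length.out = 20), the kind of user-supplied sequence that can be passed directly as lambda.beta or lambda.theta.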

Details

Internally, predictors X and responses Y can be standardized for optimization; all reported estimates are re-scaled back to the original data scale. Missingness in Y is handled via unbiased estimating equations using column-wise observation probabilities estimated from Y (or supplied via rho). This is appropriate when the missingness of each response is independent of its unobserved value (e.g., MCAR).
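As a rough illustration of the column-wise probabilities mentioned above, the sketch below computes the per-response fraction of missing entries on simulated data. It assumes the default working value is the empirical missing fraction of each column of Y; the exact estimator used internally is not stated here, and the simulated matrix is made up for the example.

```r
set.seed(1)

# Simulated responses: 200 observations, 3 response columns
Y <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)

# Remove 10% of all entries completely at random (MCAR)
Y[sample(length(Y), 60)] <- NA

# Assumed working estimate: per-column fraction of missing entries
rho.hat  <- colMeans(is.na(Y))
obs.prob <- 1 - rho.hat   # column-wise observation probabilities
rho.hat
```

A vector of this form (length q, values in [0, 1)) is what the rho argument expects when it is supplied by the user instead of being estimated.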

If adaptive.search = TRUE, a fast two-stage pre-optimization narrows the lambda grid before computing fold errors on a focused neighborhood; this can be substantially faster on large grids but may occasionally miss the global optimum.

When compute.1se = TRUE, two additional solutions are reported: the one at the largest lambda.beta and the one at the largest lambda.theta whose CV error is within one standard error of the minimum (holding the other lambda fixed at its optimal value). In total, up to three lambda pairs are identified:

  • lambda.min: Parameters giving minimum CV error

  • lambda.1se.beta: Largest \lambda_B within 1 SE of minimum (with \lambda_\Theta fixed at optimum)

  • lambda.1se.theta: Largest \lambda_\Theta within 1 SE of minimum (with \lambda_B fixed at optimum)

The 1-SE rules select more heavily regularized models whose CV error stays within one standard error of the minimum; such models may generalize better.
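The selection of the three pairs can be sketched on a small mock grid. The error values and standard errors below are made up, and the sketch assumes the threshold is the minimum mean error plus the standard error at that minimum; it illustrates the rule as described above rather than the package's internal code.

```r
# Mock CV summaries (rows: lambda.beta, columns: lambda.theta); values are made up
lambda.beta  <- c(1, 0.5, 0.1, 0.01)
lambda.theta <- c(1, 0.3, 0.05)
cv.mean <- matrix(c(2.00, 1.60, 1.70,
                    1.50, 1.08, 1.30,
                    1.10, 1.00, 1.20,
                    1.45, 1.10, 1.25), nrow = 4, byrow = TRUE)
cv.se <- matrix(0.12, nrow = 4, ncol = 3)

# lambda.min: grid cell with the smallest mean CV error
min.idx <- which(cv.mean == min(cv.mean), arr.ind = TRUE)[1, ]
i <- min.idx[["row"]]
j <- min.idx[["col"]]
lambda.min <- c(beta = lambda.beta[i], theta = lambda.theta[j])

# Assumed threshold: minimum error plus one standard error at the minimum
threshold <- cv.mean[i, j] + cv.se[i, j]

# Largest lambda.beta within the threshold, lambda.theta fixed at its optimum
lambda.1se.beta <- max(lambda.beta[cv.mean[, j] <= threshold])

# Largest lambda.theta within the threshold, lambda.beta fixed at its optimum
lambda.1se.theta <- max(lambda.theta[cv.mean[i, ] <= threshold])
```

In this mock grid the minimum sits at (lambda.beta, lambda.theta) = (0.1, 0.3), while the 1-SE rules move to lambda.beta = 0.5 and lambda.theta = 1 respectively, each along one axis only.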

Value

A list of class "missoNet" with components:

est.min

List of estimates at the CV minimum: Beta (p \times q), Theta (q \times q), intercept mu (length q), lambda.beta, lambda.theta, lambda.beta.idx, lambda.theta.idx, and (if requested) relax.net.

est.1se.beta

List of estimates at the 1-SE lambda.beta (if compute.1se = TRUE); NULL otherwise.

est.1se.theta

List of estimates at the 1-SE lambda.theta (if compute.1se = TRUE); NULL otherwise.

rho

Length-q vector of working missingness probabilities.

kfold

Number of folds used.

fold.index

Integer vector of length n giving fold assignments (names are "fold-k").

lambda.beta.seq, lambda.theta.seq

Unique lambda values explored along the grid for \mathbf{B} and \Theta.

penalize.diagonal

Logical indicating whether the diagonal of \Theta was penalized.

beta.pen.factor, theta.pen.factor

Penalty factor matrices actually used.

param_set

List with CV diagnostics: n, p, q, standardize, standardize.response, mean errors cv.errors.mean, bounds cv.errors.upper/lower, and the evaluated grids cv.grid.beta, cv.grid.theta (length equals number of fitted models).
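The diagnostics in param_set can be inspected by hand. The sketch below uses a mock list with the documented field names and made-up values, and it assumes cv.errors.mean is a vector aligned element by element with cv.grid.beta and cv.grid.theta; that alignment is an assumption based on the description above.

```r
# Mock object with the documented field names; all values are made up
param_set <- list(
  cv.grid.beta   = rep(c(1, 0.1, 0.01), times = 2),
  cv.grid.theta  = rep(c(0.5, 0.05), each = 3),
  cv.errors.mean = c(1.8, 1.2, 1.3, 1.6, 1.0, 1.1)
)

# Lambda pair with the smallest mean CV error
best <- which.min(param_set$cv.errors.mean)
best.pair <- c(lambda.beta  = param_set$cv.grid.beta[best],
               lambda.theta = param_set$cv.grid.theta[best])

# Rearrange the flat vectors into a lambda.beta x lambda.theta error matrix;
# tapply groups by value, so no particular ordering of the grid is assumed
err.mat <- with(param_set,
                tapply(cv.errors.mean,
                       list(beta = cv.grid.beta, theta = cv.grid.theta),
                       mean))
err.mat
```

With a real fit, the same code would start from cvfit$param_set instead of the mock list.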

Author(s)

Yixiao Zeng <yixiao.zeng@mail.mcgill.ca>, Celia M. T. Greenwood

References

Zeng, Y., et al. (2025). Multivariate regression with missing response data for modelling regional DNA methylation QTLs. arXiv:2507.05990.

See Also

missoNet for model fitting; generic methods such as plot() and predict() for objects of class "missoNet".

Examples

sim <- generateData(n = 120, p = 12, q = 6, rho = 0.1)
X <- sim$X; Y <- sim$Z


# Basic 5-fold cross-validation
cvfit <- cv.missoNet(X = X, Y = Y, kfold = 5, verbose = 0)

# Extract optimal estimates
Beta.min <- cvfit$est.min$Beta
Theta.min <- cvfit$est.min$Theta

# Extract 1-SE estimates (NULL unless compute.1se = TRUE)
if (!is.null(cvfit$est.1se.beta)) {
  Beta.1se <- cvfit$est.1se.beta$Beta
}
if (!is.null(cvfit$est.1se.theta)) {
  Theta.1se <- cvfit$est.1se.theta$Theta
}

# Make predictions
newX <- matrix(rnorm(10 * 12), 10, 12)
pred.min <- predict(cvfit, newx = newX, s = "lambda.min")
pred.1se <- predict(cvfit, newx = newX, s = "lambda.1se.beta")

# Parallel cross-validation
library(parallel)
# Use at most 2 workers, but never fewer than 1 (e.g. on single-core machines)
cl <- makeCluster(max(1, min(detectCores() - 1, 2)))
cvfit2 <- cv.missoNet(X = X, Y = Y, kfold = 5,
                      parallel = TRUE, cl = cl)
stopCluster(cl)

# Adaptive search for efficiency
cvfit3 <- cv.missoNet(X = X, Y = Y, kfold = 5,
                      adaptive.search = TRUE)

# Reproducible CV with specific lambdas
cvfit4 <- cv.missoNet(X = X, Y = Y, kfold = 5,
                      lambda.beta = 10^seq(0, -2, length.out = 20),
                      lambda.theta = 10^seq(0, -2, length.out = 20),
                      seed = 486)

# Plot CV results
plot(cvfit, type = "heatmap")
plot(cvfit, type = "scatter")



missoNet documentation built on Sept. 9, 2025, 5:55 p.m.