missoNet: Fit missoNet models with missing responses


missoNet R Documentation

Fit missoNet models with missing responses

Description

Fit a penalized multi-task regression with a response-network (\Theta) under missing responses. The method jointly estimates the coefficient matrix \mathbf{B} and the precision matrix \Theta via penalized likelihood with \ell_1 penalties on \mathbf{B} and the off-diagonal entries of \Theta.

Usage

missoNet(
  X,
  Y,
  rho = NULL,
  GoF = "eBIC",
  lambda.beta = NULL,
  lambda.theta = NULL,
  lambda.beta.min.ratio = NULL,
  lambda.theta.min.ratio = NULL,
  n.lambda.beta = NULL,
  n.lambda.theta = NULL,
  beta.pen.factor = NULL,
  theta.pen.factor = NULL,
  penalize.diagonal = NULL,
  beta.max.iter = 10000,
  beta.tol = 1e-05,
  theta.max.iter = 10000,
  theta.tol = 1e-05,
  eta = 0.8,
  eps = 1e-08,
  standardize = TRUE,
  standardize.response = TRUE,
  relax.net = FALSE,
  adaptive.search = FALSE,
  parallel = FALSE,
  cl = NULL,
  verbose = 1
)

Arguments

X

Numeric matrix (n \times p). Predictors (no missing values).

Y

Numeric matrix (n \times q). Responses, may contain NA/NaN.

rho

Optional numeric vector of length q. Working missingness probabilities; if NULL (default), estimated from Y.
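As a rough illustration of what "estimated from Y" could look like, the base-R sketch below takes the per-column fraction of missing entries. This is an assumption about a natural estimator, not a description of the package internals, and the toy matrix Y_toy is made up.

```r
# Toy response matrix (hypothetical values) containing NA and NaN entries
Y_toy <- matrix(
  c(1.2,  NA, 0.3,
    0.7, 0.1,  NA,
     NA, 0.4, 0.9,
    1.1, NaN, 0.2),
  nrow = 4, byrow = TRUE
)

# Column-wise fraction of missing responses; is.na() is TRUE for both NA and NaN
rho_hat <- colMeans(is.na(Y_toy))
rho_hat
```

A vector built this way has length q and could be passed as rho if a user wanted to fix the working probabilities explicitly.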

GoF

Character. Goodness-of-fit criterion: "AIC", "BIC", or "eBIC" (default).

lambda.beta, lambda.theta

Optional numeric vectors (or scalars). Candidate regularization paths for \mathbf{B} and \Theta. If NULL, paths are generated automatically.

lambda.beta.min.ratio, lambda.theta.min.ratio

Optional numerics in (0,1]. Ratio of the smallest to largest lambda when generating paths (ignored if the corresponding lambda.* is supplied).

n.lambda.beta, n.lambda.theta

Optional integers. Lengths of automatically generated lambda paths (ignored if the corresponding lambda.* is supplied).
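To make the roles of the min-ratio and path-length arguments concrete, here is a minimal sketch of a log-spaced path. Both the helper make_lambda_path and the log spacing are assumptions made for illustration (log spacing is a common convention in penalized regression software); how the package derives the largest lambda is not shown here.

```r
# Hypothetical helper: log-spaced path from lambda_max down to lambda_max * min_ratio
make_lambda_path <- function(lambda_max, min_ratio, n_lambda) {
  stopifnot(lambda_max > 0, min_ratio > 0, min_ratio <= 1, n_lambda >= 1)
  exp(seq(log(lambda_max), log(lambda_max * min_ratio), length.out = n_lambda))
}

# Five values spanning two orders of magnitude, largest first
path <- make_lambda_path(lambda_max = 1, min_ratio = 0.01, n_lambda = 5)
path
```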

beta.pen.factor

Optional p \times q non-negative matrix of element-wise penalty multipliers for \mathbf{B}. Inf = maximum penalty; 0 = no penalty for that coefficient. Default: all 1s (equal penalty).

theta.pen.factor

Optional q \times q non-negative matrix of element-wise penalty multipliers for \Theta. Off-diagonal entries control edge penalties; diagonal treatment is governed by penalize.diagonal. Inf = maximum penalty; 0 = no penalty for that entry. Default: all 1s (equal penalty).
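The effect of the multipliers 0, 1 and Inf can be seen on a single soft-thresholding step, which is the proximal operator of a weighted \ell_1 penalty. The sketch below is base R only; the helper soft_threshold and all numbers are illustrative assumptions, not the package's code.

```r
p <- 3; q <- 2

# Element-wise multipliers for B: 1 = default, 0 = unpenalized, Inf = forced to zero
beta_pf <- matrix(1, nrow = p, ncol = q)
beta_pf[1, ] <- 0      # never shrink the first predictor's coefficients
beta_pf[3, 2] <- Inf   # exclude predictor 3 from response 2

# Soft-thresholding: shrink each entry toward zero by its own threshold
soft_threshold <- function(B, thresh) sign(B) * pmax(abs(B) - thresh, 0)

# Made-up unpenalized coefficients (filled column by column)
B_raw <- matrix(c(0.5, -0.2, 0.05, 0.3, -0.4, 0.8), nrow = p, ncol = q)
lambda <- 0.1
B_shrunk <- soft_threshold(B_raw, lambda * beta_pf)
B_shrunk
```

Row 1 is left untouched, the entry with an infinite multiplier is set to zero regardless of its size, and every other entry is shrunk by lambda. A theta.pen.factor matrix would be built the same way as a symmetric q \times q matrix.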

penalize.diagonal

Logical or NULL. Whether to penalize the diagonal of \Theta. If NULL (default) the choice is made automatically.

beta.max.iter, theta.max.iter

Integers. Max iterations for the \mathbf{B} update (FISTA) and \Theta update (graphical lasso). Defaults: 10000.

beta.tol, theta.tol

Numerics > 0. Convergence tolerances for the \mathbf{B} and \Theta updates. Defaults: 1e-5.

eta

Numeric in (0,1). Backtracking line-search parameter for the \mathbf{B} update (default 0.8).

eps

Numeric in (0,1). Eigenvalue floor used to keep matrices positive definite during computation (default 1e-8).
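One common way to apply an eigenvalue floor is to clip the spectrum of a symmetric matrix from below. The sketch that follows is an assumption about the general technique, written in base R with a hypothetical helper floor_eigenvalues; the package may apply the floor differently.

```r
# Hypothetical helper: floor the eigenvalues of a symmetric matrix at eps
floor_eigenvalues <- function(S, eps = 1e-8) {
  S <- (S + t(S)) / 2                      # enforce exact symmetry
  eig <- eigen(S, symmetric = TRUE)
  vals <- pmax(eig$values, eps)            # lift small or negative eigenvalues to eps
  eig$vectors %*% (vals * t(eig$vectors))  # V diag(vals) V'
}

S <- matrix(c(1, 2, 2, 1), nrow = 2)       # indefinite: eigenvalues 3 and -1
S_pd <- floor_eigenvalues(S)
ev <- eigen(S_pd, symmetric = TRUE)$values
ev
```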

standardize

Logical. Standardize columns of X internally? Default TRUE.

standardize.response

Logical. Standardize columns of Y internally? Default TRUE.
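The algebra for mapping estimates from the standardized scale back to original units can be checked with ordinary least squares standing in for the penalized fit. This is a sketch under that substitution, using simulated data and base R only; it does not reproduce the package's internal code.

```r
set.seed(1)
n <- 50; p <- 3; q <- 2
X <- matrix(rnorm(n * p, mean = 5, sd = 2), n, p)
B_true <- matrix(c(1, 0, -2, 0.5, 3, 0), p, q)
Y <- X %*% B_true + matrix(rnorm(n * q), n, q)

# Center and scale the columns, then fit on the standardized scale
Xs <- scale(X); Ys <- scale(Y)
B_std <- qr.solve(Xs, Ys)                  # least-squares solution, no intercept needed

# Undo the scaling: B_jk (original) = B_jk (standardized) * sd(Y_k) / sd(X_j)
sx <- attr(Xs, "scaled:scale"); sy <- attr(Ys, "scaled:scale")
B_orig <- B_std * outer(1 / sx, sy)
mu_orig <- colMeans(Y) - drop(colMeans(X) %*% B_orig)

# Reference: fit directly on the original scale with an intercept
B_direct <- coef(lm(Y ~ X))
```

The rows of B_direct below the intercept agree with B_orig, and its first row agrees with mu_orig.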

relax.net

(Experimental) Logical. If TRUE, refit active edges of \Theta without \ell_1 penalty (de-biased network). Default FALSE.

adaptive.search

(Experimental) Logical. Use adaptive two-stage lambda search? Default FALSE.

parallel

Logical. Evaluate parts of the grid in parallel using a provided cluster? Default FALSE.

cl

Optional cluster from parallel::makeCluster() (required if parallel = TRUE).

verbose

Integer in {0, 1, 2}. 0 = silent, 1 = progress (default), 2 = detailed tracing (not supported in parallel mode).

Details

The conditional Gaussian model is

Y_i = \mu + X_i \mathbf{B} + E_i, \qquad E_i \sim \mathcal{N}_q(0, \Theta^{-1}),

where:

  • Y_i is the i-th observation of q responses

  • X_i is the i-th observation of p predictors

  • \mathbf{B} is the p \times q coefficient matrix

  • \Theta is the q \times q precision matrix

  • \mu is the intercept vector
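The model above can be simulated directly in base R, which may help when checking dimensions. All parameter values in this sketch are made up for illustration; it generates complete responses only and does not impose any missingness.

```r
set.seed(42)
n <- 200; p <- 4; q <- 3

X  <- matrix(rnorm(n * p), n, p)
mu <- c(1, -1, 0.5)
B  <- matrix(0, p, q); B[1, 1] <- 2; B[3, 2] <- -1.5   # sparse coefficients

# Sparse tridiagonal precision matrix (diagonally dominant, hence positive definite)
Theta <- diag(q)
Theta[cbind(1:2, 2:3)] <- 0.4
Theta[cbind(2:3, 1:2)] <- 0.4
Sigma <- solve(Theta)

# E_i ~ N_q(0, Theta^{-1}): chol() returns upper-triangular R with R'R = Sigma,
# so rows of Z %*% R have covariance Sigma when rows of Z are standard normal
E <- matrix(rnorm(n * q), n, q) %*% chol(Sigma)
Y <- matrix(mu, n, q, byrow = TRUE) + X %*% B + E
```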

The parameters are estimated by solving:

\min_{\mathbf{B}, \Theta \succ 0} \quad g(\mathbf{B}, \Theta) + \lambda_B \|\mathbf{B}\|_1 + \lambda_\Theta \|\Theta\|_{1,\mathrm{off}}

where g is the negative log-likelihood.
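For complete data, the objective can be evaluated in a few lines. The sketch below assumes g is taken as tr(S \Theta) - log det \Theta with S the residual covariance, i.e. the Gaussian negative log-likelihood up to constants and scaling; that scaling and the helper name penalized_objective are assumptions, and the missing-data correction is not included.

```r
# Hypothetical helper: penalized objective for complete data (no missing Y)
penalized_objective <- function(X, Y, mu, B, Theta, lambda_B, lambda_Theta) {
  n <- nrow(Y)
  R <- Y - matrix(mu, n, ncol(Y), byrow = TRUE) - X %*% B  # residuals
  S <- crossprod(R) / n                                     # residual covariance
  logdet <- as.numeric(determinant(Theta, logarithm = TRUE)$modulus)
  g <- sum(S * Theta) - logdet      # tr(S Theta) - log det(Theta) for symmetric Theta
  off <- Theta; diag(off) <- 0      # only off-diagonal entries are penalized
  g + lambda_B * sum(abs(B)) + lambda_Theta * sum(abs(off))
}

# Tiny made-up example: two observations, one predictor, two responses
X <- matrix(0, 2, 1); Y <- diag(2)
B <- matrix(0, 1, 2); mu <- c(0, 0)
Theta <- matrix(c(1, 0.5, 0.5, 1), 2, 2)
obj <- penalized_objective(X, Y, mu, B, Theta, lambda_B = 0.1, lambda_Theta = 0.1)
obj
```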

Missing values in Y are accommodated through unbiased estimating equations using column-wise observation probabilities. Internally, X and Y may be standardized for numerical stability; returned estimates are re-scaled back to the original units.
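The idea behind such estimating equations can be sketched with inverse-probability-corrected cross-products: zero-fill the missing responses, then divide by the probability that the relevant entries were observed. The helper unbiased_crossprods below is hypothetical and assumes responses are missing completely at random and independently across columns; it illustrates the principle rather than the package's exact estimator.

```r
# Hypothetical helper: unbiased surrogates for X'Y and Y'Y under independent,
# completely-at-random missingness with per-column probabilities rho
unbiased_crossprods <- function(X, Y, rho = colMeans(is.na(Y))) {
  obs <- 1 - rho                         # per-column observation probabilities
  Z <- Y; Z[is.na(Z)] <- 0               # zero-fill the missing responses
  XtY <- sweep(crossprod(X, Z), 2, obs, "/")
  W <- outer(obs, obs); diag(W) <- obs   # pairs need obs_k * obs_l, diagonal only obs_k
  list(XtY = XtY, YtY = crossprod(Z) / W)
}

# Made-up data: the first response is missing in the second observation
X <- matrix(c(1, 1), 2, 1)
Y <- matrix(c(2, NA, 1, 3), 2, 2)
res <- unbiased_crossprods(X, Y)
res$XtY
res$YtY
```

With no missing entries every observation probability equals 1 and the surrogates reduce to the usual crossprod(X, Y) and crossprod(Y).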

The grid search spans lambda.beta and lambda.theta. The optimal pair is selected by the user-chosen goodness-of-fit criterion GoF: "AIC", "BIC", or "eBIC" (default). If adaptive.search = TRUE, a two-stage pre-optimization narrows the grid before the main search (faster on large problems, with a small risk of missing the global optimum).

Value

A list of class "missoNet" with components:

est.min

List at the selected lambda pair: Beta (p \times q), Theta (q \times q), intercept mu (length q), lambda.beta, lambda.theta, lambda.beta.idx, lambda.theta.idx, scalar gof (AIC/BIC/eBIC at optimum), and (if requested) relax.net.

rho

Length-q vector of working missingness probabilities.

lambda.beta.seq, lambda.theta.seq

Unique lambda values explored along the grid for \mathbf{B} and \Theta.

penalize.diagonal

Logical indicating whether the diagonal of \Theta was penalized.

beta.pen.factor, theta.pen.factor

Penalty factor matrices actually used.

param_set

List with fitting diagnostics: n, p, q, standardize, standardize.response, the vector of criterion values gof, and the evaluated grids gof.grid.beta, gof.grid.theta (length equals number of fitted models).
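The three grid vectors in param_set line up element by element, so they can be combined into a table of fitted models. The sketch below uses a mock list with the documented field names and made-up values; with a real fit, fit$param_set would take the place of the mock.

```r
# Mock of the documented param_set fields (values are made up for illustration)
param_set <- list(
  gof            = c(310.2, 295.7, 301.4, 298.9, 289.3, 300.1),
  gof.grid.beta  = c(1.0, 1.0, 0.1, 0.1, 0.01, 0.01),
  gof.grid.theta = c(1.0, 0.1, 1.0, 0.1, 1.0, 0.1)
)

# One row per fitted model, sorted by criterion value (smaller is better)
grid <- data.frame(
  lambda.beta  = param_set$gof.grid.beta,
  lambda.theta = param_set$gof.grid.theta,
  gof          = param_set$gof
)
grid <- grid[order(grid$gof), ]
best <- grid[1, ]
best
```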

Author(s)

Yixiao Zeng <yixiao.zeng@mail.mcgill.ca>, Celia M. T. Greenwood

References

Zeng, Y., et al. (2025). Multivariate regression with missing response data for modelling regional DNA methylation QTLs. arXiv:2507.05990.

See Also

cv.missoNet for cross-validated selection; generic methods such as plot() and predict() for objects of class "missoNet".

Examples

sim <- generateData(n = 120, p = 10, q = 6, rho = 0.1)
X <- sim$X; Y <- sim$Z


# Fit with defaults (criterion = eBIC)
fit1 <- missoNet(X, Y)
# Extract the optimal estimates
Beta.hat <- fit1$est.min$Beta
Theta.hat <- fit1$est.min$Theta

# Plot missoNet results
plot(fit1, type = "heatmap")
plot(fit1, type = "scatter")

# Provide short lambda paths
fit2 <- missoNet(
  X, Y,
  lambda.beta  = 10^seq(0, -2, length.out = 5),
  lambda.theta = 10^seq(0, -2, length.out = 5),
  GoF = "BIC"
)

# Fit at a single (lambda.beta, lambda.theta) pair
fit3 <- missoNet(
  X, Y,
  lambda.beta  = 0.1,
  lambda.theta = 0.1
)

# De-biased network on the active set
fit4 <- missoNet(X, Y, relax.net = TRUE, verbose = 0)

# Adaptive search for large problems
fit5 <- missoNet(X = X, Y = Y, adaptive.search = TRUE, verbose = 0)

# Parallel (requires a cluster)
library(parallel)
cl <- makeCluster(2)
fit_par <- missoNet(X, Y, parallel = TRUE, cl = cl, verbose = 0)
stopCluster(cl)



missoNet documentation built on Sept. 9, 2025, 5:55 p.m.