Multi-task regression and network estimation with missing responses — no imputation required!
missoNet jointly estimates regression coefficients and the response network (precision matrix) from multi-response data where some responses are missing (MCAR/MAR/MNAR). Estimation is based on unbiased estimating equations with separate L1 regularization for coefficients and the precision matrix, enabling robust multi-trait analysis under incomplete outcomes.
It targets both the regression coefficients (Beta) and the conditional dependency structure (Theta). If you only have a single response, classical lasso/elastic net (e.g., glmnet) is simpler and likely faster.
CRAN (stable)
install.packages("missoNet")
GitHub (development)
# install.packages("devtools")
devtools::install_github("yixiao-zeng/missoNet", build_vignettes = TRUE)
library(missoNet)
# Example data with ~15% missing responses (MCAR)
sim <- generateData(n = 300, p = 50, q = 10, rho = 0.15, missing.type = "MCAR")
# Fit along two lambda paths; choose via BIC (no CV)
fit <- missoNet(X = sim$X, Y = sim$Z, GoF = "BIC")
# Extract estimates at the selected solution
Beta <- fit$est.min$Beta # p x q regression coefficients
Theta <- fit$est.min$Theta # q x q precision (conditional network)
# Visualize selection path
plot(fit, type = "scatter")
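Once Theta has been extracted, it can be read as a network: a nonzero off-diagonal entry means two responses are conditionally dependent given the others, and the standard rescaling -Theta[i, j] / sqrt(Theta[i, i] * Theta[j, j]) gives their partial correlation. The sketch below uses a small hypothetical matrix in place of fit$est.min$Theta so that it runs with base R alone; the values and response names are made up for illustration.

```r
# Toy 3 x 3 precision matrix (hypothetical values standing in for fit$est.min$Theta)
Theta <- matrix(c( 2.0, -0.8,  0.0,
                  -0.8,  2.0,  0.5,
                   0.0,  0.5,  1.0),
                nrow = 3, byrow = TRUE,
                dimnames = list(paste0("Y", 1:3), paste0("Y", 1:3)))

# Partial correlation of responses i and j given all other responses:
# -Theta[i, j] / sqrt(Theta[i, i] * Theta[j, j])
d <- sqrt(diag(Theta))
pcor <- -Theta / outer(d, d)
diag(pcor) <- 1

# Edges of the conditional network: nonzero entries in the upper triangle
idx <- which(upper.tri(Theta) & Theta != 0, arr.ind = TRUE)
edges <- data.frame(from = rownames(Theta)[idx[, "row"]],
                    to   = colnames(Theta)[idx[, "col"]],
                    pcor = pcor[idx])
print(edges)
```

With a real fit, replacing the toy matrix by fit$est.min$Theta should give the estimated edge list, assuming the returned matrix carries the response names as dimnames.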
# 5-fold CV over (lambda.beta, lambda.theta)
cvfit <- cv.missoNet(X = sim$X, Y = sim$Z, kfold = 5)
# Inspect CV heatmap and selected models (min and 1-SE variants)
plot(cvfit, type = "heatmap")
# Predict responses on new data
Y_hat <- predict(cvfit, newx = sim$X, s = "lambda.min")
Tip: Try s = "lambda.1se.beta" or "lambda.1se.theta" for more conservative sparsity when available.
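For readers unfamiliar with 1-SE selection, here is a minimal sketch of the general one-standard-error rule along a single penalty path: pick the largest penalty whose CV error stays within one standard error of the minimum. The numbers and variable names are hypothetical, and this is the textbook rule rather than missoNet's internal code, which works over two penalties (lambda.beta and lambda.theta).

```r
# Hypothetical CV summary along one penalty path (illustrative numbers only)
lambda  <- c(1.00, 0.50, 0.25, 0.10, 0.05)
cv_mean <- c(1.40, 1.10, 0.98, 0.95, 0.97)  # mean CV error per lambda
cv_se   <- c(0.06, 0.05, 0.05, 0.04, 0.05)  # standard error of that mean

# Penalty with the smallest mean CV error
i_min      <- which.min(cv_mean)
lambda_min <- lambda[i_min]

# One-standard-error rule: largest lambda whose error is within
# one SE of the minimum, giving a sparser model at similar error
threshold  <- cv_mean[i_min] + cv_se[i_min]
lambda_1se <- max(lambda[cv_mean <= threshold])

cat("lambda.min:", lambda_min, " lambda.1se:", lambda_1se, "\n")
```

A larger penalty shrinks more coefficients (or network edges) to zero, which is why the 1-SE choices tend to be more conservative.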
library(parallel)
cl <- makeCluster(max(1, detectCores() - 1))
cvfit <- cv.missoNet(X = sim$X, Y = sim$Z, kfold = 5,
                     parallel = TRUE, cl = cl)
stopCluster(cl)
# Reduce the penalty for predictors believed important a priori
p <- ncol(sim$X); q <- ncol(sim$Z)
beta.pen.factor <- matrix(1, p, q)
beta.pen.factor[c(1, 2), ] <- 0.1
fit <- missoNet(X = sim$X, Y = sim$Z,
                beta.pen.factor = beta.pen.factor)
fit <- missoNet(X = sim$X, Y = sim$Z,
                adaptive.search = TRUE,
                n.lambda.beta = 50,
                n.lambda.theta = 50)
vignette("missoNet-introduction")
vignette("missoNet-cross-validation")
vignette("missoNet-case-study")
If vignettes are not available from CRAN binaries on your platform, install from source using the GitHub command above with build_vignettes = TRUE.
Actual performance will depend on sparsity, signal-to-noise, and missingness mechanisms.
Great for
Not ideal for
- Single-response regression (use glmnet or similar)
- Extremely sparse information (e.g., >50% missing responses across most traits)
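A quick way to judge whether your data fall into the second case is to check the missingness rate of each response before fitting. The sketch below uses a small simulated matrix as a hypothetical stand-in for your own response matrix (for example sim$Z); the 50% cut-off mirrors the rule of thumb above and is not a hard limit enforced by the package.

```r
set.seed(1)

# Toy response matrix: 100 samples x 4 traits (hypothetical stand-in for sim$Z)
Z <- matrix(rnorm(400), nrow = 100, ncol = 4,
            dimnames = list(NULL, paste0("trait", 1:4)))
Z[1:10, 1] <- NA   # 10% missing in trait1
Z[1:60, 2] <- NA   # 60% missing in trait2
Z[1:70, 3] <- NA   # 70% missing in trait3

# Fraction of missing values per response
miss_rate <- colMeans(is.na(Z))
print(round(miss_rate, 2))

# Flag traits above the 50% rule of thumb
heavy <- miss_rate > 0.5
cat("Traits with >50% missing:", names(miss_rate)[heavy], "\n")
cat("Share of traits flagged:", mean(heavy), "\n")
```

If most traits are flagged, consider dropping the sparsest responses or collecting more complete outcomes before fitting.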
If you use missoNet in your research, please cite:
@article{zeng2025missonet,
title = {Multivariate regression with missing response data for modelling regional DNA methylation QTLs},
author = {Zeng, Yixiao and Alam, Shomoita and Bernatsky, Sasha and Hudson, Marie and Colmegna, In{\'e}s and Stephens, David A and Greenwood, Celia MT and Yang, Archer Y},
journal = {arXiv preprint arXiv:2507.05990},
year = {2025},
url = {https://arxiv.org/abs/2507.05990}
}
Contributions and issues are welcome! Please open a discussion or pull request on the GitHub repository.
GPL-2. See the LICENSE file.