DfiMI_lasso | R Documentation |
Performs multiple imputation of the response variable Y over R independent runs, with M stochastic imputations per run. Missing Y values are imputed by LASSO regression on the predictors.
DfiMI_lasso(data, R, M)
data | A |
R | Positive integer – number of simulation runs for stable coefficient estimation. |
M | Positive integer – number of multiple imputations per run. |
This function extends the Distributed Full-information Multiple Imputation (DfiMI) approach by using LASSO regression for imputing missing values in the response variable Y. LASSO regression is particularly useful for high-dimensional predictor spaces and can handle multicollinearity among predictors. The function performs the following steps:
1. Initialize missing values in Y.
2. Fit LASSO regression models on complete cases.
3. Average coefficients across multiple imputations and runs.
4. Predict missing values using the final averaged coefficients.
The function requires the glmnet package for LASSO regression.
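The steps above can be sketched roughly as follows. This is a minimal illustration, not the source of DfiMI_lasso: the helper name sketch_lasso_impute, the fixed lambda, the mean initialization, the bootstrap resampling of complete cases, and the returned element name beta are all assumptions made for the example. Only glmnet(), coef(), and the Yhat name used in the Examples come from glmnet or this page.

```r
library(glmnet)

# Hypothetical helper illustrating the four steps; NOT the DfiMI_lasso source.
sketch_lasso_impute <- function(data, R = 3, M = 5, lambda = 0.1) {
  x <- as.matrix(data[, setdiff(names(data), "Y"), drop = FALSE])
  y <- data$Y
  miss <- is.na(y)

  # Step 1: initialize missing Y (assumption: mean of the observed values).
  # These placeholders are overwritten in step 4.
  y_imp <- y
  y_imp[miss] <- mean(y, na.rm = TRUE)

  x_obs <- x[!miss, , drop = FALSE]
  y_obs <- y[!miss]
  n_obs <- length(y_obs)

  # Steps 2-3: fit LASSO (alpha = 1) on complete cases and accumulate
  # coefficients. Assumption: each of the R * M fits uses a bootstrap
  # resample of the complete cases so that the fits differ.
  beta_sum <- numeric(ncol(x) + 1)
  for (r in seq_len(R)) {
    for (m in seq_len(M)) {
      idx <- sample.int(n_obs, replace = TRUE)
      fit <- glmnet(x_obs[idx, , drop = FALSE], y_obs[idx],
                    alpha = 1, lambda = lambda)
      # coef() returns a sparse (p + 1) x 1 matrix: intercept, then slopes.
      beta_sum <- beta_sum + as.matrix(coef(fit))[, 1]
    }
  }
  beta_avg <- beta_sum / (R * M)

  # Step 4: predict the missing values from the averaged coefficients.
  y_imp[miss] <- drop(cbind(1, x[miss, , drop = FALSE]) %*% beta_avg)

  list(Yhat = y_imp, beta = beta_avg)
}
```

In this sketch the penalty is fixed for brevity; in practice a value chosen by cross-validation (for example with cv.glmnet) would usually be preferable. Observed Y values are left unchanged, and only the missing entries are replaced.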
A named list containing:
Yhat | Numeric vector – original Y values with missing values replaced by imputations. |
Numeric vector – final regression coefficients.
set.seed(123)
data <- data.frame(
  Y  = c(rnorm(50), rep(NA, 10)),  # 50 observed + 10 missing
  X1 = rnorm(60),
  X2 = rnorm(60)
)
res <- DfiMI_lasso(data, R = 3, M = 5)
head(res$Yhat)