2. Diagnostic Workflow

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(E2E)
# Set up a 2-core cluster for parallel processing in this vignette.
# R CMD check (e.g., on CRAN and other CI/CD platforms) limits vignettes to
# at most two cores, so the cluster is kept small.
cl <- parallel::makeCluster(2)
doParallel::registerDoParallel(cl)

Diagnostic Models (Classification)

This track is dedicated to binary classification tasks.
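
The examples below use two data frames, train_dia and test_dia. Based on how they are accessed later in this vignette, each is expected to contain a sample identifier column, a binary outcome column coded 0/1, and numeric feature columns. As an illustration only (the real objects are prepared separately and may differ), a toy data frame with that layout could be built like this:

# Hypothetical toy data with the layout the diagnostic functions expect:
# a sample ID, a 0/1 outcome (1 = positive), and numeric features.
set.seed(1)
toy_dia <- data.frame(
  sample  = paste0("S", 1:100),
  outcome = rbinom(100, size = 1, prob = 0.4),
  gene1   = rnorm(100),
  gene2   = rnorm(100),
  gene3   = rnorm(100)
)
head(toy_dia)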

1. Initialization

First, initialize the diagnostic modeling system. This registers all built-in classification models.

initialize_modeling_system_dia()

2. Training Single Models with models_dia

The models_dia function is the gateway to training one or more standard classification models.

Basic Usage

By default, models_dia runs all registered models. For this demonstration, we'll run a subset to save time.

# To run all, use model = "all_dia" or omit the parameter.
results_all_dia <- models_dia(train_dia, model = c("rf", "lasso", "xb"))

# Print a summary for a specific model (e.g., Random Forest)
print_model_summary_dia("rf", results_all_dia$rf)

Advanced Usage & Customization

You can precisely control the modeling process by specifying parameters such as hyperparameter tuning, the random seed, per-model classification thresholds (either a named strategy such as "f1" or "youden", or a fixed numeric cutoff), and the labels used for the positive and negative classes.

# Run a specific subset of models with tuning enabled and custom thresholds
results_dia_custom <- models_dia(
  data = train_dia,
  model = c("rf", "lasso", "xb"),
  tune = TRUE,
  seed = 123,
  threshold_choices = list(rf = "f1", lasso = 0.6, xb = "youden"),
  positive_label_value = 1,
  negative_label_value = 0,
  new_positive_label = "Case",
  new_negative_label = "Control"
)

# View the custom results
print_model_summary_dia("rf", results_dia_custom$rf)

3. Ensemble Modeling

Bagging (bagging_dia)

Builds a Bagging ensemble by training copies of a base model on multiple bootstrap samples of the training data and aggregating their predictions.

# Create a Bagging ensemble with XGBoost as the base model
# n_estimators is reduced for faster execution in this example.
bagging_xb_results <- bagging_dia(train_dia, base_model_name = "xb", n_estimators = 5)
print_model_summary_dia("Bagging (XGBoost)", bagging_xb_results)

Voting (voting_dia)

Combines the predictions of multiple pre-trained models. Soft voting averages the predicted class probabilities, whereas hard voting takes a majority vote of the predicted class labels.

# Create a soft voting ensemble from the top models
voting_soft_results <- voting_dia(
  results_all_models = results_all_dia,
  data = train_dia,
  type = "soft"
)
print_model_summary_dia("Voting (Soft)", voting_soft_results)

Stacking (stacking_dia)

Uses predictions from base models as features to train a final meta-model.

# Create a Stacking ensemble with Lasso as the meta-model
stacking_lasso_results <- stacking_dia(
  results_all_models = results_all_dia,
  data = train_dia,
  meta_model_name = "lasso"
)
print_model_summary_dia("Stacking (Lasso)", stacking_lasso_results)

Handling Imbalanced Data (imbalance_dia)

Implements the EasyEnsemble algorithm for class-imbalanced data: the majority class is repeatedly undersampled to form balanced subsets, a base model is trained on each subset, and their predictions are aggregated.

# Create an EasyEnsemble with XGBoost as the base model
# n_estimators is reduced for faster execution.
results_imbalance_dia <- imbalance_dia(train_dia, base_model_name = "xb", n_estimators = 5, seed = 123)
print_model_summary_dia("Imbalance (XGBoost)", results_imbalance_dia)

4. Applying Models to New Data (apply_dia)

Use a trained model object to make predictions on a new, unseen dataset.

# Apply the trained Bagging model to the test set
bagging_pred_new <- apply_dia(
  trained_model_object = bagging_xb_results$model_object,
  new_data = test_dia,
  label_col_name = "outcome",
  pos_class = "Positive",
  neg_class = "Negative"
)

# Evaluate these new predictions
eval_results_new <- evaluate_model_dia(
  precomputed_prob = bagging_pred_new$score,
  y_data = factor(test_dia$outcome, levels = c(0, 1), labels = c("Negative", "Positive")),  # 1 = "Positive", consistent with positive_label_value = 1 above
  sample_ids = test_dia$sample,
  threshold_strategy = "default",
  pos_class = "Positive",
  neg_class = "Negative"
)
print(eval_results_new$evaluation_metrics)
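
The score vector returned by apply_dia is assumed to be aligned row-by-row with new_data (the evaluation call above relies on this), so it can be combined with the sample identifiers and true labels for a quick sanity check or export:

# Tabulate predicted scores next to the true labels, highest scores first
pred_table <- data.frame(
  sample  = test_dia$sample,
  outcome = test_dia$outcome,
  score   = bagging_pred_new$score
)
head(pred_table[order(-pred_table$score), ])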

5. Visualization (figure_dia)

Generate high-quality plots to evaluate model performance.

# ROC Curve
p1 <- figure_dia(type = "roc", data = results_imbalance_dia)
# plot(p1)

# Precision-Recall Curve
p2 <- figure_dia(type = "prc", data = results_imbalance_dia)
# plot(p2)

# Confusion Matrix
p3 <- figure_dia(type = "matrix", data = results_imbalance_dia)
# plot(p3)

# Stop the parallel cluster
parallel::stopCluster(cl)
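
If the objects returned by figure_dia are ggplot objects (an assumption; this vignette only stores them), they can be written to disk with ggplot2::ggsave, for example:

# Hypothetical export of the ROC curve, assuming p1 is a ggplot object
ggplot2::ggsave("roc_curve_imbalance.png", plot = p1, width = 6, height = 5, dpi = 300)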

