mpm.rfc: Morphonode default Random Forest Classifier (RFC) ensemble

mpm.rfc    R Documentation

Morphonode default Random Forest Classifier (RFC) ensemble

Description

RFC ensemble of 5 classifiers, based on 948 simulated ultrasound profiles (440 malignant and 508 non-malignant). The simulated dataset is divided into 5 random subsets and a nested 5-fold cross-validation (CV) is performed. For each CV cycle, 4 of the 5 partitions are used as the training set and the remaining one as the validation set. Each RFC is trained over 10000 bootstrap trees, with 3 of the 14 variables randomly chosen at each tree branching. Bootstrapping enables independent prediction error estimation, using out-of-bag (OOB) samples. OOB error estimation allows the calculation of mean decrease in accuracy (MDA) and mean decrease in Gini impurity (MDG), measuring ultrasound feature-level contribution to RFC predictive accuracy and classification entropy, respectively. These measures enable ultrasound feature ranking, based on the average of minmax-normalized MDA and MDG values (i.e., the most important feature scores 100, while the least important tends to 0). Each CV cycle provides a dichotomous phenotype classification into malignant (y = 1) and non-malignant (y = 0), an OOB prediction error estimate, and a subject-level estimate of the prediction uncertainty through Brier score calculation. Given a new ultrasound profile, the resulting 5 RFCs yield independent predictions and the majority wins, with higher priority given to the RFCs with smaller OOB error.
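
The final voting step can be illustrated with a minimal sketch in base R. The description does not state how "higher priority" is implemented, so weighting each vote by (1 - OOB error) is an assumption made here for illustration, not necessarily the rule used by the package. The votes and OOB error values are hypothetical.

```r
# Hypothetical votes from the 5 classifiers (1 = malignant, 0 = non-malignant)
votes <- c(1, 0, 1, 1, 0)

# Hypothetical OOB error of each classifier (illustrative values only)
oob <- c(0.10, 0.12, 0.08, 0.15, 0.11)

# Assumption: a classifier with smaller OOB error gets a larger weight
w <- 1 - oob

# Weighted fraction of "malignant" votes
score <- sum(w * votes) / sum(w)

# Weighted majority decision
prediction <- as.integer(score > 0.5)
```

With five binary votes a plain majority can never tie, so the weights only change the outcome when a 3-to-2 split favours the classifiers with much larger OOB error.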

Usage

mpm.rfc

Format

"mpm.rfc" is a list of 5 objects:

  1. "training", a list of 5 data.frames (T1-5) corresponding to the 5 RFC training sets;

  2. "validation", a list of 5 data.frames (V1-5) corresponding to the 5 RFC validation sets;

  3. "rfc", a list of 5 randomForest objects, corresponding to the 5 classifiers of the ensemble;

  4. "ranking", a data.frame reporting MDA and MDG values, as well as their minmax-normalized values (fA and fG, respectively), and the overall ranking score (f) being the average of fA and fG;

  5. "performance", a list of 7 values summarizing the performance of the RFC ensemble: 2x2 contingency table, sensitivity, specificity, precision (PPV), negative predictive value (NPV), F1 score, and predictive accuracy.
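
As a rough illustration of how the "ranking" and "performance" elements relate to their inputs, the sketch below recomputes them in base R from made-up numbers. The feature names, the MDA and MDG values, and the contingency counts are all hypothetical. The 0-100 scaling follows the Description, but whether fA and fG are stored on that scale inside "mpm.rfc" is an assumption.

```r
# --- Ranking: hypothetical MDA and MDG values for three placeholder features
ranking <- data.frame(
  MDA = c(50, 20, 10),
  MDG = c(40, 30, 10),
  row.names = c("featureA", "featureB", "featureC")
)

# Min-max normalization, rescaled so the top feature scores 100 (assumed scale)
minmax <- function(v) 100 * (v - min(v)) / (max(v) - min(v))

ranking$fA <- minmax(ranking$MDA)            # normalized MDA
ranking$fG <- minmax(ranking$MDG)            # normalized MDG
ranking$f  <- (ranking$fA + ranking$fG) / 2  # overall ranking score

# Sort features from most to least important
ranking <- ranking[order(ranking$f, decreasing = TRUE), ]

# --- Performance: hypothetical 2x2 contingency counts
TP <- 40; FN <- 10; FP <- 5; TN <- 45

sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
ppv         <- TP / (TP + FP)   # precision
npv         <- TN / (TN + FN)
f1          <- 2 * ppv * sensitivity / (ppv + sensitivity)
accuracy    <- (TP + TN) / (TP + FN + FP + TN)
```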

References

Fragomeni SM, Moro F, Palluzzi F, Mascilini F, Rufini V, Collarino A, Inzani F, Giacò L, Scambia G, Testa AC, Garganese G (2022). Evaluating the risk of inguinal lymph node metastases before surgery using the Morphonode Predictive Model: a prospective diagnostic study. Ultrasound xx Xxxxxxxxxx xxx Xxxxxxxxxx. 00(0):000-000. <https://doi.org/00.0000/00000000000000000000>

Examples


# Create a simulated malignant ultrasound profile
x <- new.profile(us.simulate(y = 1))

# Launch the Morphonode Predictive Model
u <- us.predict(x)


Morphonodepredictivemodel/morphonode documentation built on Feb. 15, 2023, 4:51 a.m.