pred_ensembel: Predict Interactions via Ensemble Learning Method


pred_ensembel R Documentation

Predict Interactions via Ensemble Learning Method

Description

This function uses an ensemble of classifiers to predict interactions from a sequence-based dataset. The ensemble averages the prediction scores produced by its individual classifiers to improve prediction performance.
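
The averaging step described above can be illustrated with a minimal base R sketch. The score values, the interaction names, and the 0.5 decision threshold are assumptions made up for illustration; they are not output of pred_ensembel, and the package may combine or threshold scores differently.

```r
# Toy positive-class probabilities from three classifiers
# (made-up values, not HPiP output).
scores <- data.frame(
  avNNet    = c(0.90, 0.20, 0.60),
  svmRadial = c(0.80, 0.10, 0.40),
  ranger    = c(0.70, 0.30, 0.80)
)
rownames(scores) <- c("ppi_1", "ppi_2", "ppi_3")

# Ensemble score: unweighted mean across classifiers for each interaction.
ensemble_score <- rowMeans(scores)

# Assumed decision threshold of 0.5 (hypothetical; chosen for illustration).
ensemble_label <- ifelse(ensemble_score >= 0.5, "Positive", "Negative")

print(data.frame(scores, ensemble_score, ensemble_label))
```

Averaging tends to smooth out disagreements between models: here ppi_3 is scored below 0.5 by one classifier but still ends up above the assumed threshold once the three scores are combined.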

Usage

pred_ensembel(
  features,
  gold_standard,
  classifier = c("avNNet", "svmRadial", "ranger"),
  resampling.method = "cv",
  ncross = 2,
  repeats = 2,
  verboseIter = TRUE,
  plots = TRUE,
  filename = "plots.pdf"
)

Arguments

features

A data frame with host-pathogen protein-protein interactions (HP-PPIs) in the first column, and features to be passed to the classifier in the remaining columns.

gold_standard

A data frame of gold-standard HP-PPIs with a class label indicating whether each PPI is positive or negative.
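
As a rough illustration of the two input tables, the sketch below builds a toy data frame and splits it the same way the Examples section splits example_data. The column names (PPI, class, feat_a, feat_b), the ID format, and the label strings are hypothetical; only the column positions follow the argument descriptions above.

```r
# Hypothetical toy table: column 1 = interaction ID, column 2 = class label,
# remaining columns = numeric features. All names and values are made up.
toy <- data.frame(
  PPI    = c("P1~H1", "P1~H2", "P2~H1", "P2~H2"),
  class  = c("Positive", "Negative", NA, "Positive"),
  feat_a = c(0.12, 0.85, 0.40, 0.33),
  feat_b = c(1.5, 0.2, 0.9, 1.1),
  stringsAsFactors = FALSE
)

# features: interaction IDs first, then the feature columns.
features <- toy[, -2]

# gold_standard: IDs plus class label, keeping only the labelled pairs.
gold_standard <- na.omit(toy[, c(1, 2)])

str(features)
str(gold_standard)
```

Unlabelled pairs (NA in the label column) stay in features so they can still be scored, while gold_standard keeps only the rows usable for training.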

classifier

The type of classifier to use. See caret for the available classifiers.

resampling.method

The resampling method: 'boot', 'boot632', 'optimism_boot', 'boot_all', 'cv', 'repeatedcv', 'LOOCV', 'LGOCV'; defaults to 'cv'. See trainControl for more details.

ncross

Number of partitions for cross-validation; defaults to 2. See trainControl for more details.

repeats

Number of repeats, for repeated k-fold cross-validation only; defaults to 2. See rfeControl for more details.
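
To make the roles of ncross and repeats concrete, here is a minimal base R sketch of repeated k-fold partitioning. The helper make_folds and the variable n_rows are hypothetical names introduced for illustration; the function itself delegates resampling to caret, whose internal fold construction may differ (for example, by stratifying on the class label).

```r
# Hypothetical helper: randomly assign n row indices to k folds.
make_folds <- function(n, k) {
  fold_id <- sample(rep(seq_len(k), length.out = n))
  split(seq_len(n), fold_id)
}

set.seed(42)   # fixed seed so the partitions are reproducible
n_rows  <- 10  # number of labelled interactions (toy value)
ncross  <- 2   # number of partitions per repeat
repeats <- 2   # number of times the partitioning is redone

# One list of folds per repeat; each fold holds the held-out row indices.
resamples <- lapply(seq_len(repeats), function(r) make_folds(n_rows, ncross))

str(resamples)
```

With ncross = 2 and repeats = 2, each model would be fitted four times per candidate hyperparameter setting, which is why small values keep the example fast.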

verboseIter

Logical value indicating whether to report the status of the training process; defaults to TRUE.

plots

Logical value indicating whether to plot the performance of the ensemble learning algorithm compared to the individual classifiers; defaults to TRUE. If set to TRUE, plots are saved in the current working directory. These plots are:

  • pr_plot - Precision-recall plot of ensemble classifier vs selected individual classifiers.

  • roc_plot - ROC plot of ensemble classifier vs selected individual classifiers.

  • points_plot - Plot of accuracy, F1-score, positive predictive value (PPV), sensitivity (SE), and Matthews correlation coefficient (MCC) of ensemble classifier vs selected individual classifiers.

filename

A character string giving the name of the output PDF file.

Details

pred_ensembel

Value

Ensemble_training_output

  • prediction score - Prediction scores for whole dataset from each individual classifier.

  • Best - Selected hyperparameters.

  • Parameter range - Tested hyperparameters.

  • prediction_score_test - Score probabilities for test data from each individual classifier.

  • class_label - Class probabilities for test data from each individual classifier.

classifier_performance

  • cm - A confusion matrix.

  • ACC - Accuracy.

  • SE - Sensitivity.

  • SP - Specificity.

  • PPV - Positive Predictive Value.

  • F1 - F1-score.

  • MCC - Matthews correlation coefficient.

  • Roc_Object - A list of elements. See roc for more details.

  • PR_Object - A list of elements. See pr.curve for more details.
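
For reference, the scalar metrics listed above follow the standard definitions from a binary confusion matrix. The sketch below computes them in base R from made-up counts; it shows the textbook formulas and is not necessarily how the package computes them internally.

```r
# Toy confusion-matrix counts (made-up numbers).
tp <- 40  # true positives
fp <- 10  # false positives
fn <- 5   # false negatives
tn <- 45  # true negatives

acc <- (tp + tn) / (tp + fp + fn + tn)  # accuracy
se  <- tp / (tp + fn)                   # sensitivity (recall)
sp  <- tn / (tn + fp)                   # specificity
ppv <- tp / (tp + fp)                   # positive predictive value (precision)
f1  <- 2 * ppv * se / (ppv + se)        # harmonic mean of PPV and SE

# Matthews correlation coefficient: ranges from -1 to 1.
mcc <- (tp * tn - fp * fn) /
  sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

round(c(ACC = acc, SE = se, SP = sp, PPV = ppv, F1 = f1, MCC = mcc), 3)
```

MCC uses all four cells of the confusion matrix, so it is often more informative than accuracy when positive and negative interactions are imbalanced.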

predicted_interactions - The input data frame of pairwise interactions, including classifier scores averaged across all models.

Author(s)

Matineh Rahmatbakhsh, matinerb.94@gmail.com

Examples

library(HPiP)
data('example_data')
# Column 1 holds the interaction IDs, column 2 the class labels
features <- example_data[, -2]
gd <- example_data[, c(1, 2)]
# Keep only the labelled pairs
gd <- na.omit(gd)
ppi <- pred_ensembel(
  features, gd,
  classifier = c("avNNet", "svmRadial", "ranger"),
  resampling.method = "cv",
  ncross = 2,
  verboseIter = FALSE,
  plots = FALSE,
  filename = "plots.pdf"
)
# Extract predicted interactions
pred_interaction <- ppi[["predicted_interactions"]]

mrbakhsh/HPiP documentation built on March 28, 2023, 4:35 p.m.