pred_ensembel: Predict Interactions via Ensemble Learning Method
In mrbakhsh/HPiP: Host-Pathogen Interaction Prediction


Description

This function uses an ensemble of classifiers to predict interactions from a sequence-based dataset. The ensemble combines the results of the individual classifiers by averaging their prediction scores, with the aim of improving on the predictions of any single classifier.
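The averaging step described above can be illustrated with a minimal base R sketch. The scores below are made up, the 0.5 cutoff is an assumption of this sketch, and the code is not taken from the package; it only shows what averaging per-classifier scores looks like.

```r
# Toy positive-class probabilities from three classifiers for four
# candidate interactions (values are made up for illustration).
scores <- data.frame(
  avNNet    = c(0.90, 0.20, 0.60, 0.40),
  svmRadial = c(0.80, 0.10, 0.70, 0.30),
  ranger    = c(0.70, 0.30, 0.50, 0.50)
)

# Ensemble score: unweighted mean across classifiers, one value per row.
ensemble_score <- rowMeans(scores)

# Call an interaction positive when the averaged score exceeds 0.5
# (the cutoff is an assumption of this sketch).
ensemble_label <- ifelse(ensemble_score > 0.5, "Positive", "Negative")
```

An unweighted mean treats every classifier equally, so a single over-confident model cannot dominate the combined score.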

Usage

pred_ensembel(
  features,
  gold_standard,
  classifier = c("avNNet", "svmRadial", "ranger"),
  resampling.method = "cv",
  ncross = 2,
  repeats = 2,
  verboseIter = TRUE,
  plots = TRUE,
  filename = "plots.pdf"
)

Arguments

`features`	A data frame with host-pathogen protein-protein interactions (HP-PPIs) in the first column, and features to be passed to the classifier in the remaining columns.
`gold_standard`	A data frame with gold_standard HP-PPIs and class label indicating if such PPIs are positive or negative.
`classifier`	The type of classifier(s) to use. See `caret` for the available classifiers.
`resampling.method`	The resampling method: 'boot', 'boot632', 'optimism_boot', 'boot_all', 'cv', 'repeatedcv', 'LOOCV', or 'LGOCV'; defaults to 'cv'. See `trainControl` for more details.
`ncross`	Number of partitions for cross-validation; defaults to 2, as shown in Usage. See `trainControl` for more details.
`repeats`	Number of repeats, used for repeated k-fold cross-validation only; defaults to 2, as shown in Usage. See `rfeControl` for more details.
`verboseIter`	Logical value indicating whether to report the status of the training process; defaults to TRUE, as shown in Usage.
`plots`	Logical value indicating whether to plot the performance of the ensemble learning algorithm against the individual classifiers; defaults to TRUE. If set to TRUE, the plots are saved in the current working directory. The plots are: pr_plot - precision-recall plot of the ensemble classifier vs. the selected individual classifiers; roc_plot - ROC plot of the ensemble classifier vs. the selected individual classifiers; points_plot - plot of accuracy, F1-score, positive predictive value (PPV), sensitivity (SE), and Matthews correlation coefficient (MCC) of the ensemble classifier vs. the selected individual classifiers.
`filename`	A character string giving the name of the output PDF file.
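As a rough illustration of the layout expected for `features` and `gold_standard`, the sketch below builds both from one toy table, following the same split used in the Examples section. The column names (`PPI`, `class`, `f1`, `f2`), the identifier format, and all values are hypothetical and are not taken from the package data.

```r
# Hypothetical combined table: pair identifier, class label, two features.
# Column names and values are illustrative only.
toy <- data.frame(
  PPI   = c("P1~H1", "P1~H2", "P2~H1", "P2~H3"),
  class = c("Positive", "Negative", NA, "Positive"),
  f1    = c(0.12, 0.55, 0.31, 0.87),
  f2    = c(1.4, 0.2, 0.9, 2.1),
  stringsAsFactors = FALSE
)

# features: identifier in the first column, feature columns afterwards.
features <- toy[, -2]

# gold_standard: identifier plus class label, keeping labelled pairs only.
gold_standard <- na.omit(toy[, c(1, 2)])
```

Pairs without a class label stay in `features`, so they can still be scored, but are dropped from `gold_standard`.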

Details

pred_ensembel

Value

Ensemble_training_output

prediction score - Prediction scores for whole dataset from each individual classifier.
Best - Selected hyperparameters.
Parameter range - Tested hyperparameters.
prediction_score_test - Score probabilities for test data from each individual classifier.
class_label - Class probabilities for test data from each individual classifier.

classifier_performance

cm - A confusion matrix.
ACC - Accuracy.
SE - Sensitivity.
SP - Specificity.
PPV - Positive Predictive Value.
F1 - F1-score.
MCC - Matthews correlation coefficient.
Roc_Object - A list of elements. See `roc` for more details.
PR_Object - A list of elements. See `pr.curve` for more details.
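For reference, the scalar metrics listed above follow the standard definitions based on confusion-matrix counts. The base R sketch below uses made-up counts and is not the package's implementation; it only shows how each value relates to true/false positives and negatives.

```r
# Hypothetical confusion-matrix counts (made-up values).
TP <- 40; FN <- 10; FP <- 5; TN <- 45

ACC <- (TP + TN) / (TP + TN + FP + FN)   # accuracy
SE  <- TP / (TP + FN)                    # sensitivity (recall)
SP  <- TN / (TN + FP)                    # specificity
PPV <- TP / (TP + FP)                    # positive predictive value
F1  <- 2 * PPV * SE / (PPV + SE)         # harmonic mean of PPV and SE

# Matthews correlation coefficient: ranges from -1 to 1, 0 is chance level.
MCC <- (TP * TN - FP * FN) /
  sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
```

MCC uses all four cells of the confusion matrix, which makes it more informative than accuracy when positive and negative pairs are imbalanced.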

predicted_interactions - The input data frame of pairwise interactions, including classifier scores averaged across all models.

Author(s)

Matineh Rahmatbakhsh, matinerb.94@gmail.com

Examples

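library(HPiP)
# Load the example dataset
data('example_data')
# Features: drop the class-label column (column 2)
features <- example_data[, -2]
# Gold standard: interaction identifiers and class labels, labelled pairs only
gd <- example_data[, c(1, 2)]
gd <- na.omit(gd)
# Train the ensemble and score the interactions
ppi <- pred_ensembel(
  features, gd,
  classifier = c("avNNet", "svmRadial", "ranger"),
  resampling.method = "cv", ncross = 2,
  verboseIter = FALSE, plots = FALSE,
  filename = "plots.pdf"
)
# Extract predicted interactions
pred_interaction <- ppi[["predicted_interactions"]]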

mrbakhsh/HPiP documentation built on March 28, 2023, 4:35 p.m.