ensemble_model: Predict Interactions via Ensemble Learning Method
In MACP: Macromolecular Assemblies from Co-Elution Profile (MACP)

ensemble_model

R Documentation

Predict Interactions via Ensemble Learning Method

Description

This function trains a single classifier or an ensemble of classifiers to predict protein-protein interactions from CF-MS data. When several classifiers are used, the prediction scores produced by the individual classifiers are averaged to improve prediction.
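
The averaging step described above can be sketched in base R. This is only an illustration, not the package's internal code: the scores, the pair identifiers, and the 0.5 decision threshold are all hypothetical, and the mean is assumed to be unweighted.

```r
# Hypothetical positive-class probabilities from three classifiers
# for four candidate protein pairs (values invented for illustration).
scores <- data.frame(
  glm       = c(0.90, 0.20, 0.60, 0.40),
  svmRadial = c(0.80, 0.10, 0.70, 0.30),
  ranger    = c(0.70, 0.30, 0.50, 0.50),
  row.names = c("P1~P2", "P1~P3", "P2~P4", "P3~P4")
)

# Ensemble score: unweighted mean across the classifiers for each pair.
ensemble_score <- rowMeans(scores)

# Assumed decision rule: call a pair positive when the mean score is >= 0.5.
predicted <- ifelse(ensemble_score >= 0.5, "Positive", "Negative")

print(data.frame(scores, ensemble = ensemble_score, predicted = predicted))
```

Averaging tends to dampen the errors of any single classifier, which is the motivation given in the description.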

Usage

ensemble_model(
  features,
  gd,
  classifier = c("glm", "svmRadial", "ranger"),
  cv_fold = 2,
  verboseIter = TRUE,
  plots = FALSE,
  filename = file.path(tempdir(), "plots.pdf")
)

Arguments

`features`	A data frame with protein-protein associations in the first column, and features to be passed to the classifier in the remaining columns.
`gd`	A gold reference set including true associations with class labels indicating if such PPIs are positive or negative.
`classifier`	The type of classifier to use. See `caret` for the available classifiers.
`cv_fold`	Number of partitions for cross-validation; defaults to 2.
`verboseIter`	Logical value, indicating whether to print the status of the training process; defaults to TRUE.
`plots`	Logical value, indicating whether to plot the performance of the ensemble learning algorithm compared with the individual classifiers; defaults to FALSE. If set to TRUE, the plots are saved to the PDF file given by `filename`. The plots are: pr_plot - precision-recall plot of the ensemble classifier vs. the selected individual classifiers; roc_plot - ROC plot of the ensemble classifier vs. the selected individual classifiers; points_plot - plot of accuracy, F1-score, positive predictive value (PPV), sensitivity (SE), and Matthews correlation coefficient (MCC) of the ensemble classifier vs. the selected individual classifiers.
`filename`	Character string, indicating the location and name of the output PDF file for the performance plots. Defaults to file.path(tempdir(), "plots.pdf").
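
To make the `features` layout and the meaning of `cv_fold` concrete, here is a minimal base R sketch. It does not call `ensemble_model` and does not claim to reproduce how the package partitions data internally; the column names, pair identifiers, and feature values are hypothetical.

```r
set.seed(42)  # make the random fold assignment reproducible

# Toy `features` table: pair identifier in the first column,
# numeric features in the remaining columns (all values invented).
features <- data.frame(
  PPI     = paste0("P", 1:6, "~P", 7:12),
  pearson = c(0.91, 0.12, 0.75, 0.33, 0.58, 0.05),
  jaccard = c(0.80, 0.10, 0.66, 0.25, 0.50, 0.00)
)

cv_fold <- 2

# Randomly assign each row to one of `cv_fold` folds of near-equal size.
fold_id <- sample(rep(seq_len(cv_fold), length.out = nrow(features)))

# Each fold serves once as the held-out set; the rest is used for training.
for (k in seq_len(cv_fold)) {
  test_pairs  <- features$PPI[fold_id == k]
  train_pairs <- features$PPI[fold_id != k]
  cat("Fold", k, ": train on", length(train_pairs),
      "pairs, test on", length(test_pairs), "pairs\n")
}
```

With `cv_fold = 2`, each model is trained on half of the labelled pairs and evaluated on the other half; larger values leave more data for training in each round at the cost of more training runs.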

Details

ensemble_model

Value

Ensemble_training_output

prediction score - Prediction scores for the whole dataset from each individual classifier.
Best - Selected hyperparameters.
Parameter range - Tested hyperparameters.
prediction_score_test - Score probabilities for the test data from each individual classifier.
class_label - Class probabilities for the test data from each individual classifier.

classifier_performance