pick_best_cluster_overall: Pick the Best Clustering Result Based on Multiple Metrics
In immunaut: Machine Learning Immunogenicity and Vaccine Response Analysis

Description

This function evaluates multiple clustering results using four metrics: modularity, silhouette score, Davies-Bouldin Index (DBI), and Calinski-Harabasz Index (CH). It normalizes the scores for each metric, calculates a combined score for each clustering result, and selects the result with the highest combined score.

Usage

pick_best_cluster_overall(tsne_clust, tsne_calc)

Arguments

`tsne_clust`	A list of clustering results. Each result should contain the cluster assignments for the dataset, along with metrics such as modularity and silhouette score.
`tsne_calc`	A list containing the t-SNE results, including the t-SNE coordinates of the dataset used for clustering.

Details

The function computes four different metrics for each clustering result:

Modularity: A measure of the quality of the division of the network into clusters, with higher values being better.
Silhouette score: A measure of how similar data points are to their own cluster compared to other clusters. It ranges from -1 to 1, with higher values being better.
Davies-Bouldin Index (DBI): A ratio of within-cluster distances to between-cluster distances, with lower values being better.
Calinski-Harabasz Index (CH): The ratio of between-cluster dispersion to within-cluster dispersion, with higher values being better.
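
To make the three coordinate-based metrics above concrete, the following base R sketch computes them from a coordinate matrix and a vector of cluster labels. The helper names (mean_silhouette, davies_bouldin, calinski_harabasz) and the toy data are hypothetical and are not part of immunaut; the package may compute these values differently. Modularity is omitted because it depends on the graph used for community detection, which is not described here. The sketch assumes at least two clusters.

```r
# Illustrative helpers with hypothetical names (not immunaut functions).
# coords: numeric matrix, one row per point; labels: cluster id per row.

mean_silhouette <- function(coords, labels) {
  d <- as.matrix(dist(coords))                # pairwise Euclidean distances
  ids <- unique(labels)
  s <- vapply(seq_len(nrow(coords)), function(i) {
    own <- labels == labels[i]
    if (sum(own) < 2) return(0)               # common convention for singletons
    a <- sum(d[i, own]) / (sum(own) - 1)      # mean distance within own cluster
    b <- min(vapply(setdiff(ids, labels[i]),  # mean distance to nearest other cluster
                    function(k) mean(d[i, labels == k]), numeric(1)))
    (b - a) / max(a, b)
  }, numeric(1))
  mean(s)
}

davies_bouldin <- function(coords, labels) {
  ids <- sort(unique(labels))
  centers <- do.call(rbind, lapply(ids, function(k) {
    colMeans(coords[labels == k, , drop = FALSE])
  }))
  # Scatter: mean distance from each cluster's points to its centroid
  scatter <- vapply(seq_along(ids), function(j) {
    pts <- coords[labels == ids[j], , drop = FALSE]
    mean(sqrt(rowSums(sweep(pts, 2, centers[j, ])^2)))
  }, numeric(1))
  sep <- as.matrix(dist(centers))             # distances between centroids
  r <- outer(scatter, scatter, "+") / sep
  diag(r) <- -Inf                             # ignore a cluster paired with itself
  mean(apply(r, 1, max))                      # average worst-case ratio per cluster
}

calinski_harabasz <- function(coords, labels) {
  ids <- unique(labels)
  n <- nrow(coords)
  k <- length(ids)
  overall <- colMeans(coords)
  between <- 0
  within <- 0
  for (id in ids) {
    pts <- coords[labels == id, , drop = FALSE]
    center <- colMeans(pts)
    between <- between + nrow(pts) * sum((center - overall)^2)
    within <- within + sum(sweep(pts, 2, center)^2)
  }
  (between / (k - 1)) / (within / (n - k))
}

# Toy data: two compact, well-separated groups of four points each.
coords <- rbind(c(0, 0), c(0, 1), c(1, 0), c(1, 1),
                c(10, 10), c(10, 11), c(11, 10), c(11, 11))
good <- c(1, 1, 1, 1, 2, 2, 2, 2)   # labels that match the two groups
bad  <- c(1, 2, 1, 2, 1, 2, 1, 2)   # labels that cut across the groups

c(silhouette = mean_silhouette(coords, good),
  dbi        = davies_bouldin(coords, good),
  ch         = calinski_harabasz(coords, good))
```

On this toy data, the labels that match the two groups score a higher silhouette, a lower DBI, and a higher CH than the labels that cut across them, which is the direction each metric is described as rewarding.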

The scores for each metric are normalized between 0 and 1, and an overall score is calculated for each clustering result. The clustering result with the highest overall score is selected as the best.
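
The normalization and selection step can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the metric values are made-up numbers, min-max scaling is assumed as the normalization, DBI is inverted because lower values are better, and the four normalized metrics are assumed to be summed with equal weights. The helper min_max and the variable names are hypothetical.

```r
# Made-up metric values for three candidate clustering results (illustration only).
metrics <- data.frame(
  modularity = c(0.42, 0.55, 0.48),
  silhouette = c(0.31, 0.28, 0.40),
  dbi        = c(1.20, 0.95, 0.80),
  ch         = c(150, 210, 190)
)

# Min-max scaling to [0, 1]; returns zeros when all values are equal
# so that a constant metric does not cause division by zero.
min_max <- function(x) {
  rng <- range(x)
  if (rng[1] == rng[2]) return(rep(0, length(x)))
  (x - rng[1]) / (rng[2] - rng[1])
}

normalized <- as.data.frame(lapply(metrics, min_max))

# Lower DBI is better, so flip it so that 1 is best for every column.
normalized$dbi <- 1 - normalized$dbi

# Assumption: equal weights, combined by a simple sum.
combined <- rowSums(normalized)
best <- which.max(combined)   # index of the clustering result to keep

combined
best
```

With these numbers the third candidate is selected: it has the best silhouette score and DBI, and those outweigh the second candidate's lead in modularity and CH under equal weighting. A different weighting scheme could change the choice.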

Value

The clustering result with the highest combined score based on modularity, silhouette score, Davies-Bouldin Index (DBI), and Calinski-Harabasz Index (CH).

immunaut documentation built on April 12, 2025, 1:22 a.m.