View source: R/functions.R

cluster_tsne_density    R Documentation

Perform Density-Based Clustering on t-SNE Results Using DBSCAN

Description

This function applies Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to t-SNE results to identify clusters and detect noise points. It derives the MinPts and eps parameters from the t-SNE results and the supplied settings. The function also computes silhouette scores to evaluate cluster quality and returns cluster centroids along with cluster sizes.

Usage

cluster_tsne_density(info.norm, tsne.norm, settings)

Arguments

info.norm

A data frame containing the normalized data on which the t-SNE analysis was carried out.

tsne.norm

The t-SNE results object, including the 2D t-SNE coordinates (Y matrix).

settings

A list of settings for the DBSCAN clustering. These settings include:

  • minPtsAdjustmentFactor: A factor to adjust the minimum number of points required to form a cluster (MinPts).

  • epsQuantile: The quantile of the k-nearest-neighbor distances used to determine the eps value for DBSCAN.
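
A settings list with the two documented entries might look like the sketch below. The numeric values are illustrative assumptions, not package defaults, and the call itself is left commented out because info.norm and tsne.norm must come from earlier steps of the analysis.

```r
# Illustrative values only; the package's own defaults are not stated here.
settings <- list(
  minPtsAdjustmentFactor = 1.5,  # scales the MinPts value derived from the data
  epsQuantile            = 0.9   # quantile of kNN distances used as eps
)

# With info.norm and tsne.norm available from earlier steps:
# result <- cluster_tsne_density(info.norm, tsne.norm, settings)
# result$avg_silhouette_score
```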

Details

The function first calculates MinPts from the dimensionality of the t-SNE data and adjusts it by the supplied minPtsAdjustmentFactor. It then derives eps from the k-nearest-neighbor distances, taking the quantile specified by epsQuantile. DBSCAN is applied to the t-SNE coordinates, and any NA values in the cluster assignments are replaced with a predefined outlier cluster ID (100). Finally, the function calculates cluster centroids, cluster sizes, and silhouette scores to evaluate cluster separation and quality.
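
The parameter derivation described above can be sketched roughly as follows. This is not the package source: the base MinPts of twice the number of dimensions, the choice of k equal to MinPts for the neighbor distances, and the simulated coordinates standing in for the Y matrix are all assumptions made for illustration. The DBSCAN step runs only if the dbscan package is installed.

```r
# Illustrative sketch only; not the immunaut implementation.
set.seed(42)

# Stand-in for the 2D t-SNE coordinates: two separated blobs of 50 points each.
Y <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 5, sd = 0.3), ncol = 2)
)

settings <- list(minPtsAdjustmentFactor = 1.5, epsQuantile = 0.9)

# ASSUMPTION: start from 2 * dimensions (a common heuristic), then scale.
min_pts <- round(2 * ncol(Y) * settings$minPtsAdjustmentFactor)

# Distance from each point to its min_pts-th nearest neighbor, in base R.
# ASSUMPTION: k is taken equal to min_pts.
d <- as.matrix(dist(Y))
knn_dist <- apply(d, 1, function(row) sort(row)[min_pts + 1])  # +1 skips self

# eps is the requested quantile of those distances.
eps <- unname(quantile(knn_dist, probs = settings$epsQuantile))

if (requireNamespace("dbscan", quietly = TRUE)) {
  fit <- dbscan::dbscan(Y, eps = eps, minPts = min_pts)
  clusters <- fit$cluster
  # dbscan::dbscan marks noise as 0; here both 0 and NA are remapped to the
  # outlier ID 100 mentioned in the text (the remapping of 0 is an assumption).
  clusters[is.na(clusters) | clusters == 0] <- 100
  print(table(clusters))
}
```

A higher epsQuantile yields a larger eps and therefore fewer points flagged as noise, while a larger minPtsAdjustmentFactor demands denser neighborhoods before a cluster is formed.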

Value

A list containing:

  • info.norm: The input data frame with an additional pandora_cluster column for cluster assignments.

  • cluster_data: A data frame with cluster centroids and labeled clusters.

  • avg_silhouette_score: The average silhouette score, providing a measure of clustering quality.
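
As a rough guide to what cluster_data and avg_silhouette_score contain, the sketch below computes centroids, sizes, and a mean silhouette width from toy coordinates and labels. The column names x, y, cluster, and size, and the toy data, are hypothetical; the actual columns returned by the function may differ. The silhouette step uses cluster::silhouette and is skipped if the cluster package is unavailable.

```r
# Illustrative sketch only; column names and data are hypothetical.
set.seed(1)

# Stand-ins: 2D coordinates plus cluster labels (100 marks an outlier).
Y <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.2), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.2), ncol = 2),
  matrix(c(10, 10), ncol = 2)
)
clusters <- c(rep(1, 20), rep(2, 20), 100)

df <- data.frame(x = Y[, 1], y = Y[, 2], cluster = clusters)

# Centroid of each cluster: the mean x and y of its members.
cluster_data <- aggregate(cbind(x, y) ~ cluster, data = df, FUN = mean)

# Size of each cluster, matched to the centroid rows by label.
counts <- table(df$cluster)
cluster_data$size <- as.vector(counts[as.character(cluster_data$cluster)])

# Average silhouette width over all points, if 'cluster' is installed.
if (requireNamespace("cluster", quietly = TRUE)) {
  sil <- cluster::silhouette(as.integer(clusters), dist(Y))
  avg_silhouette_score <- mean(sil[, "sil_width"])
}

print(cluster_data)
```

Silhouette widths range from -1 to 1, with values near 1 indicating compact, well-separated clusters.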


immunaut documentation built on April 12, 2025, 1:22 a.m.