cluster_candidate_motifs: Cluster Candidate Motifs

View source: R/cluster_candidate_motifs.R

cluster_candidate_motifs    R Documentation

Cluster Candidate Motifs

Description

This function clusters candidate motifs based on their pairwise distances and computes a group-specific radius for each motif cluster. It uses K-nearest neighbors (KNN) to determine a global radius and evaluates overlaps among motifs. Parallel computation is supported.

Usage

cluster_candidate_motifs(
  filter_candidate_motifs_results,
  motif_overlap = 0.6,
  k_knn = 3,
  votes_knn_Rall = 0.5,
  votes_knn_Rm = 0.5,
  worker_number = NULL
)

Arguments

filter_candidate_motifs_results

A list containing the results of the candidate-motif filtering step. It includes components such as 'Y0', 'Y1', 'V0_clean', 'V1_clean', and 'D_clean', among others, which the clustering process requires.

motif_overlap

A numeric value representing the minimum proportion of overlap required between motifs to be considered similar (default is 0.6).

k_knn

An integer specifying the number of nearest neighbors to consider when determining the global radius (default is 3).

votes_knn_Rall

A numeric value indicating the threshold for KNN voting when determining the global radius (default is 0.5).

votes_knn_Rm

A numeric value indicating the threshold for KNN voting when determining group-specific radii (default is 0.5).

worker_number

An optional integer specifying the number of parallel workers to use. If NULL, it defaults to the number of available cores minus one.
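The default described above can be sketched as follows. This is a minimal illustration and not the package's own code: the helper name resolve_worker_number is hypothetical, and the fallback to a single worker when the core count cannot be detected is an assumption.

```r
# Hypothetical helper (not part of funMoDisco): resolve the number of workers.
resolve_worker_number <- function(worker_number = NULL) {
  if (is.null(worker_number)) {
    cores <- parallel::detectCores()
    # detectCores() may return NA on some platforms; assume one core then.
    if (is.na(cores)) cores <- 1L
    # Available cores minus one, but never fewer than one worker.
    worker_number <- max(1L, cores - 1L)
  }
  as.integer(worker_number)
}

resolve_worker_number(4)  # an explicit value is used as given
resolve_worker_number()   # NULL falls back to (cores - 1), at least 1
```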

Details

This function performs the following steps:
1. Sets up parallel jobs based on the specified 'worker_number'.
2. Prepares input data based on the type of distance measure used.
3. Computes distances between motifs.
4. Determines a global radius ('R_all') using KNN classification.
5. Clusters motifs and determines group-specific radii ('R_m') for each cluster.
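To make the clustering step more concrete, the sketch below groups motifs from a distance matrix with base R. It is only an illustration under stated assumptions: the distance values are made up, the global radius is fixed by hand instead of being obtained by KNN voting, and both the average linkage and the use of 'R_all' as the cut height are assumptions, not confirmed details of the package.

```r
# Toy symmetric distance matrix for 4 hypothetical motifs (made-up values).
VV_D <- matrix(c(0.0, 0.1, 0.9, 1.0,
                 0.1, 0.0, 0.8, 0.9,
                 0.9, 0.8, 0.0, 0.2,
                 1.0, 0.9, 0.2, 0.0),
               nrow = 4, byrow = TRUE)

# Assumed global radius; the function derives it via KNN, which is not shown here.
R_all <- 0.5

# Hierarchical clustering on the distances (linkage method is an assumption).
hclust_res <- stats::hclust(stats::as.dist(VV_D), method = "average")

# Cut the dendrogram at the radius: motifs merged below R_all share a group.
groups <- stats::cutree(hclust_res, h = R_all)
print(groups)
```

In this toy case motifs 1 and 2 fall into one group and motifs 3 and 4 into another, because each pair merges well below the assumed radius while the two pairs merge only at a height of 0.9.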

Value

A list containing:
- 'VV_D': Matrix of distances between motifs.
- 'VV_S': Matrix of shifts between motifs.
- 'k_knn': The value of K used in KNN.
- 'votes_knn_Rall': Voting threshold for the global radius.
- 'R_all': The global radius determined from the clustering process.
- 'hclust_res': Result of hierarchical clustering (if applicable).
- 'votes_knn_Rm': Voting threshold for group-specific radius.
- 'R_m': Vector of group-specific radii for each cluster.
- All components from the input 'filter_candidate_motifs_results'.
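A call and a quick look at the returned components might look like the sketch below. It assumes that the funMoDisco package is installed and that an object named filter_results (a hypothetical name) already holds the output of the filtering step; the code does nothing if either is missing. The function is looked up in the package namespace because this page does not state whether it is exported.

```r
# Run only if the package and a filtering result (hypothetical name) are present.
can_run <- requireNamespace("funMoDisco", quietly = TRUE) &&
  exists("filter_results")

if (can_run) {
  cluster_fun <- utils::getFromNamespace("cluster_candidate_motifs", "funMoDisco")
  res <- cluster_fun(
    filter_candidate_motifs_results = filter_results,
    motif_overlap = 0.6,
    k_knn = 3,
    votes_knn_Rall = 0.5,
    votes_knn_Rm = 0.5,
    worker_number = 2
  )
  str(res$R_all)     # global radius
  str(res$R_m)       # group-specific radii, one per cluster
  print(dim(res$VV_D))  # dimensions of the motif distance matrix
}
```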


funMoDisco documentation built on April 16, 2025, 1:10 a.m.