isolate_central_cluster_elements: Determine most representative element in each cluster based...

View source: R/find_central_clone.R

isolate_central_cluster_elements    R Documentation

Determine most representative element in each cluster based on data provided

Description

Takes PCA scores or a correlation matrix (as a data.frame) and uses them to determine the most representative element of each cluster in a list of clusters. Clusters with 2 elements are resolved using the ranks supplied in element_ranks or, failing that, by random selection, while clusters larger than 3 use the centrality_method.
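The two-element case described above can be sketched in a few lines of base R. This is only an illustration, not the package's code: the helper name pick_from_pair is hypothetical, and the sketch assumes that a lower rank value means a more significant element, which the documentation does not state.

```r
# Hypothetical helper: choose one element from a cluster of two.
# ASSUMPTION: lower rank = better; the real function may order ranks differently.
pick_from_pair <- function(pair, element_ranks = NA) {
  # No ranks supplied: fall back to random selection.
  if (all(is.na(element_ranks))) {
    return(sample(pair, 1))
  }
  # Ranks supplied: keep the element with the smallest rank value.
  names(which.min(element_ranks[pair]))
}

ranked_choice <- pick_from_pair(c("x", "y"), c(x = 2L, y = 1L))
random_choice <- pick_from_pair(c("x", "y"))
```

With ranks supplied the choice is deterministic; without them the result varies between runs unless a seed is set.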

Usage

isolate_central_cluster_elements(
  elements_data,
  cluster_members,
  element_ranks = NA,
  max_depth = NA,
  centrality_method = "max-depth"
)
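As a rough illustration of the inputs this signature expects, the sketch below builds them from toy data with base R. The data, element names, and cluster names are made up for the example, and the call itself is left commented out because it needs the binfotron package to be installed.

```r
# Hypothetical toy data: 6 elements (rows) measured on 5 samples (columns).
set.seed(1)
toy <- matrix(rnorm(30), nrow = 6,
              dimnames = list(paste0("elem", 1:6), paste0("s", 1:5)))

# PCA scores (elements x principal components) from base R's prcomp().
elements_data <- as.data.frame(prcomp(toy)$x)

# A named list of clusters, each holding the names of its elements.
cluster_members <- list(
  cluster_a = c("elem1", "elem2", "elem3", "elem4"),
  cluster_b = c("elem5", "elem6")
)

# A named integer vector of ranks, relevant to the two-element cluster.
element_ranks <- c(elem1 = 3L, elem2 = 1L, elem3 = 2L,
                   elem4 = 6L, elem5 = 4L, elem6 = 5L)

# Requires the binfotron package, so the call is shown but not run here.
# central <- isolate_central_cluster_elements(
#   elements_data,
#   cluster_members,
#   element_ranks = element_ranks,
#   max_depth = 3,
#   centrality_method = "max-depth"
# )
```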

Arguments

elements_data

A data.frame or matrix containing either elements x principal components (the scores/x from a PCA) OR similarity scores of elements x elements

cluster_members

A named list of clusters with their elements

element_ranks

A named integer vector indicating the initial element rankings to be used for selection of best elements in clusters of length 2

max_depth

An integer indicating the maximum number of principal components to use in determining best elements. Only used for PCA-type centrality_methods

centrality_method

A character vector with strings specifying the method for selecting the most central feature of a cluster:

  • two-in-a-row - using PCA, selects the feature that comes out on top two times in a row as the sum of squares is calculated with more and more PCs

  • max-depth - using PCA, selects the feature with the maximum sum of squares calculated across the number of PCs requested as the "max_depth"

  • first-most-frequent - using PCA, determines the feature with the max sum of squares for 2 PCs, 3 PCs, 4 PCs ... up to N PCs and then picks the feature that showed up the most times across all those calculations

  • mhorn - feature most similar to others (i.e., largest summed similarity to all other elements) wins

  • spearman - feature most similar to others (i.e., largest summed similarity to all other elements) wins

  • pearson - feature most similar to others (i.e., largest summed similarity to all other elements) wins

  • by-rank - defaults to the most significant according to rank_df
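The selection rules listed above can be sketched in base R as follows. This is an interpretation of the descriptions, not the package's implementation: all helper names (ss_by_depth, pick_max_depth, pick_most_frequent, pick_most_similar) and the toy values are hypothetical, the PCA helpers assume max_depth is at least 2, and the similarity helper assumes a symmetric elements x elements matrix whose diagonal should be ignored.

```r
# Sum of squared PCA scores over the first `depth` PCs, per cluster member.
ss_by_depth <- function(scores, members, depth) {
  rowSums(as.matrix(scores[members, seq_len(depth), drop = FALSE])^2)
}

# "max-depth" style: member with the largest sum of squares at max_depth.
pick_max_depth <- function(scores, members, max_depth) {
  names(which.max(ss_by_depth(scores, members, max_depth)))
}

# "first-most-frequent" style: record the winner at depths 2..max_depth,
# then keep the most frequent one (ties go to the earliest winner seen).
pick_most_frequent <- function(scores, members, max_depth) {
  winners <- vapply(
    2:max_depth,
    function(d) names(which.max(ss_by_depth(scores, members, d))),
    character(1)
  )
  counts <- table(factor(winners, levels = unique(winners)))
  names(which.max(counts))
}

# Similarity style (mhorn / spearman / pearson): member with the largest
# summed similarity to the other members of its cluster.
pick_most_similar <- function(sim, members) {
  sub <- as.matrix(sim[members, members, drop = FALSE])
  diag(sub) <- 0  # ASSUMPTION: self-similarity is excluded from the sum
  names(which.max(rowSums(sub)))
}

# Made-up PCA scores: 3 elements x 4 PCs.
scores <- matrix(c(3, 0, 0, 0,
                   1, 1, 4, 0,
                   2, 2, 0, 0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("a", "b", "c"), paste0("PC", 1:4)))

# Made-up symmetric similarity matrix for the same elements.
sim <- matrix(c(1.0, 0.9, 0.2,
                0.9, 1.0, 0.8,
                0.2, 0.8, 1.0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

members <- c("a", "b", "c")
depth2_pick <- pick_max_depth(scores, members, 2)
depth3_pick <- pick_max_depth(scores, members, 3)
frequent_pick <- pick_most_frequent(scores, members, 4)
similar_pick <- pick_most_similar(sim, members)
```

In this toy data the winner changes with depth, which is why the depth-sensitive methods can disagree with one another on the same cluster.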

Value

Returns the original list of cluster_members with the most representative element for each named cluster.


Benjamin-Vincent-Lab/binfotron documentation built on Oct. 1, 2024, 8:33 p.m.