find_central_elements_by_cluster: Encapsulation of steps to create clusters and determine most...
In Benjamin-Vincent-Lab/binfotron: Binfotron Bioinformatics Analysis Tools Suite

find_central_elements_by_cluster

R Documentation

Encapsulation of steps to create clusters and determine most central elements of each cluster

Description

Generates clusters using the kmeans method and determines the most representative element of each cluster using PCA (the most central feature in PCA space), the mhorn similarity index (the most similar feature), or Pearson/Spearman correlation (the most correlated feature).
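
The general idea can be sketched in base R. This is an illustration only, not binfotron's implementation: the toy data, the fixed `n_clusters`, and the choice to score features by the sum of squared coordinates over the first `max_depth` principal components (one reading of the "max-depth" method described under Arguments) are all assumptions.

```r
# Illustrative sketch only; not the package's own code.
set.seed(42)  # plays the role of `my_seed`

# Toy feature_df: 12 named features (rows) by 6 samples (columns).
feature_df <- as.data.frame(matrix(rnorm(12 * 6), nrow = 12))
rownames(feature_df) <- paste0("feature_", seq_len(nrow(feature_df)))

# Cluster the features (rows) with k-means.
n_clusters <- 3
km <- stats::kmeans(feature_df, centers = n_clusters, nstart = 10)

# PCA on the same rows; pca$x holds each feature's coordinates in PC space.
pca <- stats::prcomp(feature_df, center = TRUE, scale. = FALSE)

# Sum of squared coordinates over the first `max_depth` components.
max_depth <- 2
pc_sum_sq <- rowSums(pca$x[, seq_len(max_depth), drop = FALSE]^2)
names(pc_sum_sq) <- rownames(feature_df)

# Within each cluster, keep the feature with the largest sum of squares.
members_by_cluster <- split(rownames(feature_df), km$cluster)
central_elements <- vapply(
  members_by_cluster,
  function(members) members[which.max(pc_sum_sq[members])],
  character(1)
)
print(central_elements)
```

The result is a character vector named by cluster number, with one feature name per cluster.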

Usage

find_central_elements_by_cluster(
  feature_df,
  anno_mark_font_size = 8,
  annotate_central_elements = T,
  annotate_central_elements_n_clusters = 40,
  central_element_circle_radius = 1/10,
  centrality_methods = "by-rank",
  cluster_id_width = NA,
  cluster_plot_sizes = NA,
  dist_method = "euclidean",
  file_prefix = "central_elements",
  grid_size = 100,
  grid_units = "mm",
  hclust_method = "complete",
  max_clusters = 40,
  max_depth = NA,
  min_clusters = 1L,
  my_threads = 1,
  my_seed = NA,
  output_central_elements = T,
  output_cumulative_variance = F,
  output_dir = ".",
  output_gmt = T,
  output_heatmap = F,
  output_pc1_vs_pc2 = F,
  output_ranked_central_elements = T,
  rank_clm = "Rank",
  rank_df = NULL,
  row_annotation_lwd = 0.25,
  row_annotation_width = 15,
  row_annotation_width_units = "mm",
  row_dend_lwd = 0.25,
  row_dend_width = 15,
  row_dend_width_units = "mm",
  hm_raster_quality = 5,
  show_hm_row_names = T
)
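
A call might look like the following untested sketch. The toy data, the `Feature` column name, and the argument values are assumptions chosen for illustration; only argument names that appear in the signature above are used. The call is guarded so the snippet does nothing if binfotron is not installed.

```r
# Toy inputs; all values are made up for illustration.
set.seed(1)
feature_df <- as.data.frame(matrix(rnorm(20 * 5), nrow = 20))
rownames(feature_df) <- paste0("feature_", seq_len(nrow(feature_df)))

# rank_df: features in column one, numeric ranking in the `rank_clm` column.
# The column name "Feature" is hypothetical.
rank_df <- data.frame(
  Feature = rownames(feature_df),
  Rank = seq_len(nrow(feature_df))
)

# Only attempt the call when the package is available.
if (requireNamespace("binfotron", quietly = TRUE)) {
  res <- binfotron::find_central_elements_by_cluster(
    feature_df,
    centrality_methods = "by-rank",
    rank_df = rank_df,
    rank_clm = "Rank",
    min_clusters = 1L,
    max_clusters = 5,
    my_seed = 123,
    output_dir = tempdir(),          # keep any output out of the working dir
    output_central_elements = FALSE,
    output_ranked_central_elements = FALSE,
    output_gmt = FALSE
  )
  str(res, max.level = 1)
}
```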

Arguments

`feature_df`	data.frame on which to perform PCA, mhorn or spearman analysis and kmeans clustering. Importantly: Rows must be named after features.
`centrality_methods`	A character vector of strings specifying the method for selecting the most central feature of a cluster:
  two-in-a-row - using PCA, selects the feature that shows up two times in a row as the sum of squares is recalculated with more and more PCs
  max-depth - using PCA, selects the feature with the maximum sum of squares calculated across the number of PCs requested as `max_depth`
  first-most-frequent - using PCA, determines the max sum of squares for 2 PCs, 3 PCs, 4 PCs ... up to N PCs, then picks the feature that showed up the most times across all those calculations
  mhorn - the feature most similar to the others (i.e., with the largest summed similarity to all other elements) wins
  spearman - the feature most correlated with the others (i.e., with the largest summed correlation to all other elements) wins
  pearson - the feature most correlated with the others (i.e., with the largest summed correlation to all other elements) wins
  by-rank - defaults to the most significant feature according to `rank_df`
`cluster_id_width`	An integer indicating how many characters to use for cluster group and cluster number id's. Defaults to one more than the number of characters in `max_clusters`.
`cluster_plot_sizes`	Integer vector indicating which cluster groups to save as plots with clusters circled and central elements labeled. Only used if `centrality_methods` is one of the pca options.
`dist_method`	String indicating the method to pass to `stats::dist` for clustering
`file_prefix`	The text to be prepended to the file names for tables and plots
`grid_size`	Number to specify the size of the heatmap
`grid_units`	String specifying the units corresponding to `grid_size` of the heatmap
`hclust_method`	String indicating the method to pass to `stats::hclust` for clustering
`max_clusters`	Integer indicating the maximum number of clusters to split data into
`max_depth`	Integer indicating the maximum depth across principal components to use for determining the most central element
`min_clusters`	Integer indicating the minimum number of clusters to split data into
`my_threads`	Integer value specifying the number of parallel processes to use when calculating mhorn indices. Defaults to 1.
`my_seed`	The seed key to use so clustering can be reproduced
`output_central_elements`	Boolean whether or not to save the table of central elements by cluster group
`output_cumulative_variance`	Boolean whether to save a plot of the cumulative variance explained by the pca axes. Only used if `centrality_methods` is one of the pca options.
`output_dir`	The base directory to which files and plots will be saved
`output_gmt`	Boolean whether or not to save the gmt data to file
`output_heatmap`	Boolean whether to save correlation heatmap to file. Ignored if `centrality_methods` is one of the PCA options.
`output_ranked_central_elements`	Boolean whether to save the table of unique central elements sorted by rank within cluster group
`rank_clm`	Length-one character vector with the name of the column holding the initial rankings, if any, in `rank_df` if one was supplied, or in `feature_df` otherwise
`rank_df`	Data.frame with `feature_df` features by row in column one and `rank_clm` with numeric default ranking for tie-breaking. If not supplied (the default is `NULL`), `rank_clm` will be looked for in `feature_df`.
`output_pc1_vs_pc2`	Boolean whether to save a plot of the principal component 1 and 2 axes. Only used if `centrality_methods` is one of the pca options.
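
To make the correlation-based options more concrete, here is a minimal base R sketch of the "largest sum to all other elements" idea, combined with the `stats::dist` and `stats::hclust` calls that `dist_method` and `hclust_method` refer to. It is an assumption-laden illustration, not the package's code: the toy data, the fixed `k = 3`, the helper `most_central`, and the decision to sum raw (signed) Spearman correlations within each cluster are all choices made here.

```r
# Illustrative sketch only; not the package's own code.
set.seed(7)
feature_df <- as.data.frame(matrix(rnorm(10 * 8), nrow = 10))
rownames(feature_df) <- paste0("feature_", seq_len(nrow(feature_df)))

# Hierarchical clustering of the features (rows).
d <- stats::dist(feature_df, method = "euclidean")   # cf. `dist_method`
hc <- stats::hclust(d, method = "complete")          # cf. `hclust_method`
clusters <- stats::cutree(hc, k = 3)                 # named by feature

# Feature-by-feature Spearman correlations (cor() correlates columns,
# so the data is transposed first).
cor_mat <- stats::cor(t(feature_df), method = "spearman")

# Hypothetical helper: the member with the largest summed correlation
# to the other members of its cluster.
most_central <- function(members) {
  if (length(members) == 1) return(members)
  sub <- cor_mat[members, members, drop = FALSE]
  diag(sub) <- 0  # ignore self-correlation
  members[which.max(rowSums(sub))]
}

members_by_cluster <- split(names(clusters), clusters)
central_elements <- vapply(members_by_cluster, most_central, character(1))
print(central_elements)
```

Swapping `method = "spearman"` for `"pearson"` gives the Pearson variant of the same sketch.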

Value

Returns a list of three elements: `cluster_members`, `seed`, and `results`. `results` is a named list with one entry per value of `centrality_methods`; each entry holds `central_elements` and either `pca` or `correlations` (depending on the centrality method).
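
The nesting described above can be walked as in the sketch below. The `res` object is a hand-built stand-in, not real output: its element names follow the description, but the types and contents of `cluster_members`, `central_elements`, and `correlations` are assumptions.

```r
# Hand-built stand-in for the returned list; contents are invented.
res <- list(
  cluster_members = list(),  # placeholder, actual structure not documented
  seed = 123,
  results = list(
    spearman = list(
      central_elements = c("feature_2", "feature_7"),  # assumed type
      correlations = diag(2)                           # assumed type
    )
  )
)

# One entry per centrality method; each has central_elements plus
# either `pca` or `correlations`.
for (method in names(res$results)) {
  entry <- res$results[[method]]
  extra <- setdiff(names(entry), "central_elements")
  cat(method, "also carries:", extra, "\n")
  print(entry$central_elements)
}
```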

Benjamin-Vincent-Lab/binfotron documentation built on April 11, 2025, 10:05 p.m.
