View source: R/find_central_clone.R
find_central_elements_by_cluster | R Documentation |
Generates clusters using the k-means method and determines the most representative element of each cluster using a PCA analysis (most central feature in PCA space), the mhorn similarity index (most similar feature), or Pearson/Spearman correlation (most correlated feature).
find_central_elements_by_cluster(
feature_df,
anno_mark_font_size = 8,
annotate_central_elements = T,
annotate_central_elements_n_clusters = 40,
central_element_circle_radius = 1/10,
centrality_methods = "by-rank",
cluster_id_width = NA,
cluster_plot_sizes = NA,
dist_method = "euclidean",
file_prefix = "central_elements",
grid_size = 100,
grid_units = "mm",
hclust_method = "complete",
max_clusters = 40,
max_depth = NA,
min_clusters = 1L,
my_threads = 1,
my_seed = NA,
output_central_elements = T,
output_cumulative_variance = F,
output_dir = ".",
output_gmt = T,
output_heatmap = F,
output_pc1_vs_pc2 = F,
output_ranked_central_elements = T,
rank_clm = "Rank",
rank_df = NULL,
row_annotation_lwd = 0.25,
row_annotation_width = 15,
row_annotation_width_units = "mm",
row_dend_lwd = 0.25,
row_dend_width = 15,
row_dend_width_units = "mm",
hm_raster_quality = 5,
show_hm_row_names = T
)
feature_df |
Data frame on which to perform PCA, mhorn, or Spearman analysis and k-means clustering. Important: rows must be named after features. |
centrality_methods |
A character vector with strings specifying the method for selecting the most central feature of a cluster:
|
cluster_id_width |
An integer indicating how many characters to use for cluster group and cluster number IDs. Defaults to one more than the number of characters in |
cluster_plot_sizes |
Integer vector indicating which cluster groups to save as plots with clusters circled and central elements labeled. Only used if |
dist_method |
String indicating the method to pass to |
file_prefix |
The text to be prepended to the file names for tables and plots |
grid_size |
Number to specify the size of the heatmap |
grid_units |
String to specify the units corresponding to |
hclust_method |
String indicating the method to pass to |
max_clusters |
Integer indicating the maximum number of clusters to split data into |
max_depth |
Integer indicating the maximum depth across principal components to use for determining the most central element |
min_clusters |
Integer indicating the minimum number of clusters to split data into |
my_threads |
Integer value specifying the number of parallel processes to use when calculating mhorn indices. Defaults to 1. |
my_seed |
The seed key to use so clustering can be reproduced |
output_central_elements |
Boolean whether or not to save the table of central elements by cluster group |
output_cumulative_variance |
Boolean whether to save a plot of the cumulative variance explained by the pca axes. Only used if |
output_dir |
The base directory to which files and plots will be saved |
output_gmt |
Boolean whether or not to save the gmt data to file |
output_heatmap |
Boolean whether to save correlation heatmap to file. Ignored if |
output_ranked_central_elements |
Boolean whether to save the table of unique central elements sorted by rank within cluster group |
rank_clm |
One-length character vector with the name of the column holding the initial rankings, if any, in either rank_df if one was sent, or in |
rank_df |
Data.frame with |
output_pc1_vs_pc2 |
Boolean whether to save a plot of the principal component 1 and 2 axes. Only used if |
Returns a three-element list containing cluster_members, seed, and results. results is a named list with one entry for each of the centrality_methods, each holding central_elements and either pca or correlations (depending on the centrality method).
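The idea of the "most central feature in PCA space" can be illustrated with base R alone. The sketch below is not this function's implementation; it is a minimal example under stated assumptions: the data are random, clustering is done on the first two principal components, k = 3 is an arbitrary choice, and "central" is taken to mean the smallest Euclidean distance to the k-means centroid. The real function may differ on every one of these points.

```r
# Illustrative sketch only: NOT the package's implementation.
set.seed(42)  # fixed seed so the k-means result can be reproduced

# Hypothetical feature table: 30 features (rows) x 6 samples (columns).
# Rows are named after features, as feature_df requires.
feature_df <- as.data.frame(matrix(rnorm(30 * 6), nrow = 30))
rownames(feature_df) <- paste0("feature_", seq_len(nrow(feature_df)))

# Project the features into PCA space and keep the first two components
# (assumption: two components are enough for this toy example).
pca <- prcomp(feature_df, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:2, drop = FALSE]

# Cluster the features in PCA space (k = 3 is an arbitrary choice).
km <- kmeans(scores, centers = 3, nstart = 10)

# For each cluster, pick the feature closest to the cluster centroid.
central_elements <- vapply(seq_len(nrow(km$centers)), function(k) {
  members <- scores[km$cluster == k, , drop = FALSE]
  centroid <- km$centers[k, ]
  # Euclidean distance from every member to the centroid
  d <- sqrt(rowSums(sweep(members, 2, centroid)^2))
  rownames(members)[which.min(d)]
}, character(1))

print(central_elements)
```

Each entry of `central_elements` is the row name of one feature, which is why the input rows need to be named after features.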