View source: R/generate_clusters.R
This function takes the normalized data (TPM/FPKM, feature scaled) and runs the k-means function across an iterative series of cluster counts to identify a potentially optimal number of clusters for the dataset. For reproducible clusters, it is highly recommended to set a seed value with the set.seed function before generating the clusters.
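As a minimal sketch of why the seed matters, the snippet below calls base R's kmeans directly on made-up data rather than calling generate_clusters. The toy matrix and the object names are assumptions for illustration only. The same pattern, calling set.seed immediately before the clustering call, should apply to generate_clusters.

```r
# Toy stand-in for a feature-scaled expression matrix (hypothetical data).
set.seed(1)
toy <- scale(matrix(rnorm(200 * 6), nrow = 200, ncol = 6))

# Seeding immediately before each call makes the random starts identical.
set.seed(42)
fit_a <- kmeans(toy, centers = 5, iter.max = 50, algorithm = "Hartigan-Wong")
set.seed(42)
fit_b <- kmeans(toy, centers = 5, iter.max = 50, algorithm = "Hartigan-Wong")

# Compare the cluster assignments of the two runs.
same <- identical(fit_a$cluster, fit_b$cluster)
print(same)
```

Without the second set.seed call, the two fits start from different random centers and may assign samples differently.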
generate_clusters(df, kmin, kmax, ktot, num_iter, km_algo)
df
    A data frame containing the normalized reads.
kmin
    An integer indicating the minimum number of clusters to generate. By default, this is set to 10.
kmax
    An integer indicating the maximum number of clusters to generate. By default, this is set to 150.
ktot
    An integer indicating how many unique k-values to generate. By default, this is set to 15, which produces 15 values ranging from kmin up to kmax. Increasing this number will significantly increase run time.
num_iter
    An integer indicating the number of cluster iterations to generate. By default, this is set to 10. The same k-means clustering is performed multiple times to account for the stochastic nature of the k-means algorithm, so the mean quality value in the final step is more reliable than one from a single iteration. Lowering this value will negatively affect the GECO quality assessment; raising it will increase run time.
km_algo
    A string indicating which k-means algorithm to use. By default, this is set to 'Hartigan-Wong'.
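The way these arguments interact can be sketched in base R. This is a hypothetical re-creation, not the package source: the function name sketch_clusters, the evenly spaced k grid built with seq, and the iter.max setting are all assumptions, and the real function may choose its k-values differently.

```r
# Hypothetical re-creation of the argument semantics; not the package source.
sketch_clusters <- function(df, kmin = 10, kmax = 150, ktot = 15,
                            num_iter = 10, km_algo = "Hartigan-Wong") {
  # Assumption: the ktot k-values are evenly spaced between kmin and kmax.
  k_values <- unique(round(seq(kmin, kmax, length.out = ktot)))

  # Repeat the whole k sweep num_iter times; each pass is one "iteration".
  lapply(seq_len(num_iter), function(i) {
    fits <- lapply(k_values, function(k) {
      kmeans(df, centers = k, iter.max = 50, algorithm = km_algo)
    })
    names(fits) <- paste0("k", k_values)
    fits
  })
}

# Small toy run so the sketch finishes quickly (hypothetical data).
set.seed(7)
toy <- as.data.frame(scale(matrix(rnorm(300 * 4), nrow = 300)))
res <- sketch_clusters(toy, kmin = 2, kmax = 10, ktot = 5, num_iter = 3)

length(res)      # one element per iteration
names(res[[1]])  # one kmeans fit per k-value
```

The total number of k-means fits is ktot multiplied by num_iter, which is why raising either argument increases run time.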
A list containing each iteration of the clustering performed. Each iteration holds the kmeans objects, which are passed to the second step, e.g. score_clusters(clusters).
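To illustrate why several iterations are kept, the sketch below builds a stand-in for the returned list by hand and averages a per-fit statistic across iterations. The nested layout (outer list of iterations, inner list of fits per k) is an assumption, and the total within-cluster sum of squares is used only as a generic example statistic; it is not the GECO quality measure computed by score_clusters.

```r
# Hand-built stand-in for the returned list (hypothetical layout:
# outer list = iterations, inner list = one kmeans fit per k-value).
set.seed(11)
toy <- scale(matrix(rnorm(150 * 3), nrow = 150))
k_values <- c(2, 3, 4)
clusters <- lapply(1:4, function(i) {
  lapply(k_values, function(k) kmeans(toy, centers = k, iter.max = 50))
})

# Total within-cluster sum of squares: rows = k-values, columns = iterations.
wss <- sapply(clusters, function(iter) {
  sapply(iter, function(fit) fit$tot.withinss)
})
rownames(wss) <- paste0("k", k_values)

# Averaging across iterations smooths out run-to-run variation.
mean_wss <- rowMeans(wss)
print(mean_wss)
```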