View source: R/machine_learning.R
| diagram_kkmeans | R Documentation | 
Finds latent cluster labels for a group of persistence diagrams, using a kernelized version of the popular k-means algorithm. An optimal number of clusters may be determined by analyzing the withinss field of the clustering object over several values of k.
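The idea of comparing within-cluster sums of squares over several values of k can be sketched in base R. This is a minimal illustration under stated assumptions, not the implementation used by kkmeans or by this function: the helper names `kernel_withinss` and `kernel_kmeans`, the toy points `x`, and the Laplacian kernel used to build the Gram matrix are all hypothetical and chosen only so the example is self-contained.

```r
# Within-cluster sum of squares in feature space, using only a Gram matrix K.
# For a cluster C: sum_{i in C} K[i, i] - (1 / |C|) * sum_{i, j in C} K[i, j].
kernel_withinss <- function(K, labels) {
  sapply(sort(unique(labels)), function(cl) {
    idx <- which(labels == cl)
    sum(diag(K)[idx]) - sum(K[idx, idx, drop = FALSE]) / length(idx)
  })
}

# Lloyd-style kernel k-means on a precomputed Gram matrix (illustrative only).
# Several random starts are used because the iteration can stop in a poor
# local minimum; the start with the lowest total withinss is kept.
kernel_kmeans <- function(K, centers, n_starts = 10, max_iter = 100) {
  n <- nrow(K)
  best <- NULL
  for (s in seq_len(n_starts)) {
    labels <- sample(rep_len(seq_len(centers), n))  # random balanced start
    for (iter in seq_len(max_iter)) {
      # squared feature-space distance from every point to every cluster mean
      d2 <- matrix(sapply(seq_len(centers), function(cl) {
        idx <- which(labels == cl)
        if (length(idx) == 0) return(rep(Inf, n))   # empty cluster: never chosen
        diag(K) - 2 * rowSums(K[, idx, drop = FALSE]) / length(idx) +
          sum(K[idx, idx, drop = FALSE]) / length(idx)^2
      }), nrow = n)
      new_labels <- unname(apply(d2, 1, which.min))
      if (all(new_labels == labels)) break
      labels <- new_labels
    }
    wss <- sum(kernel_withinss(K, labels))
    if (is.null(best) || wss < best$total_withinss) {
      best <- list(labels = labels, total_withinss = wss)
    }
  }
  best
}

# Toy data (an assumption): two well-separated groups of points on a line,
# turned into a Gram matrix with a Laplacian kernel.
set.seed(1)
x <- c(0, 0.1, 0.2, 5, 5.1, 5.2)
K <- exp(-as.matrix(dist(x)))

labels2 <- kernel_kmeans(K, centers = 2)$labels
total_withinss <- sapply(1:3, function(k) kernel_kmeans(K, centers = k)$total_withinss)
print(labels2)
print(total_withinss)
```

A sharp drop in the total within-cluster sum of squares followed by little further improvement suggests a reasonable number of clusters. With the real function, the analogous quantity would be read from the withinss field of the returned clustering object for each value of centers.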
diagram_kkmeans(
  diagrams,
  K = NULL,
  centers,
  dim = 0,
  t = 1,
  sigma = 1,
  rho = NULL,
  num_workers = parallelly::availableCores(omit = 1),
  ...
)
diagrams | a list of n>=2 persistence diagrams which are either the output of a persistent homology calculation like ripsDiag/
K | an optional precomputed Gram matrix of persistence diagrams, default NULL.
centers | number of clusters to initialize, no more than the number of diagrams, although smaller values are recommended.
dim | the non-negative integer homological dimension in which the distance is to be computed, default 0.
t | a positive number representing the scale for the persistence Fisher kernel, default 1.
sigma | a positive number representing the bandwidth for the Fisher information metric, default 1.
rho | an optional positive number representing the heuristic for Fisher information metric approximation, see
num_workers | the number of cores used for parallel computation, default is one less than the number of cores on the machine.
... | additional parameters for the
Returns the output of kkmeans on the desired Gram matrix of a group of persistence diagrams
in a particular dimension. The additional list elements stored in the output are needed
to estimate cluster labels for new persistence diagrams in the 'predict_diagram_kkmeans'
function.
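As a rough illustration of how a Gram matrix relates to the `t` argument: the persistence Fisher kernel between two diagrams is commonly written as exp(-t * d), where d is their Fisher information metric distance. The sketch below assumes a small made-up distance matrix `D` (the values are hypothetical, not computed from real diagrams) and only shows the final exponentiation step.

```r
# Hypothetical symmetric matrix of pairwise Fisher information metric
# distances between three diagrams (made-up values, zero on the diagonal).
D <- matrix(c(0.0, 0.2, 1.0,
              0.2, 0.0, 0.9,
              1.0, 0.9, 0.0), nrow = 3, byrow = TRUE)

t <- 2              # kernel scale; larger t makes similarity decay faster
K <- exp(-t * D)    # Gram matrix: ones on the diagonal, smaller for distant diagrams
print(round(K, 3))
```

A matrix of this form is what the `K` argument is expected to hold when the Gram matrix is precomputed, in which case `t` and `sigma` should match the values used to build it.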
a list of class 'diagram_kkmeans' containing the output of kkmeans on the Gram matrix, i.e. a list containing the elements
clustering | an S4 object of class specc, the output of a kkmeans function call. The '.Data' slot of this object contains cluster memberships, 'withinss' contains the within-cluster sum of squares for each cluster, etc.
diagrams | the input 'diagrams' argument.
dim | the input 'dim' argument.
t | the input 't' argument.
sigma | the input 'sigma' argument.
Shael Brown - shaelebrown@gmail.com
Dhillon, I and Guan, Y and Kulis, B (2004). "A Unified View of Kernel k-means, Spectral Clustering and Graph Cuts." https://people.bu.edu/bkulis/pubs/spectral_techreport.pdf.
predict_diagram_kkmeans for predicting cluster labels of new diagrams.
if(require("TDAstats"))
{
  # create two diagrams
  D1 <- TDAstats::calculate_homology(TDAstats::circle2d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  D2 <- TDAstats::calculate_homology(TDAstats::circle2d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  g <- list(D1,D1,D2,D2)
  # calculate kernel k-means clusters with centers = 2, and sigma = t = 2 in dimension 0
  clust <- diagram_kkmeans(diagrams = g,centers = 2,dim = 0,t = 2,sigma = 2,num_workers = 2)
  
  # repeat with a precomputed Gram matrix, which gives the same result, just much faster
  K <- gram_matrix(diagrams = g,num_workers = 2,t = 2,sigma = 2)
  cluster <- diagram_kkmeans(diagrams = g,K = K,centers = 2,dim = 0,sigma = 2,t = 2)
  
}