Tomato: Clustering: Tomato

Tomato    R Documentation

Clustering: Tomato

Description

This clustering algorithm needs a neighborhood graph on the points, and an estimation of the density at each point. A few possible graph constructions and density estimators are provided for convenience, but it is perfectly natural to provide your own.

Super class

rgudhi::PythonClass -> Tomato

Methods

Method new()

The Tomato constructor.

Usage
Tomato$new(
  graph_type = c("knn", "radius", "manual"),
  density_type = c("logDTM", "DTM", "logKDE", "KDE", "manual"),
  n_clusters = NULL,
  merge_threshold = NULL,
  ...
)
Arguments
graph_type

A string specifying the method used to compute the neighborhood graph. Choices are "knn", "radius" or "manual". Defaults to "knn".

density_type

A string specifying the choice of density estimator. Choices are "logDTM", "DTM", "logKDE", "KDE" or "manual". When you have many points, "KDE" and "logKDE" tend to be slower. Defaults to "logDTM".

n_clusters

An integer value specifying the number of clusters. Defaults to NULL, i.e. no merging occurs and we get the maximal number of clusters.

merge_threshold

A numeric value specifying the minimum prominence a cluster must have to avoid being merged. Defaults to NULL, i.e. no merging occurs and we get the maximal number of clusters.

...

Extra parameters passed to KNearestNeighbors and DTMDensity.

Returns

An object of class Tomato.
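
As a rough illustration of the constructor arguments described above, the following sketch builds a Tomato object with an explicit graph type, density estimator and target number of clusters. It assumes that rgudhi and its Python backend are installed, and it is skipped otherwise; the variable name `cl` is arbitrary.

```r
# Sketch only: requires rgudhi and a working Python backend.
if (requireNamespace("rgudhi", quietly = TRUE)) {
  library(rgudhi)
  # k-nearest-neighbor graph, log distance-to-measure density,
  # and merging down to two clusters once the model is fitted.
  cl <- Tomato$new(
    graph_type = "knn",
    density_type = "logDTM",
    n_clusters = 2L
  )
  class(cl)
}
```

Leaving both n_clusters and merge_threshold at NULL keeps every initial cluster, which is convenient when the number of clusters is to be chosen after fitting.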


Method fit()

Runs the Tomato algorithm on the provided data.

Usage
Tomato$fit(X, y = NULL, weights = NULL)
Arguments
X

One of the following: a numeric matrix with one point per row and one coordinate per column; a full distance matrix if metric == "precomputed"; or a list of neighbors for each point if graph_type == "manual". The number of points is currently limited to about 2 billion.

y

Not used; present for API consistency with scikit-learn.

weights

A numeric vector specifying a density estimate at each point. Used only if density_type == "manual".

Returns

The updated Tomato object itself, invisibly.


Method fit_predict()

Runs the Tomato algorithm on the provided data and returns the class memberships.

Usage
Tomato$fit_predict(X, y = NULL, weights = NULL)
Arguments
X

One of the following: a numeric matrix with one point per row and one coordinate per column; a full distance matrix if metric == "precomputed"; or a list of neighbors for each point if graph_type == "manual". The number of points is currently limited to about 2 billion.

y

Not used; present for API consistency with scikit-learn.

weights

A numeric vector specifying a density estimate at each point. Used only if density_type == "manual".

Returns

An integer vector storing the class memberships.
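
The weights argument can be sketched as follows, assuming a user-supplied density estimate is wanted instead of the built-in ones. The density proxy used here (negative distance to the 10th nearest neighbor, computed with base R) is a hypothetical choice made only for illustration, as are the simulated data; the rgudhi part is skipped when the package is not installed.

```r
set.seed(1)
# Two well-separated Gaussian blobs in the plane, one point per row.
X <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2)
)

# Hypothetical density proxy: negative distance to the 10th nearest
# neighbor (larger means denser). Entry 1 of each sorted row is the
# point itself, hence index 11.
D <- as.matrix(dist(X))
w <- -unname(apply(D, 1, function(d) sort(d)[11]))

# Sketch only: requires rgudhi and a working Python backend.
if (requireNamespace("rgudhi", quietly = TRUE)) {
  library(rgudhi)
  cl <- Tomato$new(density_type = "manual", n_clusters = 2L)
  labels <- cl$fit_predict(X, weights = w)
  table(labels)
}
```

Any vector with one density value per point could be passed in the same way; weights is ignored unless density_type == "manual".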


Method set_n_clusters()

Sets the number of clusters, which automatically adjusts the class memberships.

Usage
Tomato$set_n_clusters(n_clusters)
Arguments
n_clusters

An integer value specifying the number of clusters.

Returns

The updated Tomato object itself, invisibly.


Method get_n_clusters()

Gets the number of clusters.

Usage
Tomato$get_n_clusters()
Returns

The number of clusters.


Method set_merge_threshold()

Sets the threshold for merging clusters, which automatically adjusts the class memberships.

Usage
Tomato$set_merge_threshold(merge_threshold)
Arguments
merge_threshold

A numeric value specifying the threshold for merging clusters.

Returns

The updated Tomato object itself, invisibly.


Method get_merge_threshold()

Gets the threshold for merging clusters.

Usage
Tomato$get_merge_threshold()
Returns

The threshold for merging clusters.
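
The setters and getters above can be combined to adjust the clustering after a single fit. The sketch below assumes rgudhi and its Python backend are installed (it is skipped otherwise); the threshold value 0.5 is a hypothetical choice with no particular meaning for these data.

```r
# Sketch only: requires rgudhi and a working Python backend.
if (requireNamespace("rgudhi", quietly = TRUE)) {
  library(rgudhi)
  X <- seq_circle(100)
  cl <- Tomato$new()
  cl$fit(X)
  cl$get_n_clusters()          # no merging requested yet
  cl$set_merge_threshold(0.5)  # hypothetical prominence threshold
  cl$get_n_clusters()          # number of clusters left after merging
  cl$set_n_clusters(2L)        # alternatively, ask for two clusters
  cl$get_labels()              # memberships reflect the last setter call
}
```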


Method get_labels()

Gets the class memberships.

Usage
Tomato$get_labels()
Returns

An integer vector storing the class memberships.


Method plot_diagram()

Computes the persistence diagram of the merge tree of the initial clusters. This is a convenient graphical tool to help decide how many clusters we want.

Usage
Tomato$plot_diagram()

Method clone()

The objects of this class are cloneable with this method.

Usage
Tomato$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Author(s)

Marc Glisse

Examples


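library(rgudhi)
# Generate 100 points with seq_circle()
X <- seq_circle(100)
cl <- Tomato$new()
# Fit and return the class memberships (no merging by default)
cl$fit_predict(X)
# Ask for two clusters; class memberships are adjusted automatically
cl$set_n_clusters(2)
cl$get_labels()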


rgudhi documentation built on March 31, 2023, 11:38 p.m.