Tomato | R Documentation |
This clustering algorithm needs a neighborhood graph on the points and an estimate of the density at each point. A few possible graph constructions and density estimators are provided for convenience, but it is perfectly natural to provide your own.
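To make these two inputs concrete, the following base R sketch builds a k-nearest-neighbor graph and a distance-to-measure style density proxy by hand on toy data. It does not call rgudhi and is not the package's implementation; the data, the value of `k`, and the names `knn` and `dtm` are assumptions chosen for illustration.

```r
# Illustrative sketch in base R; not how rgudhi computes these internally.
set.seed(1)
# Toy data: two tight groups of 20 points each in the plane.
X <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.1), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.1), ncol = 2)
)
k <- 5
D <- as.matrix(dist(X))  # pairwise Euclidean distances

# Neighborhood graph: for each point (row), the indices of its k nearest
# neighbors, skipping position 1 of the ordering (the point itself).
knn <- t(apply(D, 1, function(d) order(d)[2:(k + 1)]))

# Density proxy: root mean squared distance to the k neighbors.
# Smaller values indicate denser regions.
dtm <- sapply(seq_len(nrow(X)), function(i) sqrt(mean(D[i, knn[i, ]]^2)))
```

Because the two groups are far apart relative to their spread, every neighbor found for a point lies in that point's own group.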
rgudhi::PythonClass
-> Tomato
new()
The Tomato constructor.
Tomato$new( graph_type = c("knn", "radius", "manual"), density_type = c("logDTM", "DTM", "logKDE", "KDE", "manual"), n_clusters = NULL, merge_threshold = NULL, ... )
graph_type
A string specifying the method used to compute the neighborhood graph. Choices are "knn", "radius" or "manual". Defaults to "knn".
density_type
A string specifying the choice of density estimator. Choices are "logDTM", "DTM", "logKDE", "KDE" or "manual". When you have many points, "KDE" and "logKDE" tend to be slower. Defaults to "logDTM".
n_clusters
An integer value specifying the number of clusters. Defaults to NULL, i.e. no merging occurs and we get the maximal number of clusters.
merge_threshold
A numeric value specifying the minimum prominence a cluster needs in order not to be merged. Defaults to NULL, i.e. no merging occurs and we get the maximal number of clusters.
...
Extra parameters passed to KNearestNeighbors and DTMDensity.
An object of class Tomato.
fit()
Runs the Tomato algorithm on the provided data.
Tomato$fit(X, y = NULL, weights = NULL)
X
Either a numeric matrix specifying the coordinates (in columns) of each point (in rows), or a full distance matrix if metric == "precomputed", or a list of neighbors for each point if graph_type == "manual". The number of points is currently limited to about 2 billion.
y
Not used; present only for API consistency with scikit-learn conventions.
weights
A numeric vector specifying a density estimate at each point. Used only if density_type == "manual".
The updated Tomato object itself, invisibly.
fit_predict()
Runs the Tomato algorithm on the provided data and returns the class memberships.
Tomato$fit_predict(X, y = NULL, weights = NULL)
X
Either a numeric matrix specifying the coordinates (in columns) of each point (in rows), or a full distance matrix if metric == "precomputed", or a list of neighbors for each point if graph_type == "manual". The number of points is currently limited to about 2 billion.
y
Not used; present only for API consistency with scikit-learn conventions.
weights
A numeric vector specifying a density estimate at each point. Used only if density_type == "manual".
An integer vector storing the class memberships.
set_n_clusters()
Sets the number of clusters, which automatically adjusts the class memberships.
Tomato$set_n_clusters(n_clusters)
n_clusters
An integer value specifying the number of clusters.
The updated Tomato object itself, invisibly.
get_n_clusters()
Gets the number of clusters.
Tomato$get_n_clusters()
The number of clusters.
set_merge_threshold()
Sets the threshold for merging clusters, which automatically adjusts the class memberships.
Tomato$set_merge_threshold(merge_threshold)
merge_threshold
A numeric value specifying the threshold for merging clusters.
The updated Tomato object itself, invisibly.
get_merge_threshold()
Gets the threshold for merging clusters.
Tomato$get_merge_threshold()
The threshold for merging clusters.
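As a rough illustration of what merging by prominence means, here is a minimal base R sketch of a ToMATo-style pass over a neighborhood graph. It is a teaching sketch under stated assumptions, not the code rgudhi runs: the function name `tomato_sketch`, its arguments, and the toy density values are all hypothetical. Points are visited from densest to sparsest; each point joins the cluster of its densest already-visited neighbor, and two clusters that meet are merged when the lower peak's prominence (peak density minus the density at the meeting point) falls below `tau`.

```r
# Hypothetical helper for illustration only; not part of rgudhi.
tomato_sketch <- function(neighbors, density, tau) {
  neighbors <- lapply(neighbors, as.integer)
  n <- length(density)
  parent <- rep(NA_integer_, n)  # union-find parents; NA means "not visited yet"
  find_root <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }
  for (i in order(density, decreasing = TRUE)) {
    seen <- neighbors[[i]][!is.na(parent[neighbors[[i]]])]
    if (length(seen) == 0) {     # no denser neighbor: i is a local peak
      parent[i] <- i
      next
    }
    # Attach i to the cluster of its densest visited neighbor.
    root <- find_root(seen[which.max(density[seen])])
    parent[i] <- root
    # i touches other clusters: merge those whose prominence is below tau.
    for (j in seen) {
      other <- find_root(j)
      if (other == root) next
      low  <- if (density[other] < density[root]) other else root
      high <- if (low == other) root else other
      if (density[low] - density[i] < tau) {
        parent[low] <- high
        root <- high
      }
    }
  }
  vapply(seq_len(n), find_root, integer(1))  # label = index of the cluster's peak
}

# Toy input: 7 points on a chain with density peaks at positions 2, 4 and 6.
f <- c(1, 4, 2, 6, 3, 3.5, 1)
n <- length(f)
chain <- lapply(seq_len(n), function(i) setdiff(c(i - 1L, i + 1L), c(0L, n + 1L)))

labels_tau0 <- tomato_sketch(chain, f, tau = 0)  # 2 2 4 4 4 6 6 (no merging)
labels_tau1 <- tomato_sketch(chain, f, tau = 1)  # 2 2 4 4 4 4 4
labels_tau3 <- tomato_sketch(chain, f, tau = 3)  # 4 4 4 4 4 4 4
```

In this toy input the peak at position 6 has prominence 0.5 and the peak at position 2 has prominence 2, so raising `tau` past each value removes the corresponding cluster.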
get_labels()
Gets the class memberships.
Tomato$get_labels()
An integer vector storing the class memberships.
plot_diagram()
Computes the persistence diagram of the merge tree of the initial clusters. This is a convenient graphical tool to help decide how many clusters we want.
Tomato$plot_diagram()
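One common way to read such a diagram is to look for a large gap between the prominences of the clusters. The base R sketch below applies that heuristic to made-up numbers; the `prominence` values are hypothetical, nothing here is returned by plot_diagram(), and the largest-gap rule is only one possible choice.

```r
# Hypothetical prominences, one per initial cluster; the highest peak never
# merges into anything, so its prominence is taken to be infinite.
prominence <- c(Inf, 2.1, 1.9, 0.3, 0.2, 0.1)

finite <- sort(prominence[is.finite(prominence)], decreasing = TRUE)
gaps <- -diff(finite)        # drop between consecutive sorted prominences
cut_at <- which.max(gaps)    # number of finite prominences above the largest gap

# Clusters kept: those above the gap plus the one with infinite prominence.
n_clusters <- cut_at + sum(is.infinite(prominence))
# A threshold in the middle of the gap selects the same clusters.
merge_threshold <- mean(finite[cut_at + 0:1])
```

Either value could then be passed to set_n_clusters() or set_merge_threshold() on a fitted object.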
clone()
The objects of this class are cloneable with this method.
Tomato$clone(deep = FALSE)
deep
Whether to make a deep clone.
Marc Glisse
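library(rgudhi)

X <- seq_circle(100)    # 100 points on a circle
cl <- Tomato$new()      # default graph ("knn") and density ("logDTM")
cl$fit_predict(X)       # run the algorithm and return the class memberships
cl$set_n_clusters(2)    # class memberships are adjusted automatically
cl$get_labels()         # updated class memberships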