OPTICS | R Documentation |
This is a wrapper around the Python class sklearn.cluster.OPTICS.
rgudhi::PythonClass
-> rgudhi::SKLearnClass
-> rgudhi::BaseClustering
-> OPTICS
new()
The OPTICS class constructor.
OPTICS$new(
  min_samples = 5L,
  max_eps = Inf,
  metric = c("minkowski", "cityblock", "cosine", "euclidean", "l1", "l2",
    "manhattan", "braycurtis", "canberra", "chebyshev", "correlation", "dice",
    "hamming", "jaccard", "kulsinski", "mahalanobis", "rogerstanimoto",
    "russellrao", "seuclidean", "sokalmichener", "sokalsneath", "sqeuclidean",
    "yule"),
  p = 2L,
  metric_params = NULL,
  cluster_method = c("xi", "dbscan"),
  eps = NULL,
  xi = 0.05,
  predecessor_correction = TRUE,
  min_cluster_size = NULL,
  algorithm = c("auto", "ball_tree", "kd_tree", "brute"),
  leaf_size = 30L,
  memory = NULL,
  n_jobs = 1L
)
min_samples
Either an integer value greater than 1 or a numeric value between 0 and 1
specifying the number of samples in a neighborhood for a point to be
considered as a core point. Also, up and down steep regions can’t have more
than min_samples consecutive non-steep points. Expressed as an absolute
number or a fraction of the number of samples (rounded to be at least 2).
Defaults to 5L.
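The two ways of expressing min_samples can be sketched in base R. The helper resolve_min_samples() below is hypothetical and not part of rgudhi; it assumes that a fractional value is multiplied by the number of samples, truncated to an integer, and then raised to at least 2. The truncation step is an assumption about the rounding rule, which the text above does not spell out.

```r
# Hypothetical helper (not part of rgudhi): turn a min_samples-style value
# into an absolute count for a data set with n_samples rows.
resolve_min_samples <- function(min_samples, n_samples) {
  if (min_samples <= 1) {
    # Fraction of the sample count: scale, truncate, enforce a floor of 2.
    # Truncation (rather than rounding) is an assumption.
    return(max(2L, as.integer(min_samples * n_samples)))
  }
  # Absolute count: used as given.
  as.integer(min_samples)
}

resolve_min_samples(5L, 200L)     # absolute count, stays 5
resolve_min_samples(0.25, 200L)   # a quarter of 200 samples, i.e. 50
resolve_min_samples(0.001, 200L)  # tiny fraction, raised to the floor of 2
```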
max_eps
A numeric value specifying the maximum distance between two samples for one
to be considered as in the neighborhood of the other. Reducing max_eps will
result in shorter run times. Defaults to Inf.
metric
Either a string or an object coercible into a function via
rlang::as_function() specifying the metric to use for distance computation.
If metric is a function, it is called on each pair of instances (rows) and
the resulting value recorded. The function should take two numeric vectors
as input and return one numeric value indicating the distance between them.
This works for SciPy’s metrics, but is less efficient than passing the
metric name as a string. If metric is "precomputed", X is assumed to be a
distance matrix and must be square. Valid string values for metric are:
from sklearn.metrics: "cityblock", "cosine", "euclidean", "l1", "l2",
"manhattan";
from scipy.spatial.distance: "braycurtis", "canberra", "chebyshev",
"correlation", "dice", "hamming", "jaccard", "kulsinski", "mahalanobis",
"minkowski", "rogerstanimoto", "russellrao", "seuclidean", "sokalmichener",
"sokalsneath", "sqeuclidean", "yule".
Defaults to "minkowski".
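To illustrate the contract a function-valued metric has to satisfy (two numeric vectors in, one numeric value out), here is a minimal base R sketch. The toy matrix X and the function manhattan() are made up for the example. The loop only mimics the "called on each pair of rows" behaviour described above and does not show how rgudhi or scikit-learn evaluate the metric internally.

```r
# Toy data (hypothetical): 4 observations (rows) in 2 dimensions.
X <- rbind(c(0, 0), c(1, 0), c(0, 2), c(3, 4))

# A custom metric: two numeric vectors in, one numeric value out.
manhattan <- function(x, y) sum(abs(x - y))

# Call the metric on each pair of rows and record the resulting value.
n <- nrow(X)
D <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    D[i, j] <- manhattan(X[i, ], X[j, ])
  }
}

D  # square, symmetric matrix of pairwise distances with a zero diagonal
```

A one-sided formula such as ~ sum(abs(.x - .y)) is the kind of object that rlang::as_function() can coerce into an equivalent two-argument function.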
p
An integer value specifying the power for the Minkowski metric. When p = 1,
this is equivalent to using the Manhattan distance (\ell_1). When p = 2,
this is equivalent to using the Euclidean distance (\ell_2). For arbitrary
p, the Minkowski distance (\ell_p) is used. Defaults to 2L.
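The role of p can be checked by hand with a small base R sketch of the Minkowski distance. The function minkowski() and the vectors x and y are illustrative names only and not part of the package.

```r
# Minkowski distance of order p between two numeric vectors.
minkowski <- function(x, y, p) sum(abs(x - y)^p)^(1 / p)

x <- c(1, 2, 3)
y <- c(4, 0, 3)

minkowski(x, y, p = 1)  # Manhattan: 3 + 2 + 0 = 5
minkowski(x, y, p = 2)  # Euclidean: sqrt(9 + 4 + 0) = sqrt(13)
minkowski(x, y, p = 3)  # general case: (27 + 8 + 0)^(1/3)
```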
metric_params
A named list specifying additional arguments for the metric function.
Defaults to NULL.
cluster_method
A string specifying the extraction method used to extract clusters using
the calculated reachability and ordering. Possible values are "xi" and
"dbscan". Defaults to "xi".
eps
A numeric value specifying the maximum distance between two samples for one
to be considered as in the neighborhood of the other. Used only when
cluster_method == "dbscan". Defaults to NULL, in which case the value of
max_eps is used.
xi
A numeric value in [0,1] specifying the minimum steepness on the
reachability plot that constitutes a cluster boundary. For example, an
upwards point in the reachability plot is defined by the ratio from one
point to its successor being at most 1 - xi. Used only when
cluster_method == "xi". Defaults to 0.05.
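The upward-steepness condition can be sketched in base R on a made-up reachability sequence. The values below are invented for illustration and are not the output of an actual OPTICS fit. The sketch covers only the per-point ratio test quoted above, not the full cluster extraction.

```r
xi <- 0.05

# Hypothetical reachability values, in cluster order.
reachability <- c(0.50, 0.51, 0.80, 0.82, 0.40)

current   <- head(reachability, -1)  # each point except the last
successor <- tail(reachability, -1)  # the point that follows it

# A point is an upward point when the ratio to its successor is at most 1 - xi.
steep_up <- current / successor <= 1 - xi

steep_up  # only the jump from 0.51 to 0.80 qualifies
```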
predecessor_correction
A boolean value specifying whether to correct clusters according to the
predecessors calculated by OPTICS (Schubert and Gertz, 2018). This
parameter has minimal effect on most data sets. Used only when
cluster_method == "xi". Defaults to TRUE.
min_cluster_size
Either an integer value > 1 or a numeric value in [0,1] specifying the
minimum number of samples in an OPTICS cluster, expressed as an absolute
number or a fraction of the number of samples (rounded to be at least 2).
If NULL, the value of min_samples is used instead. Used only when
cluster_method == "xi". Defaults to NULL.
algorithm
A string specifying the algorithm used to compute the nearest neighbors.
Choices are c("auto", "ball_tree", "kd_tree", "brute"). Defaults to "auto",
which attempts to choose the most appropriate algorithm based on the values
passed to the fit method. Note: fitting on sparse input overrides this
parameter and uses algorithm == "brute".
leaf_size
An integer value specifying the leaf size passed to BallTree or KDTree.
This can affect the speed of the construction and query, as well as the
memory required to store the tree. The optimal value depends on the nature
of the problem. Defaults to 30L.
memory
A string specifying the path to the directory in which to cache the output
of the tree computation. Defaults to NULL, in which case no caching is
done.
n_jobs
An integer value specifying the number of parallel jobs to run for
neighbors search. Defaults to 1L. A value of -1L means using all
processors.
An object of class OPTICS.
clone()
The objects of this class are cloneable with this method.
OPTICS$clone(deep = FALSE)
deep
Whether to make a deep clone.
cl <- OPTICS$new()