cluster_split | R Documentation |
Split data based on clusters
cluster_split(
data,
cluster_method = "hierarchical",
split_distance = NULL,
n_kmeans = NULL
)
data |
data.frame of occurrence records containing at least longitude and latitude columns. |
cluster_method |
(character) name of the method to be used for clustering the occurrences. Options are "hierarchical" and "k-means"; default = "hierarchical". |
split_distance |
(numeric) distance in km that will be considered as the
limit of connectivity among polygons created with clusters of occurrences.
This parameter is used when |
n_kmeans |
(numeric) if |
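The idea of a distance limit in km can be illustrated with base R alone. The sketch below is an assumption-laden illustration, not the internals of cluster_split: the helper haversine_km, the toy occ data, the single-linkage choice, and the 100 km threshold are all hypothetical. It builds a great-circle distance matrix, runs hierarchical clustering, and cuts the tree at the chosen distance.

```r
# Hypothetical helper: great-circle (haversine) distance in km,
# assuming a spherical Earth of radius 6371 km
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Toy occurrence records (made up for illustration): two distant groups
occ <- data.frame(
  longitude = c(-100.0, -100.1, -99.9, -90.0, -90.2),
  latitude  = c(20.0, 20.1, 19.9, 25.0, 25.1)
)

# Pairwise distance matrix in km
n <- nrow(occ)
d_km <- outer(seq_len(n), seq_len(n), function(i, j) {
  haversine_km(occ$longitude[i], occ$latitude[i],
               occ$longitude[j], occ$latitude[j])
})

# Hierarchical clustering, then cut the tree at a distance threshold;
# records closer than the threshold (by single linkage) share a cluster
tree <- hclust(as.dist(d_km), method = "single")
groups <- cutree(tree, h = 100)  # hypothetical 100 km limit
```

With these toy records, the points that lie within a few tens of km of each other fall in one cluster, and the group roughly a thousand km away forms another.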
The cluster_method must be chosen based on the spatial configuration of the species occurrences. The two methods make distinct assumptions, and one may perform better than the other depending on the spatial pattern of the data.
The k-means method, for example, performs better when the following assumptions are fulfilled: clusters are spatially grouped (or "spherical"), and clusters are of a similar size. Owing to the nature of the hierarchical clustering algorithm, it may take more time than the k-means method. Each method may work well on some data sets and fail on others.
Another important factor to consider is that the k-means method always starts with a random choice of cluster centers, so it may produce different results on different runs. That may be problematic when trying to replicate your methods. With hierarchical clustering, the same clusters will most likely be obtained if the process is repeated.
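The reproducibility point can be checked with base R's kmeans and hclust. This is a minimal sketch on simulated records, not a call to cluster_split: the occ data are made up, Euclidean distance on raw longitude and latitude is a simplification, and fixing the seed with set.seed is one common way to make a k-means run repeatable.

```r
# Simulated occurrence records (hypothetical): three loose groups
set.seed(1)
occ <- data.frame(
  longitude = c(rnorm(30, -100, 0.5), rnorm(30, -95, 0.5), rnorm(30, -90, 0.5)),
  latitude  = c(rnorm(30, 20, 0.5), rnorm(30, 25, 0.5), rnorm(30, 20, 0.5))
)

# k-means picks random starting centers, so fix the seed before each run
# to obtain the same partition every time
set.seed(42)
km_a <- kmeans(occ, centers = 3)
set.seed(42)
km_b <- kmeans(occ, centers = 3)
same_kmeans <- identical(km_a$cluster, km_b$cluster)

# Hierarchical clustering has no random start: repeating it on the same
# data gives the same tree and the same cut
hc_a <- cutree(hclust(dist(occ), method = "complete"), k = 3)
hc_b <- cutree(hclust(dist(occ), method = "complete"), k = 3)
same_hclust <- identical(hc_a, hc_b)
```

Recording the seed alongside the analysis lets others repeat a k-means run; without it, cluster labels and sometimes cluster membership can change between runs.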
For more information on these clustering methods see Aggarwal and Reddy (2014) https://goo.gl/RQ2ebd.