scclust-package | R Documentation
The scclust package is an R wrapper for the scclust library.
The package provides functions to construct near-optimal size-constrained
clusterings. Subject to user-specified constraints on the size and composition
of the clusters, scclust constructs a clustering so that within-cluster
pairwise distances are minimized.
The main clustering function is sc_clustering. Statistics about
clusters can be derived with the get_clustering_stats function. To check
whether a clustering satisfies some set of constraints, use check_clustering.
Use scclust to construct a scclust object from an existing clustering.
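The functions above can be combined as in the following minimal sketch. It assumes that the scclust and distances packages are installed, and that the functions accept the arguments shown (a distances object followed by a size constraint, and a vector of cluster labels for scclust). The data and variable names are made up for illustration; consult each function's help page for the authoritative signatures.

```r
# Illustrative sketch; argument usage is an assumption based on the
# package's documented examples, and the data are hypothetical.
library(distances)
library(scclust)

# Ten hypothetical data points in two dimensions
my_data <- data.frame(
  x = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0),
  y = c(10, 9, 8, 7, 6, 10, 9, 8, 7, 6)
)

# Distance object used by the clustering functions
my_dist <- distances(my_data)

# Clustering in which every cluster contains at least two points
my_clustering <- sc_clustering(my_dist, 2)

# Check that the size constraint is satisfied
constraint_ok <- check_clustering(my_clustering, 2)
print(constraint_ok)

# Summary statistics about the clusters
print(get_clustering_stats(my_dist, my_clustering))

# Wrap an existing vector of cluster labels in a scclust object
existing <- scclust(c("A", "A", "B", "C", "B", "C", "C", "A", "B", "B"))
```

Passing the same size constraint to check_clustering that was given to sc_clustering is a quick sanity check that the result meets the requested minimum cluster size.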
Clusters can also be constructed with hierarchical_clustering.
However, this function does not support type constraints and does not provide
optimality guarantees. Its main use is to refine clusterings constructed with
the sc_clustering function.
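The refinement use case can be sketched as follows. This is an illustration under assumptions: it presumes hierarchical_clustering takes a size_constraint argument and an existing_clustering argument holding the clustering to refine, and it uses randomly generated data. Check the function's help page before relying on these argument names.

```r
# Illustrative sketch; the argument names size_constraint and
# existing_clustering are assumptions, and the data are hypothetical.
library(distances)
library(scclust)

set.seed(1)
my_data <- data.frame(x = rnorm(40), y = rnorm(40))
my_dist <- distances(my_data)

# Coarse clustering: every cluster has at least six points
coarse <- sc_clustering(my_dist, 6)

# Refine by splitting the coarse clusters into smaller ones,
# each still containing at least three points
refined <- hierarchical_clustering(my_dist,
                                   size_constraint = 3,
                                   existing_clustering = coarse)

# The refined clustering should satisfy the smaller constraint
refined_ok <- check_clustering(refined, 3)
print(refined_ok)
```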
scclust was made with large data sets in mind, and it can cluster tens
of millions of data points within minutes on an ordinary desktop computer.
See the package's website for more information: https://github.com/fsavje/scclust-R.
More information about the scclust library is found here:
https://github.com/fsavje/scclust.
Bug reports and suggestions are greatly appreciated. They are best reported here: https://github.com/fsavje/scclust-R/issues.
Higgins, Michael J., Fredrik Sävje and Jasjeet S. Sekhon (2016), ‘Improving massive experiments with threshold blocking’, Proceedings of the National Academy of Sciences, 113:27, 7369–7376.
Sävje, Fredrik, Michael J. Higgins and Jasjeet S. Sekhon (2017), ‘Generalized Full Matching’, arXiv 1703.03882. https://arxiv.org/abs/1703.03882