R/RcppExports.R

# Generated by using Rcpp::compileAttributes() -> do not edit by hand
# Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393

#' PAM (Partitioning Around Medoids)
#' 
#' @description The original Partitioning Around Medoids (PAM) algorithm for
#' k-medoids clustering, as proposed by Kaufman and Rousseeuw. A largely
#' equivalent method was also proposed by Whitaker in the operations research
#' domain, where it is well known as "fast interchange".
#' (Schubert and Rousseeuw, 2019)
#' 
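#' @references L. Kaufman, P. J. Rousseeuw
#' "Clustering by means of Medoids"
#' In: Statistical Data Analysis Based on the L1-Norm and Related Methods, 1987
#' 
#' R. A. Whitaker
#' "A Fast Algorithm for the Greedy Interchange for Large-Scale Clustering and Median Location Problems"
#' INFOR: Information Systems and Operational Research 21(2), 1983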
#' 
#' @param rdist The distance matrix (lower triangular matrix, column-wise storage)
#' @param n The number of observations
#' @param k The number of clusters to produce
#' @param maxiter The maximum number of iterations (default: 0)
#' @return KMedoids S4 class
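#' @examples
#' # A minimal sketch, not part of the original sources. It illustrates the
#' # "lower triangular, column-wise" layout with base R's stats::dist() and
#' # assumes (unverified here) that such a vector is what `rdist` expects.
#' # tri_index() is a hypothetical helper introduced only for illustration.
#' x <- matrix(c(0, 0,  0, 1,  5, 5,  5, 6), ncol = 2, byrow = TRUE)
#' n <- nrow(x)
#' d <- stats::dist(x)                      # n * (n - 1) / 2 distances
#' tri_index <- function(i, j, n) {         # position of pair (i, j), i < j
#'   n * (i - 1) - i * (i - 1) / 2 + j - i
#' }
#' full <- as.matrix(d)                     # full symmetric matrix, for checking
#' stopifnot(length(d) == n * (n - 1) / 2,
#'           isTRUE(all.equal(d[tri_index(2, 4, n)], full[4, 2])))
#' \dontrun{
#' res <- pam(d, n, k = 2L)                 # classic PAM on the four points above
#' }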
#' @export
pam <- function(rdist, n, k, maxiter = 0L) {
    .Call(`_fastkmedoids_pam`, rdist, n, k, maxiter)
}

#' FastPAM
#' 
#' @description FastPAM: An improved version of PAM that is usually O(k) times faster.
#' Because of the speed benefits, we also suggest using a linear-time
#' initialization, such as the k-means++ initialization or the proposed
#' LAB (linear approximative BUILD, the third component of FastPAM)
#' initialization, and trying multiple times if the runtime permits.
#' (Schubert and Rousseeuw, 2019)
#' 
#' @references Erich Schubert, Peter J. Rousseeuw 
#' "Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms"
#' 2019 https://arxiv.org/abs/1810.05691
#' @param rdist The distance matrix (lower triangular matrix, column-wise storage)
#' @param n The number of observations
#' @param k The number of clusters to produce
#' @param maxiter The maximum number of iterations (default: 0)
#' @param initializer Initializer: either "BUILD" (used in classic PAM) or "LAB" (linear approximative BUILD).
#' Because of the speed benefits, "LAB" is suggested (and is the default); one can try multiple times if the runtime permits.
#' @param fasttol Tolerance for fast swapping behavior (may perform worse swaps). 
#' Default: 1.0, which means to perform any additional swap that gives an improvement.
#' When set to 0, it will only execute an additional swap if it appears to be independent
#' (i.e., the improvements resulting from the swap have not decreased when the first swap was executed).
#' @param seed Seed for random number generator. Default: 123456789
#' @return KMedoids S4 class
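#' @examples
#' # Illustrative calls only: a sketch with arbitrary argument values. It is
#' # assumed, not verified here, that a stats::dist() vector is a valid `rdist`.
#' \dontrun{
#' x <- matrix(stats::rnorm(200), ncol = 2)
#' n <- nrow(x)
#' d <- stats::dist(x)
#' # "LAB" is the default initializer; varying the seed is one way to
#' # "try multiple times". "BUILD" is the classic PAM initialization.
#' res_lab   <- fastpam(d, n, k = 3L, initializer = "LAB", seed = 1L)
#' res_build <- fastpam(d, n, k = 3L, initializer = "BUILD")
#' # fasttol = 0 only executes additional swaps that appear independent.
#' res_strict <- fastpam(d, n, k = 3L, fasttol = 0)
#' }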
#' @export
fastpam <- function(rdist, n, k, maxiter = 0L, initializer = "LAB", fasttol = 1.0, seed = 123456789L) {
    .Call(`_fastkmedoids_fastpam`, rdist, n, k, maxiter, initializer, fasttol, seed)
}

#' FastCLARA
#' 
#' @description Clustering Large Applications (CLARA) with the FastPAM
#'  improvements, to increase scalability in the number of clusters. This variant
#'  also defaults to twice the sample size, to improve quality.
#'  (Schubert and Rousseeuw, 2019)
#'  
#' @references Erich Schubert, Peter J. Rousseeuw 
#' "Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms"
#' 2019 https://arxiv.org/abs/1810.05691
#' 
#' @param rdist The distance matrix (lower triangular matrix, column-wise storage)
#' @param n The number of observations
#' @param k The number of clusters to produce
#' @param maxiter The maximum number of iterations (default: 0)
#' @param initializer Initializer: either "BUILD" (used in classic PAM) or "LAB" (linear approximative BUILD). Default: "LAB"
#' @param fasttol Tolerance for fast swapping behavior (may perform worse swaps). 
#' Default: 1.0, which means to perform any additional swap that gives an improvement.
#' When set to 0, it will only execute an additional swap if it appears to be independent
#' (i.e., the improvements resulting from the swap have not decreased when the first swap was executed).
#' @param numsamples Number of samples to draw (i.e. iterations). Default: 5
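#' @param sampling Sampling rate. If less than 1, it is interpreted as a fraction of the
#'   data set (e.g. 0.10 draws 0.10 * n observations); otherwise it is an absolute sample size.
#'   Default: 0.25. A sample size of 80 + 4*k is given in Schubert and Rousseeuw (2019).
#' @param independent If TRUE, draw independent samples, i.e. do NOT keep the previous medoids in the next sample. Default: FALSE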
#' @param seed Seed for random number generator. Default: 123456789
#' @return KMedoids S4 class
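#' @examples
#' # A hedged sketch of how `sampling` could be resolved to a sample size.
#' # resolve_sample_size() is a hypothetical helper, not a package function,
#' # and the rounding and capping it applies are assumptions.
#' resolve_sample_size <- function(sampling, n) {
#'   size <- if (sampling < 1) ceiling(sampling * n) else sampling
#'   min(n, as.integer(size))                # never more than n observations
#' }
#' n <- 1000L
#' k <- 5L
#' resolve_sample_size(0.25, n)              # relative: a fraction of the data
#' resolve_sample_size(80 + 4 * k, n)        # absolute: 80 + 4*k observations
#' \dontrun{
#' x <- matrix(stats::rnorm(2 * n), ncol = 2)
#' res <- fastclara(stats::dist(x), n, k, numsamples = 5L, sampling = 80 + 4 * k)
#' }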
#' @export
fastclara <- function(rdist, n, k, maxiter = 0L, initializer = "LAB", fasttol = 1.0, numsamples = 5L, sampling = 0.25, independent = FALSE, seed = 123456789L) {
    .Call(`_fastkmedoids_fastclara`, rdist, n, k, maxiter, initializer, fasttol, numsamples, sampling, independent, seed)
}

#' FastCLARANS
#' 
#' @description A faster variation of CLARANS that can explore O(k) times as many swaps at a
#'  similar cost by considering all medoids for each candidate non-medoid. Since
#'  this means sampling fewer non-medoids, we suggest increasing the subsampling
#'  rate slightly to get higher quality than CLARANS at a better runtime.
#'  (Schubert and Rousseeuw, 2019)
#'  
#' @references Erich Schubert, Peter J. Rousseeuw 
#' "Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms"
#' 2019 https://arxiv.org/abs/1810.05691
#' 
#' @param rdist The distance matrix (lower triangular matrix, column-wise storage)
#' @param n The number of observations
#' @param k The number of clusters to produce
#' @param numlocal Number of samples to draw (i.e. restarts).
#'   Default: 2
#' @param maxneighbor Sampling rate. If less than 1, it is considered to be a relative value.
#'   Default: 0.025 (2 * 0.0125), a larger sampling rate than CLARANS (see Schubert and Rousseeuw, 2019)
#' @param seed Seed for random number generator. Default: 123456789
#' @return KMedoids S4 class
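#' @examples
#' # Illustrative call only: a sketch that assumes, without verifying it here,
#' # that a stats::dist() vector is a valid `rdist`. Argument values repeat the
#' # defaults from the function signature.
#' \dontrun{
#' x <- matrix(stats::rnorm(200), ncol = 2)
#' d <- stats::dist(x)
#' res <- fastclarans(d, nrow(x), k = 3L, numlocal = 2L, maxneighbor = 0.025)
#' }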
#' @export
fastclarans <- function(rdist, n, k, numlocal = 2L, maxneighbor = 0.025, seed = 123456789L) {
    .Call(`_fastkmedoids_fastclarans`, rdist, n, k, numlocal, maxneighbor, seed)
}

fastkmedoids documentation built on Jan. 22, 2021, 1:06 a.m.