mset_pam: Generates Methods Settings for Partitioning Around Medoids...
In qcluster: Clustering via Quadratic Scoring

mset_pam

R Documentation

Generates Methods Settings for Partitioning Around Medoids (Pam) Clustering

Description

The function generates a software abstraction of a list of clustering models implemented through a set of tuned methods and algorithms. In particular, it generates a list of pam-type functions, each combining tuning parameters and other algorithmic settings. The generated functions are ready to be called on the data set.

Usage

mset_pam(K = seq(10),
         metric = "euclidean",
         medoids = if (is.numeric(nstart)) "random",
         nstart = if (variant == "faster") 1 else NA,
         stand = FALSE,
         do.swap = TRUE,
         variant = "original",
         pamonce = FALSE)

Arguments

`K`	a vector/list, specifies the number of clusters.
`metric`	a vector, contains the settings of the `metric` parameter of `pam`.
`medoids`	a list, contains the settings of the `medoids` parameter of `pam`.
`nstart`	a vector, contains the settings of the `nstart` parameter of `pam`.
`stand`	a vector, contains the settings of the `stand` parameter of `pam`.
`do.swap`	a vector, contains the settings of the `do.swap` parameter of `pam`.
`variant`	a list, contains the settings of the `variant` parameter of `pam`.
`pamonce`	a vector, contains the settings of the `pamonce` parameter of `pam`.

Details

The function produces functions implementing competing clustering methods based on the PAM clustering methodology as implemented in pam. This is a specialized version of the more general function mset_user. In particular, it produces a list of pam functions, each corresponding to a specific setup in terms of hyper-parameters (e.g. the number of clusters) and the algorithm's control parameters (e.g. initialization). See pam for a detailed description of the role of each argument and its data type.
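The idea of "one function per setup" can be illustrated with a minimal sketch in plain R. This is not the qcluster implementation: `make_pam_settings` is a hypothetical helper, it assumes that pam refers to `cluster::pam`, and it only crosses two of the arguments (`K` and `metric`). The element names `fullname`, `callargs` and `fn` are borrowed from the Value section for readability; the name format and the ordering of the setups are assumptions.

```r
library(cluster)  # provides pam()

# Hypothetical helper (NOT part of qcluster): builds one closure per
# (K, metric) combination, each wrapping a call to cluster::pam.
make_pam_settings <- function(K, metric) {
  # All combinations of the supplied values; k varies fastest.
  grid <- expand.grid(k = K, metric = metric, stringsAsFactors = FALSE)
  lapply(seq_len(nrow(grid)), function(i) {
    args <- list(k = grid$k[i], metric = grid$metric[i])
    list(
      fullname = paste0("pam_k", args$k, "_", args$metric),  # assumed naming
      callargs = args,
      # The closure remembers 'args' and applies them to any data set.
      fn = function(data) do.call(cluster::pam, c(list(x = data), args))
    )
  })
}

settings <- make_pam_settings(K = c(2, 3),
                              metric = c("euclidean", "manhattan"))
length(settings)                       # one entry per combination
fit <- settings[[1]]$fn(iris[, 1:4])   # run the first setup on example data
table(fit$clustering)
```

Storing the arguments inside a closure keeps each setup self-describing, so competing setups can be run on the same data with a single call per setup.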

Value

An S3 object of class 'qcmethod'. Each element of the list represents a competing method and contains the following objects:

`fullname`	a string identifying the setup.
`callargs`	a list with `pam` function arguments.
`fn`	the function implementing the specified setting. This `fn` function can be executed on the data set. It has two arguments: `data` and `only_params`. `data` is a data matrix or data.frame; `only_params` is logical. If `only_params==FALSE` (default), `fn` returns the object returned by `pam`. If `only_params==TRUE`, `fn` returns only the cluster parameters (proportions, mean, and cov; see clust2params).
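To make "cluster parameters" concrete, the sketch below computes proportions, means and covariance matrices from a hard partition using base R only. It is an illustration, not the code behind clust2params: `partition_params` is a hypothetical helper, and the layout of the result as well as the use of `cov()` (denominator n - 1) are assumptions that may differ from what the package returns.

```r
library(cluster)  # provides pam()

# Hypothetical helper (NOT clust2params): summarises a partition by
# cluster proportions, mean vectors and covariance matrices.
partition_params <- function(data, cluster) {
  x <- as.matrix(data)
  labels <- sort(unique(cluster))
  list(
    # Share of observations assigned to each cluster.
    proportion = as.numeric(table(factor(cluster, levels = labels))) / nrow(x),
    # One column of variable means per cluster.
    mean = sapply(labels, function(g) colMeans(x[cluster == g, , drop = FALSE])),
    # One sample covariance matrix per cluster (assumed n - 1 denominator).
    cov = lapply(labels, function(g) cov(x[cluster == g, , drop = FALSE]))
  )
}

fit <- cluster::pam(iris[, 1:4], k = 2)
params <- partition_params(iris[, 1:4], fit$clustering)
params$proportion
dim(params$mean)
```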

References

Coraggio, Luca, and Pietro Coretto (2023). Selecting the Number of Clusters, Clustering Models, and Algorithms. A Unifying Approach Based on the Quadratic Discriminant Score. Journal of Multivariate Analysis, Vol. 196(105181), pp. 1-20, doi:10.1016/j.jmva.2023.105181

Examples

# load the package providing mset_pam() and the 'banknote' data
library(qcluster)

# 'pam' settings combining number of clusters K={2,3}, and dissimilarities {euclidean, manhattan}
A <- mset_pam(K = c(2,3), metric = c("euclidean", "manhattan"))
   
# select setup 1: K=2, metric = "euclidean"
m <- A[[1]]
print(m)

      
# cluster with the method set in 'm'
data("banknote")
dat  <- banknote[-1]
fit1 <- m$fn(dat)   
fit1
class(fit1)


# if only cluster parameters are needed
fit1b <- m$fn(dat, only_params = TRUE)   
fit1b

qcluster documentation built on April 3, 2025, 6:16 p.m.