mset_gmix: Generates Methods Settings for Gaussian Mixture Model-Based...

mset_gmix    R Documentation

Generates Methods Settings for Gaussian Mixture Model-Based Clustering

Description

The function generates a software abstraction of a list of clustering models implemented through a set of tuned methods and algorithms. In particular, it generates a list of gmix-type functions, each combining model tuning parameters and other algorithmic settings. The generated functions are ready to be called on the data set.

Usage

mset_gmix(
   K = seq(10),
   init = "kmed",
   erc = c(1, 50, 1000),
   iter.max = 1000,
   tol = 1e-8,
   init.nstart = 25, 
   init.iter.max = 30,
   init.tol = tol)

Arguments

K

a vector/list, specifies the number of clusters.

init

a vector, contains the settings of the init parameter of gmix.

erc

a vector/list, contains the settings of the erc parameter of gmix.

iter.max

an integer vector, contains the settings of the iter.max parameter of gmix.

tol

a vector/list, contains the settings of the tol parameter of gmix.

init.nstart

an integer vector, contains the settings of the init.nstart parameter of gmix.

init.iter.max

an integer vector, contains the settings of the init.iter.max parameter of gmix.

init.tol

a vector/list, contains the settings of the init.tol parameter of gmix.

Details

The function produces functions for fitting competing clustering methods based on several Gaussian mixture model specifications. It is a specialized version of the more general function mset_user. In particular, it produces a list of gmix functions, each corresponding to a specific setup in terms of both model hyper-parameters (e.g. the number of clusters, the eigenvalue ratio constraint, etc.) and the algorithm's control parameters (e.g. the type of initialization, the maximum number of iterations, etc.). See gmix for a detailed description of the role of each argument and its data type.
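
To see how a few argument vectors can translate into several setups, the sketch below enumerates combinations with base R's expand.grid. It assumes that the supplied values are fully crossed, which the examples on this page suggest but the text does not state explicitly. The row order is that of expand.grid and is not claimed to match the order of the list returned by mset_gmix.

```r
# Illustration only: base R, no qcluster calls.
# Assumption: every value of K is paired with every value of erc and init.
K    <- c(2, 3)
erc  <- c(1, 10)
init <- "kmed"

grid <- expand.grid(K = K, erc = erc, init = init, stringsAsFactors = FALSE)
print(grid)

# Number of candidate setups under the full-crossing assumption
n_setups <- nrow(grid)
n_setups
```

Under this assumption, the number of setups is the product of the lengths of the argument vectors, so long vectors in several arguments multiply quickly.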

Value

An S3 object of class 'qcmethod'. Each element of the list represents a competing method and contains the following objects:

fullname

a string identifying the setup.

callargs

a list with gmix function arguments.

fn

the function implementing the specified setting. This fn function can be executed on the data set. It has two arguments: data and only_params. data is a data matrix or data.frame; only_params is logical. If only_params==FALSE (default), fn returns the object returned by gmix. If only_params==TRUE, fn returns only the cluster parameters (proportions, mean, and cov; see clust2params).
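
The element structure described above (a label, a list of stored arguments, and a function that closes over them) can be mimicked in base R. The sketch below is hypothetical and is not qcluster code: make_method is an invented helper, stats::kmeans stands in for gmix, and the names of the returned parameters are illustrative rather than those produced by clust2params.

```r
# Hypothetical sketch of the element structure; NOT the qcluster implementation.
# stats::kmeans stands in for gmix so that the example runs with base R only.
make_method <- function(K, nstart = 5) {
  callargs <- list(centers = K, nstart = nstart)   # stored tuning settings
  fn <- function(data, only_params = FALSE) {
    x   <- as.matrix(data)
    fit <- do.call(stats::kmeans, c(list(x = x), callargs))
    if (!only_params) return(fit)                  # full fitted object
    # Otherwise reduce the fit to per-cluster parameters
    cl <- fit$cluster
    list(
      proportion = as.numeric(table(factor(cl, levels = seq_len(K)))) / nrow(x),
      mean       = t(fit$centers),                 # p x K matrix
      cov        = sapply(seq_len(K),
                          function(k) cov(x[cl == k, , drop = FALSE]),
                          simplify = "array")      # p x p x K array
    )
  }
  list(fullname = paste0("K=", K), callargs = callargs, fn = fn)
}

set.seed(1)
m   <- make_method(K = 3)
par <- m$fn(iris[, 1:4], only_params = TRUE)
str(par)
```

Storing the settings inside the closure means each element can later be called with the data alone, which is the calling pattern this page describes for fn.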

References

Coraggio, Luca, and Pietro Coretto (2023). Selecting the Number of Clusters, Clustering Models, and Algorithms. A Unifying Approach Based on the Quadratic Discriminant Score. Journal of Multivariate Analysis, Vol. 196(105181), pp. 1-20, doi:10.1016/j.jmva.2023.105181

See Also

gmix, mset_user, bqs

Examples

# 'gmix' settings combining number of clusters K={2,3} and eigenvalue
# ratio constraints {1,10}
A <- mset_gmix(K = c(2,3), erc = c(1,10))
   
# select setup 1: K=2, erc = 1, init = "kmed"
ma1 <- A[[1]]
print(ma1)

# fit A[[1]] on banknote data
data("banknote")
dat  <- banknote[-1]
fit1 <- ma1$fn(dat)   
fit1

# if only cluster parameters are needed
fit1b <- ma1$fn(dat, only_params = TRUE)   
fit1b

   
# include a custom initialization, see also help('gmix')
compute_init <- function(data, K){
  cl  <- kmeans(data, K, nstart=1, iter.max=10)$cluster
  W   <- sapply(seq(K), function(x) as.numeric(cl==x))
  return(W)
}

# generate methods settings 
B <- mset_gmix(K = c(2,3), erc = c(1,10), init=c(compute_init, "kmed"))


# select setup 2: K=2, erc=10, init = compute_init
mb2  <- B[[2]]
fit2 <- mb2$fn(dat)   
fit2
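
The custom initializer above returns a matrix W with one row per observation and one column per cluster, where entry (i, k) is 1 when observation i is assigned to cluster k and 0 otherwise. The base R sketch below checks that shape on the built-in iris measurements, which are used here only as stand-in data. Whether gmix imposes further requirements on such a matrix is not stated on this page; see help('gmix').

```r
# Same initializer as in the example above
compute_init <- function(data, K){
  cl <- kmeans(data, K, nstart = 1, iter.max = 10)$cluster
  W  <- sapply(seq(K), function(x) as.numeric(cl == x))
  return(W)
}

set.seed(42)
x <- iris[, 1:4]            # stand-in data: 150 observations, 4 variables
W <- compute_init(x, K = 3)

dim(W)                      # observations x clusters
all(rowSums(W) == 1)        # each observation belongs to exactly one cluster
colSums(W)                  # cluster sizes implied by the initial partition
```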

qcluster documentation built on April 3, 2025, 6:16 p.m.