kluster_eval: function to perform evaluation analysis on kluster...
In hestiri/kluster: A package for scalable approximation of the number of clusters


View source: R/kluster_eval.R

If the user does not specify an algorithm, the function runs the kluster implementations of all cluster number approximation algorithms and returns data for evaluating the best algorithms as well as their processing time. The actual number of clusters must be provided so that the function can calculate the approximation error.

kluster_eval(data, clusters, iter_sim = 1, iter_klust, smpl, algorithm = "Default", cluster = FALSE)

`data`
`clusters`	the known number of clusters, used for calculating error. This is required by this function. If you don't know the number of clusters, use the 'kluster' function instead.
`iter_sim`	number of simulation iterations, default at 1
`iter_klust`	number of iterations for clustering with sample_n size x
`smpl`	size of the sample_n to be taken with replacement out of data
`algorithm`	select analysis algorithm from BIC, PAMK, CAL, and AP. "Default" returns results from all available algorithms.
`cluster`	if TRUE, also performs clustering, which takes much longer. Not available for now.
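
The resampling that `iter_klust` and `smpl` describe can be sketched in base R. This is only an illustration under stated assumptions, not the package's code: the toy data frame `toy` and the helper `draw_samples` are hypothetical, and base `sample()` stands in for whatever sampler kluster uses internally.

```r
# Hypothetical helper (not part of kluster): draw `iter_klust` samples of
# `smpl` rows each, with replacement, from a data frame.
draw_samples <- function(data, iter_klust, smpl) {
  lapply(seq_len(iter_klust), function(i) {
    idx <- sample(nrow(data), size = smpl, replace = TRUE)
    data[idx, , drop = FALSE]
  })
}

set.seed(1)
toy <- data.frame(x = rnorm(50), y = rnorm(50))  # made-up data for illustration

# Because rows are drawn with replacement, smpl may exceed nrow(toy).
samples <- draw_samples(toy, iter_klust = 5, smpl = 100)
length(samples)     # number of samples drawn
nrow(samples[[1]])  # rows in each sample
```

Each element of `samples` would then be passed to a cluster number approximation algorithm, giving one estimate per iteration.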

returns the following values:

`sim`	For the selected algorithm, returns both the most frequent and the average approximated number of clusters produced by the kluster procedure, along with processing time and error
`m_bic_k,m_cal_k,m_ap_k,m_pam_k`	the average approximated number of clusters for each selected algorithm
`f_bic_k,f_cal_k,f_ap_k,f_pam_k`	the most frequent approximated number of clusters for each selected algorithm
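
To make the difference between the "average" and "most frequent" values concrete, here is a minimal base R sketch. The estimates in `k_hat` are invented for illustration, and the error definition (signed difference from the true number of clusters) is an assumption; the page does not say how kluster_eval computes error.

```r
# Hypothetical estimates of k from repeated runs of one algorithm.
k_hat <- c(2, 3, 2, 2, 4, 2, 3)
true_k <- 2  # the known number of clusters (the `clusters` argument)

# Average approximated number of clusters (analogous to the m_* values).
m_k <- mean(k_hat)

# Most frequent approximated number of clusters (analogous to the f_* values);
# which.max() picks the first mode if there is a tie.
counts <- table(k_hat)
f_k <- as.numeric(names(counts)[which.max(counts)])

# Assumed error definition: signed difference from the true k.
err_mean <- m_k - true_k
err_freq <- f_k - true_k
```

The average can be fractional and is pulled by outlying estimates, while the most frequent value is always one of the estimates actually produced.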


dat = read.csv("data/Breast_Cancer_Wisconsin.csv")
## evaluating kluster with the default algorithm setting (all algorithms) and keeping the `sim` summary:
k = kluster_eval(data = dat[,c("area_mean","texture_mean")], clusters = 2, iter_sim = 1, iter_klust = 100, smpl = 100)$sim