synthetic_data: Generating Point-level Data Having Several Groups
In HCV: Hierarchical Clustering from Vertex-Links

synthetic_data

R Documentation

Description

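Generates synthetic point-level data with several groups, following the method proposed by Lin et al. (2005). Each sampled point has attributes in a feature domain and in a geometry domain.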

Usage

synthetic_data(k, f, r, n, feature, geometry, homogeneity = TRUE)

Arguments

`k`	integer specifying the number of groups.
`f`	positive number controlling the concentration of generated samples toward large groups.
`r`	positive number controlling the variance of individual attributes on the feature domain.
`n`	integer specifying the total number of sampled points.
`feature`	integer specifying the number of attributes for the feature domain.
`geometry`	integer specifying the number of attributes for the geometry domain.
`homogeneity`	logical indicating whether to force the centers of the feature domain to be the same as those of the geometry domain. Default is TRUE.

Value

A list containing two matrices and a vector of labels. One matrix holds the feature domain and the other holds the geometry domain; each contains the n sampled points. The vector of labels indicates the cluster to which each sampled point belongs.

Author(s)

ShengLi Tzeng and Hao-Yun Hsu.

References

Lin, C. R., Liu, K. H., and Chen, M. S. (2005). Dual clustering: integrating data clustering over optimization and constraint domains. IEEE Transactions on Knowledge and Data Engineering, 17(5), 628-637.

Examples

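set.seed(0)  # make the simulated data reproducible
# 3 groups, 100 points, 2 feature attributes and 2 geometry attributes
pcase <- synthetic_data(k = 3, f = 30, r = 0.02, n = 100, feature = 2, geometry = 2)
oldpar <- par(no.readonly = TRUE)  # save graphics settings for later restoration
par(mfrow = c(1, 2))               # two panels side by side
labcolor <- (pcase$labels + 1) %% 3 + 1  # map group labels to colour indices 1 to 3
plot(pcase$feat, col = labcolor, pch = 19, xlab = 'First attribute',
  ylab = 'Second attribute', main = 'Feature domain')
plot(pcase$geo, col = labcolor, pch = 19, xlab = 'First attribute',
  ylab = 'Second attribute', main = 'Geometry domain')
par(oldpar)  # restore the original graphics settings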
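
Before plotting, it can help to look at what the function returned. The following is a minimal sketch rather than part of the package's own examples: it assumes the HCV package is installed, the object name `sim` is arbitrary, and it relies only on the `labels` accessor already used above plus base R functions.

```r
library(HCV)  # assumption: the HCV package is installed

set.seed(0)
# Same settings as the documented example, written with named arguments
sim <- synthetic_data(k = 3, f = 30, r = 0.02, n = 100,
                      feature = 2, geometry = 2)

str(sim)           # overall structure: two matrices and a vector of labels
table(sim$labels)  # number of sampled points assigned to each group
```

The counts from `table()` show how unevenly the points are spread across groups, which is what the argument `f` is documented to control.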

HCV documentation built on March 18, 2022, 6:01 p.m.