DatabionicSwarmClustering: Databionic Swarm (DBS) Clustering and Visualization


View source: R/DatabionicSwarmClustering.R

Description

Swarm-based clustering that exploits self-organization, emergence, swarm intelligence, and game theory, as published in [Thrun/Ultsch, 2021].

Usage

DatabionicSwarmClustering(DataOrDistances, ClusterNo = 0,
  StructureType = TRUE, DistancesMethod = NULL,
  PlotTree = FALSE, PlotMap = FALSE, PlotIt = FALSE, Data)

Arguments

DataOrDistances

Either a nonsymmetric [1:n,1:d] numerical matrix of the dataset to be clustered, consisting of n cases of d-dimensional data points (every case has d attributes, variables, or features),

or

a symmetric [1:n,1:n] distance matrix, e.g. as.matrix(dist(Data,method))

ClusterNo

Number of clusters. If zero, the topographic map is plotted; the number of valleys equals the number of clusters.

StructureType

Either TRUE or FALSE; has to be tested against the visualization. If colored points of clusters are divided by mountain ranges, the parameter is incorrect.

DistancesMethod

Optional; if a data matrix is given, a non-Euclidean distance can be selected.

PlotTree

Optional, if TRUE: dendrogram is plotted.

PlotMap

Optional, if TRUE: topographic map is plotted if GeneralizedUmatrix is installed.

PlotIt

Default: FALSE. If TRUE and the dataset has [1:n,1:d] dimensions, the first three dimensions of the dataset are plotted as three-dimensional data points colored according to the clustering stored in Cls.

Data

[1:n,1:d] data matrix in the case that DataOrDistances is missing and partial matching does not work.
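
The two accepted forms of DataOrDistances differ in shape and symmetry. The following is a minimal base-R sketch of that difference, not code from the package; the toy matrix and the choice of "manhattan" as a distance method are illustrative assumptions.

```r
# Toy data: n = 6 cases, d = 3 features (illustrative values only)
set.seed(1)
Data <- matrix(rnorm(18), nrow = 6, ncol = 3)

# Form 1: a [1:n, 1:d] data matrix, in general not square
dim(Data)

# Form 2: a symmetric [1:n, 1:n] distance matrix built from the same data
Distances <- as.matrix(dist(Data, method = "manhattan"))
dim(Distances)
isSymmetric(Distances)

# Either object could then be given as the first argument, e.g.
# DatabionicSwarmClustering(Distances, ClusterNo = 2)
```

Precomputing the distance matrix this way is one option when a non-Euclidean distance is wanted; the DistancesMethod argument is the other.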

Details

Unlike the DatabionicSwarm package, this function does not let the user first project the data and then test the Boolean parameter defining the type of structure. This makes it an inappropriate approach for exploratory data analysis.

Instead, this function is implemented for automatic benchmarking, because in that setting nobody will inspect many trials with one visualization per trial.

To perform a clustering exploratively (in the sense that no prior clustering is given for evaluation purposes), please use the DatabionicSwarm package directly and read the vignette there. Like k-means, Databionic swarm is a stochastic algorithm, meaning that the clustering and visualization may change between trials.
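
Because results may change between trials and the labels in Cls are arbitrary, two runs are better compared through a contingency table than by testing label equality. The following is a minimal base-R sketch; the two label vectors are made-up stand-ins for the Cls output of two trials, not real results.

```r
# Hypothetical label vectors standing in for Cls from two separate trials
cls_trial1 <- c(1, 1, 1, 2, 2, 2, 3, 3)
cls_trial2 <- c(2, 2, 2, 3, 3, 3, 1, 1)  # same partition, labels permuted

# Element-wise comparison is misleading when labels are arbitrary
label_agreement <- mean(cls_trial1 == cls_trial2)

# The partitions agree if every row and every column of the
# contingency table has exactly one non-zero cell
tab <- table(cls_trial1, cls_trial2)
same_partition <- all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
same_partition
```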

Value

List of

Cls

1:n numerical vector of numbers defining the classification as the main output of the clustering algorithm for the n cases of data. It has k unique numbers representing the arbitrary labels of the clustering.

Object

List of further output of DBS
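
A short sketch of how the returned list might be inspected. The list `res` below is a hand-built stand-in containing the two documented elements, so the snippet runs without the package; the label values are invented.

```r
# Hand-built stand-in for the returned list (values are invented)
res <- list(
  Cls    = c(1, 1, 2, 2, 2, 3, 3, 1),  # one label per case
  Object = list()                       # further DBS output would sit here
)

n <- length(res$Cls)           # number of cases
k <- length(unique(res$Cls))   # number of distinct cluster labels
sizes <- table(res$Cls)        # number of cases per cluster label
sizes
```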

Note

The current implementation is not efficient enough to cluster more than N=4000 cases, as in that case it takes longer than a day to obtain a result.
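
Given this limit, a caller may want to check the number of cases before starting a run. The following is a hedged base-R sketch; the helper name `check_dbs_size` and its `max_cases` argument are hypothetical and not part of FCPS.

```r
# Hypothetical helper (not part of FCPS): warn before a run that,
# according to the note above, is likely to take very long
check_dbs_size <- function(DataOrDistances, max_cases = 4000) {
  n <- nrow(DataOrDistances)
  ok <- n <= max_cases
  if (!ok) {
    warning("n = ", n, " cases exceeds ", max_cases,
            "; DBS clustering may take longer than a day.")
  }
  ok
}

# Usage with a small placeholder matrix of 300 cases
small <- matrix(0, nrow = 300, ncol = 3)
check_dbs_size(small)
```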

Author(s)

Michael Thrun

References

[Thrun/Ultsch, 2021] Thrun, M. C., and Ultsch, A.: Swarm Intelligence for Self-Organized Clustering, Artificial Intelligence, Vol. 290, pp. 103237, doi: 10.1016/j.artint.2020.103237, 2021.

[Thrun/Ultsch, 2021] Thrun, M. C., & Ultsch, A.: Swarm Intelligence for Self-Organized Clustering (Extended Abstract), in Bessiere, C. (Ed.), 29th International Joint Conference on Artificial Intelligence (IJCAI), Vol. IJCAI-20, pp. 5125–5129, doi: 10.24963/ijcai.2020/720, Yokohama, Japan, Jan., 2021.

See Also

Pswarm, DBSclustering, GeneratePswarmVisualization

Examples

# Generate random but small non-structured data set
data = cbind(
  sample(1:100, 300, replace = TRUE),
  sample(1:100, 300, replace = TRUE),
  sample(1:100, 300, replace = TRUE)
)
# Make sure there are no structures
# (sample size is small and still could generate structures randomly)
if (requireNamespace('DatabionicSwarm', quietly = TRUE)) {
  Data = DatabionicSwarm::RobustNormalization(data, Centered = TRUE)
  #DataVisualizations::Plot3D(Data)

  # No structures are visible
  # Topographic map looks like "egg carton"
  # with every point in its own valley
  Cls = DatabionicSwarmClustering(Data, 0, PlotMap = TRUE)
} else {
  # only for testing purposes of CRAN!
  # in case CRAN tests with no suggested packages available
  # please always use some kind of standardization!
  Cls = DatabionicSwarmClustering(data, 0, PlotMap = TRUE)
}


# Distance based cluster structures
# 7 valleys are visible, thus ClusterNo=7

data(Hepta)
#DataVisualizations::Plot3D(Hepta$Data)

Cls = DatabionicSwarmClustering(Hepta$Data, 0, PlotMap = TRUE)


# Entangled, complex, and non-linearly separable structures
## Not run: 
#takes too long for CRAN tests
data(Chainlink)
#DataVisualizations::Plot3D(Chainlink$Data)

# 2 valleys are visible, thus ClusterNo=2
Cls = DatabionicSwarmClustering(Chainlink$Data, 0, PlotMap = TRUE)

# Experiment with parameter StructureType only
# reveals that clustering is appropriate
# if StructureType=FALSE
Cls = DatabionicSwarmClustering(Chainlink$Data,
                                2,
                                StructureType = FALSE,
                                PlotMap = TRUE)

# Here clusters (colored points)
# are not separated by valleys
Cls = DatabionicSwarmClustering(Chainlink$Data,
                                2,
                                StructureType = TRUE,
                                PlotMap = TRUE)

## End(Not run)

FCPS documentation built on July 8, 2021, 1:06 a.m.