With the help of the TraMineR package, CLARA clustering makes it possible to cluster large datasets.
The main objective is to cluster state sequences, using the "LCS" distance by default, and to find the best partition into N clusters.
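The "LCS" dissimilarity referred to above can be computed directly with TraMineR's seqdist. The following is a minimal sketch, assuming the mvad data shipped with TraMineR and an arbitrarily chosen subset of ten sequences. It only illustrates the kind of pairwise matrix a CLARA procedure would compute on a subsample instead of on the whole dataset.

```r
library(TraMineR)

data(mvad)
mvad.scode <- c("EM", "FE", "HE", "JL", "SC", "TR")
mvad.seq <- seqdef(mvad, 17:86, states = mvad.scode, xtstep = 6)

# Pairwise LCS dissimilarities on a small subset
# (the first 10 sequences, chosen arbitrarily for illustration)
sub.seq <- mvad.seq[1:10, ]
d <- seqdist(sub.seq, method = "LCS")

dim(d)            # one row and one column per sequence in the subset
all(diag(d) == 0) # a sequence is at distance 0 from itself
```

The cost of the full matrix grows quadratically with the number of sequences, which is why working on subsets matters for large datasets.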
clara_clust(
  data,
  nb_sample = 100,
  size_sample = 40 + 2 * nb_cluster,
  nb_cluster = 4,
  distargs = list(method = "LCS"),
  plot = FALSE,
  find_best_method = "Distance",
  with.diss = TRUE,
  cores = detectCores() - 1
)
data | The dataset to use. For sequence data, create the object with seqdef (from the TraMineR package).
nb_sample | The number of subsets to draw and test.
size_sample | The size of each subset.
nb_cluster | The number of medoids, i.e. the number of clusters.
distargs | List of arguments defining the distance method to apply (see the function seqdist in the TraMineR package).
plot | Logical. Whether to plot the clustering result.
find_best_method | Method used to select the best subset: "Distance" uses the mean distance and "DB" uses the Davies-Bouldin index.
with.diss | Logical. Whether the distance matrix should be returned.
cores | Number of cores to use for parallel computation.
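The roles of nb_sample, size_sample, nb_cluster and find_best_method can be illustrated with a generic CLARA-style loop. This is a sketch under stated assumptions, not the implementation behind clara_clust: it uses pam from the cluster package, a hypothetical helper davies_bouldin that computes a medoid-based variant of the Davies-Bouldin index, and, for brevity only, a full distance matrix computed up front (which a real CLARA run avoids).

```r
library(TraMineR)
library(cluster)

data(mvad)
mvad.seq <- seqdef(mvad, 17:86, xtstep = 6)

# For brevity only: the full matrix is computed once here.
# A real CLARA procedure computes distances per subsample instead.
D <- seqdist(mvad.seq, method = "LCS")

set.seed(1)
nb_cluster  <- 4
nb_sample   <- 10
size_sample <- 40 + 2 * nb_cluster
n <- nrow(D)

# Hypothetical helper: medoid-based variant of the Davies-Bouldin index
davies_bouldin <- function(D, medoids, assign) {
  k <- length(medoids)
  # Mean distance of each cluster's members to its medoid
  scatter <- sapply(seq_len(k), function(j) mean(D[assign == j, medoids[j]]))
  # Worst (largest) scatter-to-separation ratio for each cluster
  ratios <- sapply(seq_len(k), function(i) {
    max(sapply(setdiff(seq_len(k), i), function(j) {
      (scatter[i] + scatter[j]) / D[medoids[i], medoids[j]]
    }))
  })
  mean(ratios)
}

results <- lapply(seq_len(nb_sample), function(s) {
  idx <- sample(n, size_sample)                      # draw one subset
  fit <- pam(as.dist(D[idx, idx]), k = nb_cluster)   # PAM on the subset only
  medoids <- idx[fit$id.med]                         # medoid indices in the full data
  # Assign every observation to its nearest medoid
  assign <- apply(D[, medoids, drop = FALSE], 1, which.min)
  list(medoids = medoids, assign = assign,
       mean_dist = mean(D[cbind(seq_len(n), medoids[assign])]),
       db = davies_bouldin(D, medoids, assign))
})

# Select the best subset under each criterion (lower is better for both)
best_distance <- results[[which.min(sapply(results, `[[`, "mean_dist"))]]
best_db       <- results[[which.min(sapply(results, `[[`, "db"))]]
```

The two criteria need not agree: the mean distance rewards compact clusters overall, while the Davies-Bouldin index also penalizes medoids that lie close to each other.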
Returns an object of class clara_seq.
# Creating sequences
library(TraMineR)
data(mvad)
mvad.labels <- c("employment", "further education", "higher education", "joblessness", "school", "training")
mvad.scode <- c("EM", "FE", "HE", "JL", "SC", "TR")
mvad.seq <- seqdef(mvad, 17:86, states = mvad.scode, labels = mvad.labels, xtstep = 6)
# CLARA clustering
my_cluster <- clara_clust(mvad.seq, nb_cluster = 4, nb_sample = 10, size_sample = 20, with.diss = TRUE)
# CLARA clustering with the Davies-Bouldin method
my_cluster <- clara_clust(mvad.seq, nb_cluster = 4, nb_sample = 10, size_sample = 20, with.diss = TRUE, find_best_method = "DB")