View source: R/DSC_evoStream.R
DSC_evoStream | R Documentation |
Micro Clusterer with reclustering. Stream clustering algorithm based on evolutionary optimization.
DSC_evoStream(
formula = NULL,
r,
lambda = 0.001,
tgap = 100,
k = 2,
crossoverRate = 0.8,
mutationRate = 0.001,
populationSize = 100,
initializeAfter = 2 * k,
incrementalGenerations = 1,
reclusterGenerations = 1000
)
formula |
|
r | radius threshold for micro-cluster assignment
lambda | decay rate
tgap | time interval between outlier detection and clean-up
k | number of macro-clusters
crossoverRate | crossover rate for the evolutionary algorithm
mutationRate | mutation rate for the evolutionary algorithm
populationSize | number of solutions that the evolutionary algorithm maintains
initializeAfter | number of micro-clusters required for the initialization of the evolutionary algorithm
incrementalGenerations | number of EA generations performed after each observation
reclusterGenerations | number of EA generations performed during reclustering
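To make the roles of r and lambda concrete, the following is a minimal base-R sketch of one online update step of a radius-based micro-clusterer. It is not the package's implementation: the helpers new_micro() and update_micro() are hypothetical, and the fading function 2^(-lambda * dt) and the weighted center update are assumptions made for illustration.

```r
# Hypothetical helpers for illustration only; not part of the stream package.
new_micro <- function(d) {
  list(centers = matrix(numeric(0), ncol = d), weights = numeric(0), t = 0)
}

update_micro <- function(mc, x, t, r, lambda = 0.001) {
  # Fade all weights for the elapsed time (assumed fading: 2^(-lambda * dt)).
  mc$weights <- mc$weights * 2^(-lambda * (t - mc$t))
  mc$t <- t
  if (length(mc$weights) > 0) {
    # Euclidean distance from x to every micro-cluster center.
    d <- sqrt(rowSums(sweep(mc$centers, 2, x)^2))
    i <- which.min(d)
    if (d[i] <= r) {
      # x falls within radius r: absorb it into the nearest micro-cluster.
      mc$centers[i, ] <- (mc$centers[i, ] * mc$weights[i] + x) / (mc$weights[i] + 1)
      mc$weights[i] <- mc$weights[i] + 1
      return(mc)
    }
  }
  # No center within r: open a new micro-cluster at x.
  mc$centers <- rbind(mc$centers, x, deparse.level = 0)
  mc$weights <- c(mc$weights, 1)
  mc
}

mc <- new_micro(2)
pts <- rbind(c(0, 0), c(0.02, 0), c(1, 1))
for (t in seq_len(nrow(pts))) mc <- update_micro(mc, pts[t, ], t, r = 0.05)
n_micro <- nrow(mc$centers)  # the first two points share one micro-cluster
```

Under the assumed fading function, a weight halves after 1 / lambda time steps, so the default lambda = 0.001 corresponds to a half-life of 1000 observations.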
The online component uses a simplified version of DBSTREAM to
generate micro-clusters. The micro-clusters are then incrementally
reclustered using an evolutionary algorithm. Evolutionary algorithms create
slight variations by combining and randomly modifying existing solutions.
Iteratively selecting the better solutions creates an evolutionary pressure
that improves the clustering over time. Because the evolutionary algorithm is
incremental, it can be applied between observations, e.g., in the
idle time of the stream. Whenever there is idle time, we can call the
recluster() function of the reference class to improve the
macro-clusters (see example). The evolutionary algorithm can also be applied
as a traditional reclustering step, or as a combination of both. In addition,
this implementation allows a fixed number of generations to be evaluated
after each observation (see incrementalGenerations).
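The idea of combining, mutating, and selecting solutions can be sketched in base R as follows. This is a toy illustration under stated assumptions, not the algorithm used by DSC_evoStream: the functions ssq() and evolve() are hypothetical, the fitness (weighted sum of squared distances), the uniform crossover, the Gaussian mutation, and the replace-the-worst selection are all choices made here for brevity.

```r
set.seed(42)

# Weighted sum of squared distances from each micro-cluster center
# to its nearest macro-cluster center (lower is better).
ssq <- function(sol, centers, weights) {
  d2 <- sapply(seq_len(nrow(sol)),
               function(j) rowSums(sweep(centers, 2, sol[j, ])^2))
  sum(weights * apply(d2, 1, min))
}

# One generation: pick two parents, recombine, mutate, and replace the
# worst solution if the child is better.
evolve <- function(pop, centers, weights,
                   crossoverRate = 0.8, mutationRate = 0.001) {
  fit <- sapply(pop, ssq, centers = centers, weights = weights)
  parents <- pop[sample(length(pop), 2, prob = 1 / (fit + 1e-12))]
  child <- parents[[1]]
  if (runif(1) < crossoverRate) {
    # Uniform crossover: take each coordinate from either parent.
    mask <- matrix(runif(length(child)) < 0.5, nrow = nrow(child))
    child[mask] <- parents[[2]][mask]
  }
  # Mutation: perturb each coordinate with probability mutationRate.
  mut <- matrix(runif(length(child)) < mutationRate, nrow = nrow(child))
  child[mut] <- child[mut] + rnorm(sum(mut), sd = 0.1)
  worst <- which.max(fit)
  if (ssq(child, centers, weights) < fit[worst]) pop[[worst]] <- child
  pop
}

# Toy micro-clusters: two groups of 20 centers each, all with weight 1.
centers <- rbind(matrix(rnorm(40, 0, 0.1), ncol = 2),
                 matrix(rnorm(40, 3, 0.1), ncol = 2))
weights <- rep(1, nrow(centers))

# Each solution is a set of k = 2 candidate macro-cluster centers.
pop <- replicate(20, centers[sample(nrow(centers), 2), , drop = FALSE],
                 simplify = FALSE)

best_before <- min(sapply(pop, ssq, centers = centers, weights = weights))
for (g in 1:200) pop <- evolve(pop, centers, weights)
best_after <- min(sapply(pop, ssq, centers = centers, weights = weights))
```

Because a child only ever replaces the worst solution when it improves on it, the best fitness in the population cannot get worse from one generation to the next. This is what makes it safe to run a few generations whenever idle time is available.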
Matthias Carnein Matthias.Carnein@uni-muenster.de
Carnein M. and Trautmann H. (2018), "evoStream - Evolutionary Stream Clustering Utilizing Idle Times", Big Data Research.
Other DSC_Micro: DSC_BICO(), DSC_BIRCH(), DSC_DBSTREAM(), DSC_DStream(), DSC_Micro(), DSC_Sample(), DSC_Window()
Other DSC_TwoStage: DSC_DBSTREAM(), DSC_DStream(), DSC_TwoStage()
library(stream)

## create a stream with 3 Gaussian clusters and store 500 points in memory
stream <- DSD_Gaussians(k = 3, d = 2) %>% DSD_Memory(n = 500)
## init evoStream
evoStream <- DSC_evoStream(r = 0.05, k = 3,
incrementalGenerations = 1, reclusterGenerations = 500)
## insert observations
update(evoStream, stream, n = 500)
## micro clusters
get_centers(evoStream, type = "micro")
## micro weights
get_weights(evoStream, type = "micro")
## macro clusters
get_centers(evoStream, type = "macro")
## macro weights
get_weights(evoStream, type = "macro")
## plot result
reset_stream(stream)
plot(evoStream, stream)
## If there is idle time, we can evaluate additional generations.
## This can be done at any time, including between observations.
## By default, 1 generation is evaluated after each observation and
## 1000 generations during reclustering; here we run 500 generations.
evoStream$RObj$recluster(500)
## plot improved result
reset_stream(stream)
plot(evoStream, stream)
## get assignment of micro to macro clusters
microToMacro(evoStream)