| KSWIN | R Documentation |
Implements KSWIN (Kolmogorov-Smirnov Windowing), a change detector that applies the two-sample Kolmogorov-Smirnov (KS) test to a sliding window of streaming data. The KS test is non-parametric: it compares two samples and assesses whether they come from the same distribution.
KSWIN detects changes in the underlying distribution of a data stream. It is useful when data properties may evolve over time, because detecting a change early lets subsequent data processing react to it.
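The windowed comparison described above can be sketched in base R. This is a minimal illustration, not the package's internal code: the helper `kswin_step()` is hypothetical, and it assumes that the most recent `stat_size` values are compared against a random sample of the same size drawn from the older part of the window, that a change is flagged when the p-value is at or below `alpha`, and that the window is then cut back to the recent values. Only `stats::ks.test()` from base R is used.

```r
# Hypothetical helper: one update step of a KS-based sliding-window detector.
# Assumption: illustrative only, not the implementation behind KSWIN$add_element().
kswin_step <- function(window, x, alpha = 0.005,
                       window_size = 100, stat_size = 30) {
  stopifnot(stat_size < window_size)

  # Append the new value and keep at most the last window_size values.
  window <- tail(c(window, x), window_size)

  # No test is run until the window is full.
  if (length(window) < window_size) {
    return(list(window = window, change_detected = FALSE, p_value = NA_real_))
  }

  # Split the window into the newest stat_size values and everything older.
  recent <- tail(window, stat_size)
  older <- head(window, window_size - stat_size)

  # Draw a reference sample from the older part (index-based to avoid the
  # sample() pitfall with length-one vectors).
  idx <- sample.int(length(older), stat_size,
                    replace = length(older) < stat_size)
  reference <- older[idx]

  # Two-sample KS test; warnings about ties are silenced for brevity.
  test <- suppressWarnings(stats::ks.test(reference, recent))
  changed <- test$p.value <= alpha

  # Assumption: after a detection, only the recent values are kept.
  if (changed) window <- recent

  list(window = window, change_detected = changed, p_value = test$p.value)
}

# Usage: a stream whose mean shifts after 200 observations.
set.seed(42)
stream <- c(rnorm(200, mean = 0), rnorm(200, mean = 3))
window <- numeric(0)
detections <- integer(0)
for (i in seq_along(stream)) {
  step <- kswin_step(window, stream[i])
  window <- step$window
  if (step$change_detected) detections <- c(detections, i)
}
print(detections)
```

Because the reference sample is drawn at random, the exact positions reported depend on the seed. A smaller `alpha` makes false alarms less likely at the cost of slower detection.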
alpha: Significance level for the KS test.
window_size: Total size of the data window used for testing.
stat_size: Number of data points sampled from the window for the KS test.
window: Current data window used for change detection.
change_detected: Boolean flag indicating whether a change has been detected.
p_value: p-value of the most recent KS test.
new(): Initializes the KSWIN detector with the given settings.
KSWIN$new(alpha = 0.005, window_size = 100, stat_size = 30, data = NULL)
alpha: The significance level for the KS test.
window_size: The size of the data window for change detection.
stat_size: The number of samples in the statistical test window.
data: Initial data to populate the window, if provided.
reset(): Resets the internal state of the detector to its initial conditions.
KSWIN$reset()
add_element(): Adds a new element to the data window and updates the detection status based on the KS test.
KSWIN$add_element(x)
x: The new data value to add to the window.
detected_change(): Checks if a change has been detected based on the most recent KS test.
KSWIN$detected_change()
Returns a Boolean indicating whether a change was detected.
clone(): The objects of this class are cloneable with this method.
KSWIN$clone(deep = FALSE)
deep: Whether to make a deep clone.
Christoph Raab, Moritz Heusinger, Frank-Michael Schleif, Reactive Soft Prototype Computing for Concept Drift Streams, Neurocomputing, 2020.
Implementation: https://github.com/scikit-multiflow/scikit-multiflow/blob/a7e316d1cc79988a6df40da35312e00f6c4eabb2/src/skmultiflow/drift_detection/kswin.py
set.seed(123) # Setting a seed for reproducibility
data_part1 <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.7, 0.3))
# Introduce a change in data distribution
data_part2 <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.3, 0.7))
# Combine the two parts
data_stream <- c(data_part1, data_part2)
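The example above only builds the stream. One possible continuation feeds it to the detector using just the methods documented on this page (`KSWIN$new()`, `add_element()`, `detected_change()`). This is a sketch under assumptions: it presumes the package providing `KSWIN` is already attached, and since this page does not name that package, the code checks for the class and skips the loop if it is missing. The stream construction is repeated so the block runs on its own, and `change_points` is an illustrative variable name.

```r
set.seed(123)  # Same seed and stream as in the example above
data_stream <- c(
  sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.7, 0.3)),
  sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.3, 0.7))
)

# Positions at which the detector reports a change (illustrative name).
change_points <- integer(0)

# Assumption: KSWIN is available in the session; otherwise nothing is tested.
if (exists("KSWIN")) {
  detector <- KSWIN$new(alpha = 0.005, window_size = 100, stat_size = 30)
  for (i in seq_along(data_stream)) {
    detector$add_element(data_stream[i])
    if (detector$detected_change()) {
      change_points <- c(change_points, i)
    }
  }
}

print(change_points)
```

With `window_size = 100`, the window first fills at the point where the distribution shifts, so any detection would be expected in the second half of the stream.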