| HDDM_W | R Documentation |
Implements HDDM_W, a drift detection method based on Hoeffding-type bounds and weighted moving averages. HDDM_W is an online, non-parametric detector: it tracks an exponentially weighted moving average (EWMA) of the incoming values and uses a concentration bound (McDiarmid's inequality) to decide whether the mean of the stream has changed significantly.
HDDM_W is suited to streams whose mean may shift over time, such as a sequence of 0/1 prediction errors. Because the EWMA gives more weight to recent values, the detector can react quickly to a change, and it reports a warning zone before a drift is confirmed so that downstream processing can prepare for it.
drift_confidence: Confidence level for detecting a drift (default: 0.001).
warning_confidence: Confidence level for warning detection (default: 0.005).
lambda_option: Decay rate for the EWMA statistic; smaller values give less weight to recent data (default: 0.050).
two_side_option: Boolean flag. TRUE monitors both increments and decrements (two-sided); FALSE monitors increments only (one-sided) (default: TRUE).
total: Container for the EWMA estimator and its bounded conditional sum.
sample1_decr_monitor: First sample monitor for detecting decrements.
sample1_incr_monitor: First sample monitor for detecting increments.
sample2_decr_monitor: Second sample monitor for detecting decrements.
sample2_incr_monitor: Second sample monitor for detecting increments.
incr_cutpoint: Cutpoint for deciding increments.
decr_cutpoint: Cutpoint for deciding decrements.
width: Current width of the window.
delay: Delay count since last reset.
change_detected: Boolean indicating if a change was detected.
warning_detected: Boolean indicating if currently in a warning zone.
estimation: The current estimation of the stream's mean.
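To make the role of lambda_option and the total container more concrete, the following base-R sketch performs the EWMA update one value at a time. It is modeled on the scikit-multiflow implementation linked in the references and does not call this package. The helper name `ewma_update`, the list layout, and the use of a negative estimator to mark an empty sample are assumptions made for illustration.

```r
# Hypothetical helper: one EWMA step for a list with fields
# `estimator` (EWMA of the stream) and `bound_sum` (bounded conditional sum).
# A negative estimator marks an empty sample (assumed convention).
ewma_update <- function(state, value, lambda = 0.05) {
  decay <- 1 - lambda
  if (state$estimator < 0) {
    # First observation: start the average at the observed value
    state$estimator <- value
    state$bound_sum <- 1
  } else {
    # Later observations: blend the new value into the running average
    state$estimator <- lambda * value + decay * state$estimator
    state$bound_sum <- lambda^2 + decay^2 * state$bound_sum
  }
  state
}

state <- list(estimator = -1, bound_sum = 0)
for (x in c(0, 0, 1, 1, 1)) {
  state <- ewma_update(state, x, lambda = 0.5)
}
state$estimator
```

With lambda = 0.5 the estimate moves from 0 to 0.875 after three consecutive ones. A smaller lambda would move more slowly, which matches the description of lambda_option above.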
new(): Initializes the HDDM_W detector with specific parameters.
HDDM_W$new(drift_confidence = 0.001, warning_confidence = 0.005, lambda_option = 0.05, two_side_option = TRUE)
drift_confidence: Confidence level for drift detection.
warning_confidence: Confidence level for issuing warnings.
lambda_option: Decay rate for the EWMA statistic.
two_side_option: Whether to monitor both increases and decreases.
add_element(): Adds a new element to the data stream and updates the detection status.
HDDM_W$add_element(prediction)
prediction: The new data value to add.
SampleInfo(): Provides current information about the monitoring samples, typically used for debugging or monitoring.
HDDM_W$SampleInfo()
reset(): Resets the internal state to initial conditions.
HDDM_W$reset()
detect_mean_increment(): Detects an increment in the mean between two samples based on the provided confidence level.
HDDM_W$detect_mean_increment(sample1, sample2, confidence)
sample1: First sample information, containing EWMA estimator and bounded conditional sum.
sample2: Second sample information, containing EWMA estimator and bounded conditional sum.
confidence: The confidence level used for calculating the bound.
Boolean indicating if an increment in mean was detected.
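The comparison behind detect_mean_increment() can be sketched in base R as follows. This is a minimal illustration modeled on the scikit-multiflow implementation linked in the references, not this package's code. The function name `detect_increment`, the list fields, and the example numbers are hypothetical.

```r
# Hypothetical stand-in for the increment test. Each sample is a list with
# `estimator` (EWMA value) and `bound_sum` (bounded conditional sum).
detect_increment <- function(sample1, sample2, confidence) {
  # An empty sample (negative estimator, assumed convention) cannot signal a change
  if (sample1$estimator < 0 || sample2$estimator < 0) {
    return(FALSE)
  }
  # The bound shrinks as the confidence value grows or the sums get smaller
  bound <- sqrt((sample1$bound_sum + sample2$bound_sum) * log(1 / confidence) / 2)
  (sample2$estimator - sample1$estimator) > bound
}

before <- list(estimator = 0.30, bound_sum = 0.03)
after  <- list(estimator = 0.70, bound_sum = 0.03)
drift_flag   <- detect_increment(before, after, confidence = 0.001)
warning_flag <- detect_increment(before, after, confidence = 0.005)
```

With these illustrative numbers the difference of 0.4 exceeds the bound at the warning level 0.005 (about 0.399) but not at the drift level 0.001 (about 0.455), so `warning_flag` is TRUE while `drift_flag` is FALSE. This is why a warning can be raised before a drift is confirmed.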
monitor_mean_incr(): Monitors the data stream for an increase in the mean based on the set confidence level.
HDDM_W$monitor_mean_incr(confidence)
confidence: The confidence level used to detect changes in the mean.
Boolean indicating if an increase in the mean was detected.
monitor_mean_decr(): Monitors the data stream for a decrease in the mean based on the set confidence level.
HDDM_W$monitor_mean_decr(confidence)
confidence: The confidence level used to detect changes in the mean.
Boolean indicating if a decrease in the mean was detected.
update_incr_statistics(): Updates increment statistics for drift monitoring based on new values and confidence. This method adjusts the cutpoint for increments and updates the monitoring samples.
HDDM_W$update_incr_statistics(value, confidence)
value: The new value to update statistics.
confidence: The confidence level for the update.
update_decr_statistics(): Updates decrement statistics for drift monitoring based on new values and confidence. This method adjusts the cutpoint for decrements and updates the monitoring samples.
HDDM_W$update_decr_statistics(value, confidence)
value: The new value to update statistics.
confidence: The confidence level for the update.
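The cutpoint adjustment described for the update methods can be sketched for the increment side as follows. This is a simplified base-R sketch, loosely modeled on the scikit-multiflow implementation linked in the references; it may differ in detail from this package. The function `update_incr`, the state list, and the initial cutpoint of `Inf` are assumptions made for illustration.

```r
# Hypothetical sketch of the increment-side cutpoint update.
# `total`, `sample1`, `sample2` are lists with `estimator` and `bound_sum`.
update_incr <- function(st, value, confidence, lambda = 0.05) {
  decay <- 1 - lambda
  bound <- sqrt(st$total$bound_sum * log(1 / confidence) / 2)
  if (st$total$estimator + bound < st$cutpoint) {
    # New lowest upper bound: snapshot the stream as the reference sample
    st$cutpoint <- st$total$estimator + bound
    st$sample1 <- st$total
    st$sample2 <- list(estimator = -1, bound_sum = 0)
    st$delay <- 0
  } else {
    # Otherwise accumulate the value into the second (recent) sample
    st$delay <- st$delay + 1
    if (st$sample2$estimator < 0) {
      st$sample2 <- list(estimator = value, bound_sum = 1)
    } else {
      st$sample2$estimator <- lambda * value + decay * st$sample2$estimator
      st$sample2$bound_sum <- lambda^2 + decay^2 * st$sample2$bound_sum
    }
  }
  st
}

st <- list(
  total = list(estimator = 0.2, bound_sum = 0.03),
  sample1 = list(estimator = -1, bound_sum = 0),
  sample2 = list(estimator = -1, bound_sum = 0),
  cutpoint = Inf, delay = 0
)
st <- update_incr(st, value = 0, confidence = 0.001)  # cutpoint is lowered
st <- update_incr(st, value = 1, confidence = 0.001)  # total unchanged, sample2 accumulates
```

In this sketch the first call lowers the cutpoint and stores the current totals as the first sample. The second call leaves the cutpoint alone and starts filling the second sample, which is the pair later compared by the increment test.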
clone(): The objects of this class are cloneable with this method.
HDDM_W$clone(deep = FALSE)
deep: Whether to make a deep clone.
Frías-Blanco I, del Campo-Ávila J, Ramos-Jimenez G, et al. Online and non-parametric drift detection methods based on Hoeffding’s bounds. IEEE Transactions on Knowledge and Data Engineering, 2014, 27(3): 810-823.
Albert Bifet, Geoff Holmes, Richard Kirkby, Bernhard Pfahringer. MOA: Massive Online Analysis; Journal of Machine Learning Research 11: 1601-1604, 2010. Implementation: https://github.com/scikit-multiflow/scikit-multiflow/blob/a7e316d1cc79988a6df40da35312e00f6c4eabb2/src/skmultiflow/drift_detection/hddm_w.py
set.seed(123) # Setting a seed for reproducibility
data_part1 <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.7, 0.3))
# Introduce a change in data distribution
data_part2 <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.3, 0.7))
# Combine the two parts
data_stream <- c(data_part1, data_part2)
# Initialize the HDDM_W object
hddm_w_instance <- HDDM_W$new()
# Iterate through the data stream
for (i in seq_along(data_stream)) {
  hddm_w_instance$add_element(data_stream[i])
  if (hddm_w_instance$warning_detected) {
    message(paste("Warning detected at index:", i))
  }
  if (hddm_w_instance$change_detected) {
    message(paste("Concept drift detected at index:", i))
  }
}