robust.filter: Robust Filtering Methods for Univariate Time Series

robust.filter R Documentation

Description

Procedure for robust (online) extraction of low frequency components (the signal) from a univariate time series with optional rules for outlier replacement and shift detection.

Usage

robust.filter(y, width, trend = "RM", scale = "QN", outlier = "T", 
                        shiftd = 2, wshift = floor(width/2), lbound = 0.1, p = 0.9,
                        adapt = 0, max.width = width, 
                        online = FALSE, extrapolate = TRUE)

Arguments

y

a numeric vector or (univariate) time series object.

width

a positive integer defining the window width used for fitting. If online=FALSE (default) this needs to be an odd number.

trend

a character string defining the method to be used for robust approximation of the signal within one time window. Possible values are:

"MED":

Median

"RM":

Repeated Median regression (default)

"LTS":

Least Trimmed Squares regression

"LMS":

Least Median of Squares regression
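
To illustrate what a robust regression fit within one window looks like, the following is a minimal base R sketch of the textbook repeated median estimator. The helper repeated_median and its eval_at argument are hypothetical names introduced here; this is not the package's internal code, which may differ in detail.

```r
# Hypothetical helper (not part of robfilter): repeated median fit in one window.
# eval_at is the window position at which the level is evaluated
# (window centre by default; use length(y) for the right end).
repeated_median <- function(y, eval_at = (length(y) + 1) / 2) {
  idx <- seq_along(y)
  # For each point i: median of the pairwise slopes to all other points j.
  inner <- sapply(idx, function(i) {
    j <- idx[-i]
    median((y[i] - y[j]) / (i - j))
  })
  # The slope estimate is the median of these inner medians.
  slope <- median(inner)
  # Level at eval_at: median of the slope-corrected observations.
  level <- median(y - slope * (idx - eval_at))
  list(level = level, slope = slope)
}

# A linear window with one gross outlier at position 4.
w <- c(1, 2, 3, 40, 5, 6, 7)
rm_fit <- repeated_median(w)
rm_fit$slope   # 1
rm_fit$level   # 4 (level at the window centre)
```

The outlier at position 4 affects neither the slope nor the level, which is the behaviour a robust trend estimate within the window is meant to provide.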

scale

a character string defining the method to be used for robust estimation of the local variability (within one time window). Possible values are:

"MAD":

Median absolute deviation about the median

"QN":

Rousseeuw's and Croux' (1993) Q_n scale estimator (default)

"SN":

Rousseeuw's and Croux' (1993) S_n scale estimator

"LSH":

Length of the shortest half
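
As a rough illustration of why a robust scale estimate is used, the sketch below compares the standard deviation with the MAD (via stats::mad) and a raw length of the shortest half on a small sample containing one outlier. The helper shortest_half is hypothetical; its definition of a "half" (floor(n/2) + 1 observations) and the absence of a consistency factor are assumptions and need not match the package.

```r
x <- c(1, 2, 3, 4, 100)

sd_x  <- sd(x)    # inflated by the single outlier
mad_x <- mad(x)   # 1.4826 * median(|x - median(x)|)

# Hypothetical helper: length of the shortest interval that contains
# h = floor(n/2) + 1 of the sorted observations (no consistency factor).
shortest_half <- function(x) {
  xs <- sort(x)
  n <- length(xs)
  h <- floor(n / 2) + 1
  min(xs[h:n] - xs[1:(n - h + 1)])
}
lsh_x <- shortest_half(x)

c(sd = sd_x, mad = mad_x, lsh = lsh_x)
```

Stand-alone implementations of Q_n and S_n are available as Qn() and Sn() in the robustbase package, should those estimators be needed outside the filter.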

outlier

a single character defining the rule to be used for outlier detection and outlier treatment. Observations deviating more than d\cdot \hat{\sigma}_t from the current level approximation \hat{\mu}_t are replaced by \hat{\mu}_t\pm k\hat{\sigma}_t where \hat{\sigma}_t denotes the current scale estimate.
Possible values are:

"T":

Replace ('trim') large outliers detected by a 3\sigma-rule (d=3) by the current level estimate (k=0). (default)

"L":

Shrink large outliers (d=3) strongly towards the current level estimate (k=1).

"M":

Shrink large and moderately sized outliers (d=2) strongly towards the current level estimate (k=1).

"W":

Shrink large and moderately sized outliers (d=2) towards the current level estimate (k=2).

W is the most efficient, T the most robust method (which should ideally be combined with a suitable value of lbound).
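
The four rules can be sketched in a few lines of base R. The names outlier_rules and replace_outlier are hypothetical and only restate the (d, k) pairs listed above; that the sign in \hat{\mu}_t\pm k\hat{\sigma}_t follows the sign of the residual is an assumption.

```r
# (d, k) pairs as listed above; the names are hypothetical, not package objects.
outlier_rules <- list(
  T = c(d = 3, k = 0),
  L = c(d = 3, k = 1),
  M = c(d = 2, k = 1),
  W = c(d = 2, k = 2)
)

# Replace y_new if it deviates more than d * sigma from the level mu.
# Assumption: the replacement lies on the same side of mu as y_new.
replace_outlier <- function(y_new, mu, sigma, rule = "T") {
  r <- outlier_rules[[rule]]
  resid <- y_new - mu
  if (abs(resid) > r[["d"]] * sigma) {
    mu + sign(resid) * r[["k"]] * sigma
  } else {
    y_new
  }
}

replace_outlier(10,   mu = 0, sigma = 1, rule = "T")  # 0: trimmed to the level
replace_outlier(10,   mu = 0, sigma = 1, rule = "W")  # 2: shrunk to mu + 2 * sigma
replace_outlier(2.5,  mu = 0, sigma = 1, rule = "L")  # 2.5: below the 3 sigma threshold
replace_outlier(-2.5, mu = 0, sigma = 1, rule = "M")  # -1: shrunk to mu - sigma
```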

shiftd

a positive numeric value defining the factor by which the current scale estimate is multiplied for shift detection. Default is shiftd=2, corresponding to a 2\sigma rule for shift detection.

wshift

a positive integer specifying the number of the most recent observations used for shift detection; it therefore also regulates the delay of shift detection. Only used in the online mode, where it should be less than half the (minimal) window width. In the offline mode (online=FALSE, default), shift detection is based on the right half of the time window, i.e. wshift=floor(width/2) (default).

lbound

a positive real value specifying an optional lower bound for the scale to prevent the scale estimate from reaching zero (implosion).

p

a fraction \in [2/3,1] of observations for additional rules in case of only two or three different values within one window.
If 100\cdot p percent of the observations within one window take on only two different values, the current level is estimated by the mean of these values regardless of the trend specification. In case of three differing values the median is taken as the current level estimate.

adapt

a numeric value regulating the adaptation of the moving window width. adapt can be either 0 or a value \in [0.6,1]. adapt = 0 means that a fixed window width is used. Otherwise, the window width is reduced whenever more than a fraction adapt of the residuals in a certain part of the current time window are all positive or all negative.
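
The sign-based criterion can be sketched as follows. Both helper names are hypothetical, and which part of the window the residuals are taken from is not specified here, so the function simply receives that part as its argument.

```r
# Hypothetical helper: largest share of residuals lying on one side of zero.
one_sided_fraction <- function(resid) {
  max(mean(resid > 0), mean(resid < 0))
}

# Hypothetical helper: TRUE if the criterion asks for a narrower window.
# adapt = 0 switches the adaptation off (fixed window width).
should_shrink <- function(resid, adapt) {
  adapt > 0 && one_sided_fraction(resid) > adapt
}

resid <- c(0.5, 1.2, 0.3, -0.1, 0.8)   # 4 of 5 residuals are positive
should_shrink(resid, adapt = 0.7)      # TRUE:  0.8 > 0.7
should_shrink(resid, adapt = 0.9)      # FALSE: 0.8 <= 0.9
should_shrink(resid, adapt = 0)        # FALSE: fixed window width
```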

max.width

a positive integer (>= width) specifying the maximal width of the time window.
width specifies the minimal (and also the initial) width.

online

a logical indicating whether the current level and scale estimates are evaluated at the most recent time within each window (TRUE) or centered within the window (FALSE). online=FALSE (default) requires an odd width for the window and means a time delay of (width+1)/2 time units.
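
The difference between the two modes lies in the time point to which each window's estimate is assigned. The base R sketch below uses a plain running median of width 3 as a stand-in for the filter; it only illustrates the alignment and is not the package's implementation.

```r
y <- c(1, 2, 3, 10, 5, 6, 7)
width <- 3
n <- length(y)

# One estimate per window; the windows start at positions 1, ..., n - width + 1.
starts <- 1:(n - width + 1)
win_median <- sapply(starts, function(s) median(y[s:(s + width - 1)]))

# Centred (online = FALSE): the estimate belongs to the middle of the window.
centered <- data.frame(time = starts + (width - 1) / 2, level = win_median)

# Online (online = TRUE): the estimate belongs to the most recent time point.
online <- data.frame(time = starts + width - 1, level = win_median)

centered   # times 2..6
online     # times 3..7
```

For a running median the window values are the same in both modes and only the time index changes. With a regression-type trend the fitted line would additionally be evaluated at the window centre or at its right end, respectively.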

extrapolate

a logical indicating whether the level estimates should be extrapolated to the edges of the time series.
If online=FALSE the extrapolation consists of the fitted values within the first half of the first window and the last half of the last window; if online=TRUE the extrapolation consists of all fitted values within the first time window.

Details

robust.filter works by applying the methods specified by trend and scale to a moving time window of length width.

Before moving the time window, it is checked whether the next (incoming) observation is considered an 'outlier' by applying the rule specified by outlier. To this end, the trend in the current time window is extrapolated to the next point in time and the residual of the incoming observation is standardised by the current scale estimate.

After moving the time window, it can be tested whether a level shift has occurred within the window: If more than half of the residuals in the right part of the window are larger than shiftd\cdot\hat{\sigma}_t, a shift is detected and appropriate actions are taken. In the online mode, the number of the rightmost residuals can be chosen by wshift to regulate the resistance of the detection rule against outliers, its power and the time delay of detection.
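
The majority rule for shift detection can be sketched in base R as below. The function shift_detected is a hypothetical helper; counting upward and downward exceedances separately is an assumption made here, and the actions taken after a detection are not modelled.

```r
# Hypothetical helper: returns 1 (upward shift), -1 (downward shift) or 0.
# resid: residuals of the current window, oldest first.
# sigma: current scale estimate.
shift_detected <- function(resid, sigma, shiftd = 2,
                           wshift = floor(length(resid) / 2)) {
  recent <- tail(resid, wshift)             # rightmost wshift residuals
  up   <- sum(recent >  shiftd * sigma)
  down <- sum(recent < -shiftd * sigma)
  if (up > wshift / 2) {
    1
  } else if (down > wshift / 2) {
    -1
  } else {
    0
  }
}

# Window of width 11: four of the five rightmost residuals exceed 2 * sigma.
r_shift <- c(0.1, -0.3, 0.2, 0.0, -0.1, 0.4, 3.1, 2.8, 3.5, 0.2, 2.9)
shift_detected(r_shift, sigma = 1)      # 1

# A single large residual is not enough to trigger the rule.
r_outlier <- c(rep(0, 10), 9)
shift_detected(r_outlier, sigma = 1)    # 0
```

A smaller wshift shortens the detection delay but lets fewer outlying residuals form a majority, which is the trade-off between delay and resistance mentioned above.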

A more detailed description of the filter can be found in Fried (2004). The adaptation of the window width is described by Gather and Fried (2004). For more explanations on shift detection, see Fried and Gather (2007).

Value

robust.filter returns an object of class robust.filter. An object of class robust.filter is a list containing the following components:

level

a numeric vector containing the signal level extracted by the (regression) filter specified by trend, scale and outlier.

slope

a numeric vector containing the corresponding slope within each time window.

sigma

a numeric vector containing the corresponding scale within each time window.

ol

an outlier indicator. 0: no outlier, +1: positive outlier, -1: negative outlier

level.shift

a level shift indicator. 0: no level shift, t: positive level shift detected at processing time t, -t: negative level shift detected at processing time t (the position in the vector gives an estimate of the point in time before which the shift has occurred).

In addition, the original input time series is returned as list member y, and the settings used for the analysis are returned as the list members width, trend, scale, outlier, shiftd, wshift, lbound, p, adapt, max.width, online and extrapolate.

Application of the function plot to an object of class robust.filter returns a plot showing the original time series with the filtered output.
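
As a sketch of how the returned components might be inspected, the following simulates a series with one level shift and one isolated outlier and then looks at the indicators. It uses only the component names listed above; the simulated data and the object name rf are illustrative, and no particular output is implied. The code is guarded so that it is skipped when the robfilter package is not installed.

```r
if (requireNamespace("robfilter", quietly = TRUE)) {
  set.seed(1)
  # Simulated data (illustrative only): level shift after t = 100.
  y_sim <- c(rnorm(100), rnorm(100, mean = 8))
  y_sim[50] <- 15                      # one isolated outlier

  rf <- robfilter::robust.filter(y_sim, width = 23)

  head(rf$level)                       # extracted signal level
  head(rf$sigma)                       # local scale estimates
  which(rf$ol != 0)                    # positions flagged as outliers
  which(rf$level.shift != 0)           # positions with a detected level shift
}
```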

Note

Missing values have to be replaced or removed from the time series before applying robust.filter.

Author(s)

Roland Fried and Karen Schettlinger

References

Fried, R. (2004), Robust Filtering of Time Series with Trends, Journal of Nonparametric Statistics 16, 313-328.
(earlier version: http://hdl.handle.net/2003/4992)

Fried, R., Gather, U. (2007), On Rank Tests for Shift Detection in Time Series, Computational Statistics and Data Analysis, Special Issue on Machine Learning and Robust Data Mining 52, 221-233.
(earlier version: http://hdl.handle.net/2003/23301)

Gather, U., Fried, R. (2004), Methods and Algorithms for Robust Filtering, COMPSTAT 2004: Proceedings in Computational Statistics, J. Antoch (ed.), Physica-Verlag, Heidelberg, 159-170.

Schettlinger, K., Fried, R., Gather, U. (2006), Robust Filters for Intensive Care Monitoring: Beyond the Running Median, Biomedizinische Technik 51(2), 49-56.

See Also

robreg.filter, hybrid.filter, dw.filter, wrm.filter.

Examples

# Load the package that provides robust.filter:
library(robfilter)

# Generate random time series:
y <- cumsum(runif(500)) - .5*(1:500)
# Add jumps:
y[200:500] <- y[200:500] + 5
y[400:500] <- y[400:500] - 7
# Add noise:
n <- sample(1:500, 30)
y[n] <- y[n] + rnorm(30)

# Delayed Filtering of the time series with window width 23:
y.rf <- robust.filter(y, width=23)
# Plot:
plot(y.rf)

# Delayed Filtering with different settings and fixed window width 31:
y.rf2 <- robust.filter(y, width=31, trend="LMS", scale="QN", outlier="W")
plot(y.rf2)

# Online Filtering with fixed window width 24:
y.rf3 <- robust.filter(y, width=24, online=TRUE)
plot(y.rf3)

# Delayed Filtering with adaptive window width (minimal width 11, maximal width 51):
y.rf4 <- robust.filter(y, width=11, adapt=0.7, max.width=51)
plot(y.rf4)

robfilter documentation built on Sept. 11, 2024, 6:05 p.m.