DC Offset Detection

Description

The dailyDCOffsetMetric() function identifies days with a jump in the signal mean.

Usage

1
2
3
4
5
dailyDCOffsetMetric(df, 
                    offsetDays=5,
                    outlierWindow=7,
                    outlierThreshold=3.0,
                    outputType=1)

Arguments

df

a dataframe containing sample_mean values obtained with getSingleValueMetrics()

offsetDays

number of days used in calculating weighting factors

outlierWindow

window size passed to findOutliers() function in the seismic package

outlierThreshold

detection threshold passed to findOutliers() function in the seismic package

outputType

if 1, return last day of valid values (index= length(index)-floor(outlierWindow/2)); if 0, return all valid values (indices= max(offsetDays, floor(outlierWindow/2): length(index)-floor(outlierWindow/2))

Details

This algorithm calculates lagged differences of the daily mean timeseries over a window of offsetDays days. Shifts in the mean that are persistent and larger than the typical standard deviation of daily means will generate higher metric values.

Details of the algorithm are as follows

1
2
3
4
5
6
7
# data0 = download requested daily means (in the 'df' dataframe), must be greater than max(offsetDays,outlierWindow)+floor(outlierWindow/2)
# data1 = remove outliers using MAD outlier detection with the 'outlier' arguments specified
# data2 = replace outliers with rolling median values using a default 7 day window, remove last floor(outlierWindow/2) number of samples.
# weights = calculate absolute lagged differences with 1-N day lags, default N=5
# metric0 = multiply the lagged differences together and take the N'th root
# stddev0 = calculate the rolling standard deviation of data2 with a N-day window
# METRIC = divide metric0 by the median value of stddev0

Value

A list is returned with a SingleValueMetric object for the last day-floor(outlierWindow/2) (default 3rd from last day) in the incoming dataframe if outputType=1 (one list element), otherwise the first+offsetDays to last day-floor(outlierWindow/2) (multiple list elements, one per day) if outputType=0.

Note

Prefer 60+ days of sample_mean values to get a good estimate of the long term sample_mean standard deviation. After initial testing on stations in the IU network, a metric value > 10 appears to be indicative of a DC offset shift (this may vary across stations or networks and larger values may be preferred as indications of a potential station issue).

Author(s)

Jonathan Callahan jonathan@mazamascience.com

See Also

getSingleValueMetrics

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.