spikesMetric: Find spikes using a rolling Hampel filter


spikesMetric    R Documentation



The spikesMetric() function determines the number of spikes in a seismic Stream.


spikesMetric(st, windowSize=41, thresholdMin=10, selectivity=NA, fixedThreshold=TRUE)



st: A Stream object containing a seismic signal.


windowSize: The window size to roll over (default=41).


thresholdMin: Initial value for outlier detection (default=10.0).


selectivity: Numeric factor in the range [0-1] used in determining outliers, or NA if fixedThreshold=TRUE (default=NA).


fixedThreshold: Logical TRUE or FALSE. If TRUE, the threshold is set to thresholdMin and selectivity is ignored (default=TRUE).


This function uses the output of the findOutliers() function in the seismicRoll package to calculate the number of 'spikes' containing outliers.

The thresholdMin level is similar to a sigma value for normally distributed data. Hampel filter values above 6.0 indicate a data value that is extremely unlikely to be part of a normal distribution (roughly 1 in 500 million) and is therefore very likely to be an outlier. Choosing a relatively large value for thresholdMin makes false positives less likely. False positives can include high-frequency environmental noise.
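To make the sigma analogy concrete, the following is a minimal base R sketch of a Hampel-style score: the absolute deviation of each point from its rolling median, scaled by the rolling median absolute deviation. The helper name hampelScore() is hypothetical, and the scaling is an assumption; the roll_hampel() function in seismicRoll may differ in detail. The last line checks the "1 in 500 million" figure for a two-sided 6-sigma event.

```r
# Hypothetical helper (not part of seismicRoll): Hampel-style score in base R.
hampelScore <- function(x, windowSize = 41) {
  half <- (windowSize - 1) %/% 2
  n <- length(x)
  score <- rep(NA_real_, n)          # edges without a full window stay NA
  if (n < windowSize) return(score)
  for (i in (half + 1):(n - half)) {
    w <- x[(i - half):(i + half)]    # centered window around point i
    med <- median(w)
    # mad() applies the 1.4826 constant by default, so the result is
    # comparable to a standard deviation for normally distributed data
    score[i] <- abs(x[i] - med) / mad(w, center = med)
  }
  score
}

set.seed(1)
x <- rnorm(500)                      # synthetic noise, not seismic data
x[250] <- 50                         # inject one artificial spike
scores <- hampelScore(x)
spikeIndex <- which(scores > 10)     # points exceeding a threshold of 10

# Two-sided probability of a deviation beyond 6 sigma for a normal distribution
oneIn <- 1 / (2 * pnorm(-6))         # about 5.07e8, i.e. roughly 1 in 500 million
```

Only the injected point should exceed the threshold here, which illustrates why a large thresholdMin suppresses false positives from ordinary noise.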

The selectivity is a value between 0 and 1 and is used to generate an appropriate threshold for outlier detection based on the statistics of the incoming data. A lower value for selectivity will result in more outliers while a value closer to 1.0 will result in fewer. The code ignores selectivity if fixedThreshold=TRUE.

The fixedThreshold is a logical TRUE or FALSE. If TRUE, then the threshold is set to thresholdMin. If FALSE, then the threshold is set to the maximum value of the roll_hampel() function output multiplied by the selectivity.
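The threshold rule described above can be sketched in a few lines of base R. The function chooseThreshold() and its hampelValues argument are hypothetical names used only for illustration; this follows the wording of this page and is not the seismicRoll implementation. Whether thresholdMin also acts as a lower bound when fixedThreshold=FALSE is not stated here, so this sketch does not assume it.

```r
# Hypothetical helper illustrating the documented threshold rule.
chooseThreshold <- function(hampelValues, thresholdMin = 10,
                            selectivity = NA, fixedThreshold = TRUE) {
  if (fixedThreshold) {
    return(thresholdMin)             # selectivity is ignored in this mode
  }
  if (is.na(selectivity) || selectivity < 0 || selectivity > 1) {
    stop("selectivity must be between 0 and 1 when fixedThreshold=FALSE")
  }
  # Data-driven threshold: a fraction of the largest Hampel value seen
  max(hampelValues, na.rm = TRUE) * selectivity
}

hampelValues <- c(2, 40, NA, 3)      # made-up values for illustration
fixed <- chooseThreshold(hampelValues)                      # 10
adaptive <- chooseThreshold(hampelValues, selectivity = 0.5,
                            fixedThreshold = FALSE)         # 40 * 0.5 = 20
```

With these made-up values, a selectivity closer to 1.0 raises the threshold toward the maximum Hampel value, so fewer points qualify as outliers.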

The total count of spikes is the number of groups of outlier data points, where groups are separated by at least one non-outlier data point. Each individual spike may therefore contain more than one data point.
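As an illustration of this counting rule, the sketch below groups outlier indices into runs of consecutive points and counts the runs. The function countSpikes() is a hypothetical helper written for this example, not the code used inside spikesMetric().

```r
# Hypothetical helper: count runs of consecutive outlier indices.
countSpikes <- function(outlierIndices) {
  if (length(outlierIndices) == 0) return(0L)
  idx <- sort(unique(outlierIndices))
  # A gap larger than 1 between neighbouring indices starts a new spike
  as.integer(sum(diff(idx) > 1) + 1)
}

# Six outlier points forming three spikes: 5-7, 20, and 31-32
nSpikes <- countSpikes(c(5, 6, 7, 20, 31, 32))
```

Here six outlier points yield a spike count of 3, because points 5 to 7 and points 31 to 32 are each counted as a single spike.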


A list of SingleValueMetric objects is returned.


The thresholdMin parameter is sensitive to the data sampling rate. The default value of 10 seems to work well with sampling rates of 10 Hz or higher ('B..' or 'H..' channels). For 'L..' channels with a sampling rate of 1 Hz, thresholdMin=12.0 or larger may be more appropriate.

More testing of spiky signals at different resolutions is needed.

See the seismicRoll package for documentation on the findOutliers() function.


Jonathan Callahan jonathan@mazamascience.com


  ## Not run: 
# Open a connection to IRIS DMC webservices
iris <- new("IrisClient")

# Get the waveform
starttime <- as.POSIXct("2013-01-03 15:00:00", tz="GMT")
endtime <- starttime + 3600 * 3  
st <- getDataselect(iris,"IU","RAO","10","BHZ",starttime,endtime)

# Calculate the spikes metric and show the results
metricList <- spikesMetric(st)
dummy <- show(metricList)
## End(Not run)

IRISMustangMetrics documentation built on April 28, 2022, 1:06 a.m.