The method identifies spikes with respect to a "reference" time series, and replaces these spikes with the reference value, or with NA, according to the value of replace; see “Details”.
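x
a vector of (time series) values, a list of vectors, a data frame, or an object of a class that the method supports (the example applies it to a CTD object).
reference
indication of the type of reference time series to be used in the detection of spikes; see ‘Details’.
n
an indication of the limit to differences between x and the reference time series, expressed as a multiple of the standard deviation of those differences; used with reference="median" and reference="smooth", and ignored for reference="trim".
k
length of the running median used with reference="median".
min
minimum non-spike value of x, used with reference="trim".
max
maximum non-spike value of x, used with reference="trim".
replace
an indication of what to do with spike values, with "reference" meaning that they are replaced with the reference time series, and "NA" meaning that they are replaced with NA.
skip
optional vector naming columns to be skipped. This is ignored if x is a plain vector.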

Three modes of operation are permitted, depending on the value of reference.
For reference="median", the first step is to linearly interpolate across any gaps (spots where x is NA), using approx with rule=2. The second step is to pass the result through runmed to get a running median spanning k elements. The result of these two steps is the "reference" time series. Then, the standard deviation of the difference between x and the reference is calculated. Any x values that differ from the reference by more than n times this standard deviation are considered to be spikes. If replace="reference", the spike values are replaced with the reference, and the resultant time series is returned. If replace="NA", the spikes are replaced with NA, and that result is returned.
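The steps for reference="median" can be sketched in base R as follows. This is only an illustration of the description above, not the package's own code: the function name despike_median_sketch is hypothetical, and the default values n=4 and k=7 are assumptions chosen for the example.

```r
# Hypothetical helper illustrating the reference = "median" steps.
# The defaults n = 4 and k = 7 are assumptions made for this sketch.
despike_median_sketch <- function(x, n = 4, k = 7,
                                  replace = c("reference", "NA")) {
  replace <- match.arg(replace)
  i <- seq_along(x)
  ok <- !is.na(x)
  # Step 1: interpolate linearly across gaps; rule = 2 extends the end values
  filled <- approx(i[ok], x[ok], xout = i, rule = 2)$y
  # Step 2: running median spanning k elements (k must be odd)
  ref <- as.numeric(runmed(filled, k = k))
  # Step 3: flag values more than n standard deviations from the reference
  d <- x - ref
  spike <- !is.na(d) & abs(d) > n * sd(d, na.rm = TRUE)
  x[spike] <- if (replace == "reference") ref[spike] else NA
  x
}

# A smooth signal with one artificial spike at position 25
x <- sin(seq(0, 2 * pi, length.out = 50))
x[25] <- 10
cleaned <- despike_median_sketch(x)
cleaned_na <- despike_median_sketch(x, replace = "NA")
```

Note that a single spike in a short series inflates the standard deviation of the differences, so with very few points and a large n the spike may not exceed the threshold at all.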
For reference="smooth", the processing is the same as for "median", except that smooth is used to calculate the reference time series.
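To see how the two reference series differ, the short comparison below builds both from the same spiked signal using only base R. It assumes a running-median length of k=7 and the default settings of smooth; these choices are illustrative and are not taken from the text above.

```r
# Same test signal: a sine wave with one artificial spike at position 25
x <- sin(seq(0, 2 * pi, length.out = 50))
x[25] <- 10

# Reference built with a running median (k = 7 is an assumed value)
ref_median <- as.numeric(runmed(x, k = 7))
# Reference built with Tukey's running-median smoother at its defaults
ref_smooth <- as.numeric(smooth(x))

# Both references stay close to the underlying signal at the spike,
# so the difference x - reference is large there in either mode
diff_median <- x[25] - ref_median[25]
diff_smooth <- x[25] - ref_smooth[25]
```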
For reference="trim", the reference time series is constructed by linear interpolation across any regions in which x<min or x>max. (Again, this is done with approx with rule=2.) In this case, the value of n is ignored, and the return value is the same as x, except that spikes are replaced with the reference series (if replace="reference") or with NA (if replace="NA").
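The trim mode can be sketched in base R in a few lines. The function name despike_trim_sketch is hypothetical, and leaving existing NA values untouched is an assumption of this sketch rather than documented behaviour.

```r
# Hypothetical helper illustrating the reference = "trim" steps.
despike_trim_sketch <- function(x, min, max,
                                replace = c("reference", "NA")) {
  replace <- match.arg(replace)
  i <- seq_along(x)
  # Values outside [min, max] are the spikes; existing NA values are
  # left as they are (an assumption of this sketch)
  spike <- !is.na(x) & (x < min | x > max)
  keep <- !is.na(x) & !spike
  # Reference: linear interpolation across the trimmed regions
  ref <- approx(i[keep], x[keep], xout = i, rule = 2)$y
  x[spike] <- if (replace == "reference") ref[spike] else NA
  x
}

y <- c(0.5, -0.2, 0.1, 10, 0.3, -0.4)
# The value 10 at position 4 is replaced by a value interpolated
# between its two neighbours
cleaned_trim <- despike_trim_sketch(y, min = -3, max = 3)
cleaned_trim_na <- despike_trim_sketch(y, min = -3, max = 3, replace = "NA")
```

Unlike the other two modes, no standard deviation is involved here: the limits min and max alone decide what counts as a spike.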
A new vector in which spikes are replaced as described above.
Dan Kelley
library(oce) # provides despike() and the ctd dataset
n <- 50
x <- 1:n
y <- rnorm(n=n)
y[n/2] <- 10 # 10 standard deviations
plot(x, y, type='l')
lines(x, despike(y), col='red')
lines(x, despike(y, reference="smooth"), col='darkgreen')
lines(x, despike(y, reference="trim", min=-3, max=3), col='blue')
legend("topright", lwd=1, col=c("black", "red", "darkgreen", "blue"),
       legend=c("raw", "median", "smooth", "trim"))
# add a spike to a CTD object
data(ctd)
plot(ctd)
T <- ctd[["temperature"]]
T[10] <- T[10] + 10
ctd[["temperature"]] <- T
CTD <- despike(ctd)
plot(CTD)

