despike | R Documentation |
The method identifies spikes with respect to a "reference" time-series, and
replaces these spikes with the reference value, or with NA, according
to the value of replace; see “Details”.
despike( x, reference = c("median", "smooth", "trim"), n = 4, k = 7, min = NA, max = NA, replace = c("reference", "NA"), skip )
x |
a vector of (time-series) values, a list of vectors, a data frame, or an oce object. |
reference |
indication of the type of reference time series to be used in the detection of spikes; see ‘Details’. |
n |
an indication of the limit to differences between x and the reference time series; see ‘Details’. |
k |
length of running median used with reference="median". |
min |
minimum non-spike value of x, used with reference="trim". |
max |
maximum non-spike value of x, used with reference="trim". |
replace |
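an indication of what to do with spike values, with "reference" indicating that they are replaced with the reference time series, and "NA" indicating that they are replaced with NA. |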
skip |
optional vector naming columns to be skipped. This is ignored if x is a simple vector. |
Three modes of operation are permitted, depending on the value of reference.
For reference="median", the first step is to linearly interpolate
across any gaps (spots where x is NA), using approx() with rule=2.
The second step is to pass this through runmed() to get a running median
spanning k elements. The result of these two steps is the "reference"
time-series. Then, the standard deviation of the difference between x
and the reference is calculated. Any x values that differ from
the reference by more than n times this standard deviation are considered
to be spikes. If replace="reference", the spike values are
replaced with the reference, and the resultant time series is
returned. If replace="NA", the spikes are replaced with NA,
and that result is returned.
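The steps for reference="median" can be sketched in base R as follows. This is only an illustration of the description above, not the code used by despike(); the function name despike_median_sketch is hypothetical, and the handling of NA values when computing the standard deviation (na.rm=TRUE on the original x) is an assumption.

```r
# Hypothetical helper (not part of oce): a base-R sketch of the
# reference="median" procedure described in the text.
despike_median_sketch <- function(x, n = 4, k = 7,
                                  replace = c("reference", "NA")) {
    replace <- match.arg(replace)
    i <- seq_along(x)
    ok <- !is.na(x)
    # Step 1: interpolate linearly across gaps; rule=2 extends the end values.
    filled <- approx(i[ok], x[ok], xout = i, rule = 2)$y
    # Step 2: running median spanning k elements (runmed() needs odd k).
    reference <- runmed(filled, k = k)
    # Step 3: flag points that differ from the reference by more than
    # n standard deviations of the difference (assumption: NAs are dropped).
    difference <- x - reference
    spike <- !is.na(difference) &
        abs(difference) > n * sd(difference, na.rm = TRUE)
    # Step 4: replace spikes with the reference value, or with NA.
    x[spike] <- if (replace == "reference") reference[spike] else NA
    x
}

y <- sin(seq(0, 2 * pi, length.out = 50))
y[25] <- 10  # artificial spike
cleaned <- despike_median_sketch(y)
```

Because the spike itself inflates the standard deviation, a large n may leave moderate spikes undetected in short series.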
For reference="smooth", the processing is the same as for
"median", except that smooth() is used to calculate the
reference time series.
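As a rough illustration of how the reference series differs in this mode, the sketch below builds it with smooth() from the stats package, which applies a Tukey running-median smoother (default kind "3RS3R"). The function name smooth_reference_sketch is hypothetical, and gap filling with approx() before smoothing is assumed from the statement that the processing otherwise matches the "median" case.

```r
# Hypothetical helper (not part of oce): reference series for
# reference="smooth", using stats::smooth().
smooth_reference_sketch <- function(x) {
    i <- seq_along(x)
    ok <- !is.na(x)
    # smooth() does not accept NA values, so fill gaps first.
    filled <- approx(i[ok], x[ok], xout = i, rule = 2)$y
    # as.numeric() drops the "tukeysmooth" class and attributes.
    as.numeric(smooth(filled))
}

y <- c(1, 2, 3, 100, 5, 6, 7)
ref <- smooth_reference_sketch(y)
```

Spike detection and replacement would then proceed with this reference exactly as in the "median" case.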
For reference="trim", the reference time series is constructed by
linear interpolation across any regions in which x<min or x>max.
(Again, this is done with approx() with rule=2.) In this case, the
value of n is ignored, and the return value is the same as x, except
that spikes are replaced with the reference series (if
replace="reference") or with NA (if replace="NA").
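The "trim" mode can be sketched in base R as shown below. The function name despike_trim_sketch is hypothetical, and treating an NA limit as "no limit on that side" is an assumption made for the illustration; it is not taken from the despike() source.

```r
# Hypothetical helper (not part of oce): a base-R sketch of the
# reference="trim" procedure described in the text.
despike_trim_sketch <- function(x, min = NA, max = NA,
                                replace = c("reference", "NA")) {
    replace <- match.arg(replace)
    # Values outside [min, max] are treated as spikes
    # (assumption: an NA limit disables that side of the test).
    spike <- rep(FALSE, length(x))
    if (!is.na(min)) spike <- spike | (!is.na(x) & x < min)
    if (!is.na(max)) spike <- spike | (!is.na(x) & x > max)
    # Interpolate linearly across the trimmed regions; rule=2 extends ends.
    good <- !spike & !is.na(x)
    i <- seq_along(x)
    reference <- approx(i[good], x[good], xout = i, rule = 2)$y
    # Replace spikes with the interpolated reference, or with NA.
    x[spike] <- if (replace == "reference") reference[spike] else NA
    x
}

y <- c(0, 1, 2, 10, 4, 5)
trimmed <- despike_trim_sketch(y, min = -3, max = 6)
```

Unlike the other two modes, no standard deviation is involved here, which is why n plays no role.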
A new vector in which spikes are replaced as described above.
Dan Kelley
library(oce)
n <- 50
x <- 1:n
y <- rnorm(n=n)
y[n/2] <- 10 # 10 standard deviations
plot(x, y, type='l')
lines(x, despike(y), col='red')
lines(x, despike(y, reference="smooth"), col='darkgreen')
lines(x, despike(y, reference="trim", min=-3, max=3), col='blue')
legend("topright", lwd=1, col=c("black", "red", "darkgreen", "blue"),
       legend=c("raw", "median", "smooth", "trim"))

# add a spike to a CTD object
data(ctd)
plot(ctd)
T <- ctd[["temperature"]]
T[10] <- T[10] + 10
ctd[["temperature"]] <- T
CTD <- despike(ctd)
plot(CTD)