| despike | R Documentation |
The method identifies spikes with respect to a "reference" time-series, and
replaces these spikes with the reference value, or with NA according
to the value of action; see “Details”.
despike(
x,
reference = c("median", "smooth", "trim"),
n = 4,
k = 7,
min = NA,
max = NA,
replace = c("reference", "NA"),
skip
)
x |
a vector of (time-series) values, a list of vectors, a data frame, or an oce object. |
reference |
indication of the type of reference time series to be used in the detection of spikes; see “Details”. |
n |
an indication of the limit to differences between |
k |
length of running median used with |
min |
minimum non-spike value of |
max |
maximum non-spike value of |
replace |
an indication of what to do with spike values, with
|
skip |
optional vector naming columns to be skipped. This is ignored if
|
Three modes of operation are permitted, depending on the value of
reference.
For reference="median", the first step is to linearly interpolate
across any gaps (spots where x==NA), using approx() with
rule=2. The second step is to pass this through
runmed() to get a running median spanning k
elements. The result of these two steps is the "reference" time-series.
Then, the standard deviation of the difference between x
and the reference is calculated. Any x values that differ from
the reference by more than n times this standard deviation are considered
to be spikes. If replace="reference", the spike values are
replaced with the reference, and the resultant time series is
returned. If replace="NA", the spikes are replaced with NA,
and that result is returned.
For reference="smooth", the processing is the same as for
"median", except that smooth() is used to calculate the
reference time series.
For reference="trim", the reference time series is constructed by
linear interpolation across any regions in which x<min or
x>max. (Again, this is done with approx() with
rule=2.) In this case, the value of n is ignored, and the
return value is the same as x, except that spikes are replaced
with the reference series (if replace="reference" or with
NA, if replace="NA".
A new vector in which spikes are replaced as described above.
Dan Kelley
n <- 50
x <- 1:n
y <- rnorm(n = n)
y[n / 2] <- 10 # 10 standard deviations
plot(x, y, type = "l")
lines(x, despike(y), col = "red")
lines(x, despike(y, reference = "smooth"), col = "darkgreen")
lines(x, despike(y, reference = "trim", min = -3, max = 3), col = "blue")
legend("topright",
lwd = 1, col = c("black", "red", "darkgreen", "blue"),
legend = c("raw", "median", "smooth", "trim")
)
# add a spike to a CTD object
data(ctd)
plot(ctd)
T <- ctd[["temperature"]]
T[10] <- T[10] + 10
ctd[["temperature"]] <- T
CTD <- despike(ctd)
plot(CTD)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.