View source: R/smwrBase_fillMissing.R
fillMissing | R Documentation |
Replace missing values in time-series data by interpolation.
fillMissing(x, span = 10, Dates = NULL, max.fill = 10)
x | the sequence of observations. Missing values are permitted and will be replaced. |
span | the maximum number of observations on each side of each range of missing values to use in constructing the time-series model. See Details. |
Dates | an optional vector of dates/times associated with each value in x. |
max.fill | the maximum gap to fill. |
Missing values at the beginning and end of x will not be replaced.
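To see which missing values fall under these rules before calling fillMissing, the runs of NA in x can be tabulated. The following is a minimal sketch in base R; describeGaps is a hypothetical helper that is not part of smwrBase, and treating a gap as fillable when its length is at most max.fill is an assumption about how the limit is applied.

```r
# Hypothetical helper (not part of smwrBase): describe each run of NAs in x.
describeGaps <- function(x, max.fill = 10) {
  runs <- rle(is.na(x))
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  # Keep only the runs that are made up of missing values
  gaps <- data.frame(start = starts, end = ends, length = runs$lengths)[runs$values, ]
  # A gap touching either end of x has no observation on one side
  gaps$at.edge <- gaps$start == 1 | gaps$end == length(x)
  # Assumed rule: only interior gaps no longer than max.fill are fillable
  gaps$fillable <- !gaps$at.edge & gaps$length <= max.fill
  rownames(gaps) <- NULL
  gaps
}

# Made-up series: leading NA, a short gap, a 12-value gap, trailing NA
x <- c(NA, 3, 4, NA, NA, 7, 8, rep(NA, 12), 21, 22, NA)
gapTable <- describeGaps(x, max.fill = 10)
print(gapTable)
```

In this made-up series only the two-value interior gap is flagged as fillable; the leading and trailing NAs and the 12-value gap are not.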
The argument span is used to help set the range of values used to construct the StructTS model. If span is set small, then the variance of epsilon dominates and the estimates are not smooth. If span is large, then the variance of level dominates and the estimates are linear interpolations. The variances of level and epsilon are components of the state-space model used to interpolate values; see StructTS for details.
See Note for more information about the method.
If span is set larger than 99, then the entire time series is used to estimate all missing values. This approach may be useful if there are many periods of missing values. If span is set to any number less than 4, then simple linear interpolation will be used to replace missing values.
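For reference, simple linear interpolation of interior missing values can be sketched with approx from base R. This illustrates the technique only; linearFill is a hypothetical helper, and it is not taken from the smwrBase source.

```r
# Hypothetical helper: linear interpolation of interior NAs using base R.
# Requires at least two non-missing values in x.
linearFill <- function(x) {
  idx <- seq_along(x)
  ok <- !is.na(x)
  # With the default rule = 1, approx() returns NA outside the range of
  # observed points, so leading and trailing NAs stay missing
  approx(idx[ok], x[ok], xout = idx)$y
}

filled <- linearFill(c(NA, 2, NA, NA, 8, 10, NA))
print(filled)
```

The two interior missing values are placed on the straight line between 2 and 8, while the values at either end of the series remain NA.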
Added from smwrBase.
The observations in x with missing values replaced by interpolation.
The method used to interpolate missing values is based on tsSmooth applied to a model constructed using StructTS on x with type set to "trend". The smoothing method basically uses the information (slope) from the two values preceding the missing values and the two values following them to smoothly interpolate values, accounting for any change in slope. Beauchamp (1989) used time-series methods for synthesizing missing streamflow records. The group that is used to define the statistics that control the interpolation is defined simply by span rather than by the more in-depth measures described in Elshorbagy and others (2000).
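The general idea can be sketched directly with StructTS and tsSmooth from the stats package. This is a minimal illustration on simulated data, not the code used by fillMissing: it fits one model to the whole series (roughly the situation described for span larger than 99) and ignores span and max.fill. The variable names and the simulated series are assumptions made for the example.

```r
# Simulated series with a smooth trend and a little noise (illustrative only)
set.seed(1)
tt <- 1:40
flow <- 50 + 10 * sin(tt / 6) + rnorm(40, sd = 0.3)

# Knock out an interior stretch of values
flowMiss <- flow
flowMiss[18:21] <- NA

# Fit a local linear trend model; StructTS accepts missing values in x
fit <- StructTS(ts(flowMiss), type = "trend")

# Smoothed states: the first column is the level, the second the slope
smoothed <- tsSmooth(fit)
level <- as.numeric(smoothed[, 1])

# Replace only the missing values with the smoothed level
gap <- is.na(flowMiss)
flowFill <- flowMiss
flowFill[gap] <- level[gap]
```

Observed values are left untouched; only the positions that were NA take the smoothed level, which reflects the slope on both sides of the gap.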
If the data have gaps rather than missing values, then fillMissing will return a vector longer than x if Dates is given, and the returned data cannot be inserted into the original data frame. If Dates is not given, then the gap will not be recognized and will not be filled. The function insertMissing can be used to create a data frame with the complete sequence of dates.
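The difference between a gap (absent rows) and missing values (rows holding NA) can be shown with base R. The sketch below builds a complete daily sequence and merges the data onto it; it stands in for what insertMissing is described as doing and does not call that function. The data frame and its values are made up for the example.

```r
# Made-up daily record with a two-day gap (Jan 3 and Jan 4 are absent rows)
flow <- data.frame(
  DATES = as.Date(c("2020-01-01", "2020-01-02", "2020-01-05", "2020-01-06")),
  FLOW = c(10, 12, 18, 20)
)

# Complete sequence of dates spanning the record
allDates <- data.frame(DATES = seq(min(flow$DATES), max(flow$DATES), by = "day"))

# Left join: dates with no observation get FLOW = NA
complete <- merge(allDates, flow, by = "DATES", all.x = TRUE)
print(complete)
```

After this step the gap is represented by explicit NA values, so a filled vector has the same length as the data frame and can be stored as a new column.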
Beauchamp, J.J., 1989, Comparison of regression and time-series methods for synthesizing missing streamflow records: Water Resources Bulletin, v. 25, no. 5, p. 961–975.
Elshorbagy, A.A., Panu, U.S., and Simonovic, S.P., 2000, Group-based estimation of missing hydrological data, I. Approach and general methodology: Hydrological Sciences Journal, v. 45, no. 6, p. 849–866.
tsSmooth, StructTS
## Not run:
library(smwrData) # provides the Q05078470 dataset
data(Q05078470)
# Create missing values in flow, the first sequence is a peak and the second is a recession
Q05078470$FlowMiss <- Q05078470$FLOW
Q05078470$FlowMiss[c(109:111, 198:201)] <- NA
# Interpolate the missing values
Q05078470$FlowFill <- fillMissing(Q05078470$FlowMiss)
# How did we do (line is actual, points are filled values)?
par(mfrow=c(2,1), mar=c(5.1, 4.1, 1.1, 1.1))
with(Q05078470[100:120, ], plot(DATES, FLOW, type="l"))
with(Q05078470[109:111, ], points(DATES, FlowFill))
with(Q05078470[190:210, ], plot(DATES, FLOW, type="l"))
with(Q05078470[198:201, ], points(DATES, FlowFill))
## End(Not run)