gapfillSSA applies the iterative gap filling procedure proposed by Kondrashov and Ghil (2006), using the fast and optimized SSA implementation developed by Korobeynikov (2010). Generally speaking, the major periodic components of the time series are determined and interpolated into the gap positions. An iterative cross-validation scheme with artificial gaps is used to determine these periodic components.
gapfillSSA(amnt.artgaps = c(0.05, 0.05), DetBestIter = ".getBestIteration",
debugging = FALSE, amnt.iters = c(10, 10), amnt.iters.start = c(1,
1), fill.margins = FALSE, first.guess = c(), GroupEigTrpls = "grouping.auto",
groupingMethod = "wcor", kind = c("auto", "1dssa", "2dssa")[1],
M = floor(length(series)/3), matrix.best.iter = "perf.all.gaps",
MeasPerf = "RMSE", n.comp = 2 * amnt.iters[1], open.plot = TRUE,
plot.results = FALSE, plot.progress = FALSE, pad.series = c(0,
0), print.stat = TRUE, remove.infinite = FALSE, scale.recstr = TRUE,
series, seed = integer(), size.biggap = 20, SSA.methods = c("nutrlan",
"propack", "eigen", "svd"), tresh.convergence = 0.01,
tresh.min.length = 5, z.trans.series = TRUE)

amnt.artgaps 
numeric vector: The relative ratio (amount gaps/series length) of artificial gaps to include to determine the iteration with the best prediction (c(ratio big gaps, ratio small gaps)). If this is set to c(0,0), the cross validation step is excluded and the iteration is run until amnt.iters. 
DetBestIter 
function: Function to determine the best outer and inner iteration to use for reconstruction. If no function is given, the standard way is used. (see ?.getBestIteration) 
debugging 
logical: If set to TRUE, workspaces to be used for debugging are saved in case of (some) errors or warnings. 
amnt.iters 
integer vector: Amount of iterations performed for the outer and inner loop (c(outer,inner)). 
amnt.iters.start 
integer vector: Index of the iteration to start with c(outer, inner). If this value is > 1, the reconstruction (!) part is started with this iteration. Currently it is only possible to set this to values > 1 if amnt.artgaps != 0 as this would cause a cross validation loop. 
fill.margins 
logical: Whether to fill gaps at the outer margins of the series, i.e. to extrapolate before the first and after the last valid value. Doing this most probably produces unreliable results (i.e. a strong build up of amplitude). 
first.guess 
numeric vector/matrix: First guess for the gap values. The mean/zero is used if no value is supplied. Has to have the same dimensions and lengths as series. 
GroupEigTrpls 
character string: Name of the function used to group the eigentriples. This function needs to take an ssa object as its first input and other inputs as its ... argument. It has to return a list whose length equals the desired number of SSA groups. Each of its elements has to be an integer vector indicating which SSA eigentriple(s) belong(s) to this group. The function 'grouping.auto' uses the methods supplied by the Rssa package (see argument groupingMethod to set the corresponding argument for the method). Another possibility is 'groupSSANearestNeighbour', which uses a rather ad hoc method of detecting the nearest (Euclidean) neighbour of each eigentriple. 2D SSA automatically uses the nearest neighbour method, as grouping was not (yet) implemented for 2D SSA. 
groupingMethod 

kind 
character string: Whether to calculate one or two dimensional SSA (see the help of ssa()). Default is to determine this automatically by determining the dimensions of series. 
M 
integer: Window length or embedding dimension [time steps]. If not given, a default value of 0.33*length(timeseries) is computed. For 2d SSA a vector of length 2 has to be supplied. If only one number is given, this is taken for both dimensions. (see ?ssa, here the parameter is called L) 
matrix.best.iter 
character string: Which performance matrix to use (has to be one of recstr.perf.a, recstr.perf.s or recstr.perf.b (see ?.getBestIteration)). 
MeasPerf 
character string: Name of a function to determine the 'goodness of fit' between the reconstruction and the actual values in the artificial gaps. The respective function has to take two vectors as input and return a single value. Set to the "Root Mean Square Error" (RMSE) by default. 
n.comp 
integer: Amount of eigentriples to extract (default if no values are supplied is 2*amnt.iters[1]) (see ?ssa, here the parameter is called neig). 
open.plot 
logical: Whether to open a new layout of plots for the performance plots. 
plot.results 
logical: Whether to plot the performance visualization for the artificial gaps. 
plot.progress 
logical: whether to visualize the iterative estimation of the reconstruction process during the calculations. 
pad.series 
integer vector (length 2): Length of the part of the series to use for padding at the start (first value) and at the end of the series. Values of zero cause no padding. This feature has not yet been rigorously tested! 
print.stat 
logical: Whether to print status information during the calculations. 
remove.infinite 
logical: Whether to remove infinite values prior to the calculation. 
scale.recstr 
logical: whether to scale the reconstruction to sd = 1 at the end of each outer loop step. 
series 
numeric vector/matrix: equally spaced input time series or matrix with gaps (gap = NA) 
seed 
integer: Seed to be taken for the randomized determination of the positions of the artificial gaps and the nutrlan ssa algorithm. Per default, no seed is set. 
size.biggap 
integer: Length of the big artificial gaps (in time steps) 
SSA.methods 
character vector: Methods to use for the SSA computation. First the first method is tried, when convergence fails the second is used and so on. See the help of ssa() in package Rssa for details on the methods. The last two methods are relatively slow! 
tresh.convergence 
numeric value: Threshold below which the last three sums of squared differences between inner iteration loops must fall for the whole process to be considered to have converged. 
tresh.min.length 
integer: minimum length the series has to have to do computations. 
z.trans.series 
logical: whether to perform a z-transformation of the series prior to the calculation. 
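The MeasPerf argument expects the name of a function that takes two vectors and returns a single value. A minimal sketch of two such functions follows. The names rmse.sketch and mae.sketch are hypothetical and not part of the package, and the NA handling is an assumption, since the text above does not say whether missing values can reach this function.

```r
## Hypothetical performance measures (not part of the package).
## Both take two numeric vectors and return one single value.
rmse.sketch <- function(x, y) {
  sqrt(mean((x - y)^2, na.rm = TRUE))   # root mean square error
}
mae.sketch <- function(x, y) {
  mean(abs(x - y), na.rm = TRUE)        # mean absolute error
}

obs <- c(1, 2, 3, 4)   # 'actual' values in the artificial gaps
rec <- c(1, 2, 3, 6)   # reconstructed values at the same positions
rmse.sketch(obs, rec)  # 1
mae.sketch(obs, rec)   # 0.5
```

Such a function could then presumably be selected by name, e.g. MeasPerf = "mae.sketch".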
Artificial Gaps: The amount of artificial gaps to be included is determined as follows: amnt.artgaps determines the total size of the artificial gaps to be included. Each number (0-1) is a relative ratio of the total amount of available datapoints. To switch off the inclusion of either small or big gaps, set the respective ratio to 0. In general the ratios determine a maximum amount of gaps. size.biggap sets the size of the big gaps. Subsequently the number of big gaps to be included is determined by calculating the maximum possible number of gaps of this size that reaches the amount of big gaps set by amnt.artgaps[1]. The amount of small gaps is then set according to the ratio of amnt.artgaps[1]/amnt.artgaps[2].
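The arithmetic behind the gap counts can be illustrated with a small sketch. This is not the package's internal code: the function name count.artgaps.sketch, the rounding with floor(), and the treatment of small gaps as single points sized directly from amnt.artgaps[2] are all assumptions made for illustration.

```r
## Sketch only: rounding and the handling of small gaps are assumptions.
count.artgaps.sketch <- function(series, amnt.artgaps = c(0.05, 0.05),
                                 size.biggap = 20) {
  n.avail <- sum(!is.na(series))   # available (non-gap) data points
  ## largest number of big gaps that stays within the requested share
  n.big <- floor(amnt.artgaps[1] * n.avail / size.biggap)
  ## small gaps are assumed to be single points here
  n.small <- floor(amnt.artgaps[2] * n.avail)
  c(n.big = n.big, n.small = n.small)
}

## 1000 points, 50 of them already missing
series.demo <- c(rep(1, 950), rep(NA, 50))
count.artgaps.sketch(series.demo)   # n.big = 2, n.small = 47
```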
Iteration performance measure: The DetBestIter function should take any of the RMSE matrices (small/big/all gaps) as input and return i.best, containing the best inner loop for each outer loop, and h.best, the outer loop up to which the process should be iterated. Use the default function as a reference.
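A minimal sketch of a function with this shape is given below. It is a hypothetical stand-in, not the default .getBestIteration: the name best.iter.sketch, the assumption that rows of the matrix are outer iterations and columns are inner iterations, and the choice to return a plain list are all illustrative. The default function remains the authoritative reference for the expected format.

```r
## Hypothetical stand-in for a DetBestIter function.
## Assumption: rows of 'perf' are outer iterations, columns inner iterations.
best.iter.sketch <- function(perf) {
  ## best inner iteration (lowest RMSE) for each outer iteration
  i.best <- apply(perf, 1, which.min)
  ## outer iteration whose best inner iteration has the lowest RMSE overall
  h.best <- which.min(apply(perf, 1, min, na.rm = TRUE))
  list(i.best = i.best, h.best = h.best)
}

perf.demo <- matrix(c(0.90, 0.70, 0.80,
                      0.60, 0.50, 0.55,
                      0.65, 0.60, 0.70), nrow = 3, byrow = TRUE)
best.iter.sketch(perf.demo)   # i.best = 2 2 2, h.best = 2
```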
Visualize results: If plot.results == TRUE, an image plot is produced visualizing the RMSE between the artificial gaps and the reconstruction for each iteration. A red dot indicates the iteration chosen for the final reconstruction.
Padding: For padding, the series should start and end exactly at the start and end of a major oscillation (e.g. a yearly cycle), and the length to use for padding should be an integer multiple of this length. The padding is solved internally by adding the indicated part of the series at the start and at the end of the series. This padded series is only used internally and only the part of the series with original data is returned in the results. Padding is not (yet) possible for two dimensional SSA.
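The padding idea can be sketched as follows. This is an illustration under assumptions, not the internal implementation: the function name pad.series.sketch is hypothetical, and it is assumed that the first pad.series[1] values are prepended and the last pad.series[2] values are appended, which keeps the phase continuous when the series spans whole cycles.

```r
## Sketch only: which part of the series is copied to which end is an assumption.
pad.series.sketch <- function(series, pad.series = c(0, 0)) {
  n <- length(series)
  start.pad <- head(series, pad.series[1])   # first values, prepended
  end.pad <- tail(series, pad.series[2])     # last values, appended
  padded <- c(start.pad, series, end.pad)
  ## indices of the original data inside the padded series
  idx.orig <- pad.series[1] + seq_len(n)
  list(padded = padded, idx.orig = idx.orig)
}

## ten full cycles of period 12, padded with one cycle on each side
series.demo <- sin(2 * pi * (0:119) / 12)
res <- pad.series.sketch(series.demo, pad.series = c(12, 12))
length(res$padded)                                 # 144
all.equal(res$padded[res$idx.orig], series.demo)   # TRUE
```

Only the values at idx.orig would be handed back to the user, mirroring the statement that the padded series is used internally only.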
Multidimensional SSA: 1d or 2d SSA is possible. If a vector is given, one dimensional SSA is computed. In case of a matrix as input, two dimensional SSA is performed. For the two dimensional case, two embedding dimensions should be given (one in the direction of each dimension). If 'big gaps' are set to be used for the cross validation, square blocks of gaps with the size 'size.biggap'*'size.biggap' are inserted.
list with components
error.occoured 
logical: whether an uncaught error occurred in one of the SSA calculations. 
filled.series 
numeric vector/matrix: filled series with the same length as series but without gaps. Gaps at the margins of the series can not be filled and will occur in filled.series (and reconstr). 
i.best 
integer matrix: inner loop iteration for each outer loop step in which the process finally converged (depending on the threshold determined by tresh.convergence). If the RMSE between two inner loop iterations has been decreasing monotonically (and hence the differences between SSA iterations can be expected to be rather small), this is set to amnt.iters[2]. If not, the process has most likely been building up, and this is set to 0. In both cases iloop.converged is set to FALSE. 
iloop.converged 
logical matrix: Whether each outer loop iteration has converged (see also i.best). 
iter.chosen 
integer vector: iterations finally chosen for the reconstruction. 
perf.all.gaps 
numeric matrix: performance (RMSE) for the filling of all artificial gaps. 
perf.small.gaps 
numeric matrix: performance (RMSE) for the filling of the small artificial gaps. 
perf.big.gaps 
numeric matrix: performance (RMSE) for the filling of the big artificial gaps. 
process.converged 
logical: Whether the whole process has converged. For simplicity reasons, this only detects whether the last outer loop of the final filling process has converged. 
reconstr 
numeric vector/matrix: filtered series or reconstruction finally used to fill gaps. 
recstr.diffsum 
numeric matrix: RMSE between two consecutive inner loop iterations. This value is checked to be below tresh.convergence to determine whether the process has converged. 
settings 
list: settings used to perform the calculation. 
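The convergence rule described for tresh.convergence and recstr.diffsum (the last three difference values between inner iterations must all fall below the threshold) can be sketched as below. The function name has.converged.sketch is hypothetical and this is not the package's internal code.

```r
## Sketch of the convergence rule; not the package's internal code.
has.converged.sketch <- function(diffsums, tresh.convergence = 0.01) {
  if (length(diffsums) < 3) return(FALSE)   # need three values to judge
  all(tail(diffsums, 3) < tresh.convergence)
}

has.converged.sketch(c(0.5, 0.1, 0.008, 0.004, 0.002))   # TRUE
has.converged.sketch(c(0.5, 0.1, 0.02, 0.004, 0.002))    # FALSE
```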
Jannis v. Buttlar
Kondrashov, D. & Ghil, M. (2006), Spatiotemporal filling of missing points in geophysical data sets, Nonlinear Processes in Geophysics, Vol. 13(2), pp. 151-159
Korobeynikov, A. (2010), Computation- and space-efficient implementation of SSA, Statistics and Its Interface, Vol. 3, No. 3, pp. 257-268
## create series with gaps
series.ex <- sin(2 * pi * 1:1000 / 100) + 0.7 * sin(2 * pi * 1:1000 / 10) +
  rnorm(n = 1000, sd = 0.4)
series.ex[sample(c(1:1000), 30)] <- NA
series.ex[c(seq(from = sample(c(1:1000), 1), length.out = 20),
            seq(from = sample(c(1:1000), 1), length.out = 20))] <- NA
indices.gaps <- is.na(series.ex)
## prepare graphics
layout(matrix(c(1:5, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7), ncol = 5, byrow = TRUE),
widths = c(1, 1, 1, 0.1, 0.1))
par(mar = c(2, 0, 0, 0.2), oma = c(0, 3, 2, 0.2), tcl = 0.2, mgp = c(0, 0, 100),
las = 1)
## perform gap filling
data.filled <- gapfillSSA(series = series.ex, plot.results = TRUE, open.plot = FALSE)
## plot series and filled series
plot(series.ex, xlab = '', pch = 16)
plot(data.filled$filled.series, col = indices.gaps+1, xlab = '', pch = 16)
points(data.filled$reconstr, type = 'l', col = 'blue')
mtext(side = 1, 'Index', line = 2)
legend(x = 'topright', merge = TRUE, pch = c(16, 16, NA), lty = c(NA, NA, 1),
col = c('black', 'red', 'blue'),
legend = c('original values', 'gap filled values', 'reconstruction'))
