This main function fills gaps in monovariate or multivariate data by SVD imputation, which is closely related to the expectation-maximization (EM) algorithm.
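To make the idea of SVD imputation concrete, here is a minimal sketch of an EM-like loop: gaps are given a first guess, a low-rank SVD reconstruction replaces the guessed values, and the loop repeats until the filled values stop changing. This is not the code of interpolsvd_em. The helper name svd_impute_sketch, the column-mean initial guess and the exact form of the stopping criterion are assumptions made for illustration only.

```r
# Hypothetical helper, not part of the package: iterative rank-k SVD imputation.
svd_impute_sketch <- function(y, ncomp = 1, niter = 30, threshold = 1e-5) {
  y <- as.matrix(y)
  miss <- is.na(y)
  if (!any(miss)) return(y)  # nothing to fill

  # Initial guess (assumption): replace each gap with its column mean.
  filled <- y
  filled[miss] <- colMeans(y, na.rm = TRUE)[col(y)[miss]]

  for (iter in seq_len(niter)) {
    # Rank-ncomp reconstruction of the current filled matrix.
    s <- svd(filled, nu = ncomp, nv = ncomp)
    low_rank <- s$u %*% diag(s$d[seq_len(ncomp)], nrow = ncomp) %*% t(s$v)

    # Relative change of the imputed values (assumed stopping measure).
    change <- sum((low_rank[miss] - filled[miss])^2) /
      max(sum(low_rank[miss]^2), .Machine$double.eps)

    # Only the gaps are updated; observed values are never modified.
    filled[miss] <- low_rank[miss]
    if (change < threshold) break
  }
  filled
}

# Toy data: a rank-1 matrix with three artificial gaps.
truth <- outer(1:10, c(1, 2, 3))
y_gappy <- truth
y_gappy[c(3, 17, 25)] <- NA
y_hat <- svd_impute_sketch(y_gappy, ncomp = 1)
```

In this sketch the observed entries pass through unchanged and only the NA positions are overwritten, which is the property that makes the loop resemble an EM iteration.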
interpolsvd_em(y, nembed = 1, nsmo = 0, ncomp = 0, threshold1 = 1e-05,
  niter = 30, displ = F)
y
a numeric data.frame or matrix of data with gaps
nembed
integer value controlling the embedding dimension (must be > 1 for monovariate data)
nsmo
integer value controlling the cutoff time scale, in number of samples. Set it to 0 if only a single time scale is desired.
ncomp
controls the number of significant components. It has to be specified to run in automatic mode. The default (0) leads to manual selection while the algorithm runs.
threshold1
numeric value controlling when the iterations stop: they end once the relative energy change is < threshold1
niter
numeric value controlling the maximum number of iterations
displ
logical value controlling whether some information is displayed in the console while the algorithm runs
The method decomposes the data into two time scales, which are processed separately and then merged at the end. The cutoff time scale (nsmo) is expressed in number of samples. A Gaussian filter is used for the filtering. Monovariate data must be embedded first (nembed > 1). In the initial data set, gaps must be marked with NA.
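The two preprocessing ideas described above, splitting a series into two time scales with a Gaussian filter and embedding a monovariate series, can be sketched as follows. This is only an illustration on a gap-free series. The helpers gauss_smooth and embed_series are hypothetical, and treating nsmo as the standard deviation of the Gaussian kernel is an assumption, since the exact mapping used by interpolsvd_em is not documented here.

```r
# Hypothetical helpers for illustration; not the package's internals.

# Gaussian-weighted moving average. The weights are renormalised at each
# point, so the edges of the series are handled without padding.
# Assumption: nsmo is used as the standard deviation of the kernel.
gauss_smooth <- function(x, nsmo) {
  idx <- seq_along(x)
  vapply(idx, function(i) {
    w <- dnorm(idx - i, sd = nsmo)
    sum(w * x) / sum(w)
  }, numeric(1))
}

# Time-delay embedding: row t holds x[t], x[t + 1], ..., x[t + nembed - 1].
embed_series <- function(x, nembed) {
  n_rows <- length(x) - nembed + 1
  vapply(seq_len(nembed), function(j) x[j:(j + n_rows - 1)], numeric(n_rows))
}

set.seed(42)
x <- sin(seq(0, 4 * pi, length.out = 200)) + rnorm(200, sd = 0.1)

slow <- gauss_smooth(x, nsmo = 10)  # long time scale
fast <- x - slow                    # short time scale; slow + fast gives x back

emb <- embed_series(x, nembed = 3)  # 198 x 3 matrix built from one series
```

Embedding turns a single series into a multi-column matrix, which is what gives the SVD something to decompose in the monovariate case.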
The three tuneable (hyper)parameters are:
ncomp
nsmo
nembed
A list with the following elements:
y.filled
The same dataset as y but with gaps filled
w.distSVD
The distribution of the weights of the initial SVD
Of the three tuneable parameters (ncomp, nsmo, nembed), only the first, ncomp, really affects the outcome. A separation into only two scales (with a cutoff between 50 and 100 days) is enough to properly capture both short- and long-term evolutions, and embedding dimensions of D = 2 to 5 are usually adequate for reconstructing daily averages. The optimum parameters are best determined, and the results validated, by cross-validation.
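One possible way to carry out such a cross-validation is to hide a fraction of the observed values, impute them, and compare the imputed values with the hidden ones. The sketch below is an assumption about how this could be organised, not a procedure taken from the package. The helpers cv_rmse and mean_impute are hypothetical, and mean_impute is only a stand-in so that the example runs without the package.

```r
# Hypothetical helper: hold out a fraction of the observed values, impute
# them with impute_fun, and return the RMSE on the held-out values.
cv_rmse <- function(y, impute_fun, frac = 0.1, seed = 1) {
  y <- as.matrix(y)
  set.seed(seed)
  obs <- which(!is.na(y))
  held <- sample(obs, size = max(1, floor(frac * length(obs))))
  y_masked <- y
  y_masked[held] <- NA  # artificial gaps with known true values
  filled <- impute_fun(y_masked)
  sqrt(mean((filled[held] - y[held])^2))
}

# Stand-in imputer (assumption): fill each gap with its column mean.
mean_impute <- function(m) {
  miss <- is.na(m)
  m[miss] <- colMeans(m, na.rm = TRUE)[col(m)[miss]]
  m
}

set.seed(123)
m <- matrix(rnorm(200), nrow = 50, ncol = 4)
rmse_mean <- cv_rmse(m, mean_impute)

# With the package loaded, candidate values of ncomp could be compared by
# passing a wrapper around the documented interface, for example:
#   cv_rmse(y, function(g) interpolsvd_em(g, nembed = 2, nsmo = 81,
#                                         ncomp = 4)$y.filled)
```

Repeating the call for several values of ncomp and keeping the one with the lowest held-out error is one way to choose the parameter that matters most.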
Antoine Pissoort, antoine.pissoort@student.uclouvain.be
# Take this as input, as advised in the test.m file
y <- sqrt(data.mat2.fin + 1)  # Selected randomly here, for testing
options(mc.cores = parallel::detectCores())  # use all available cores
z <- interpolsvd_em(y, nembed = 2, nsmo = 81, ncomp = 4,
                    niter = 30, displ = FALSE)
# 393 sec for the whole dataset (with some stations discarded)
# Then apply the inverse transformation to obtain the final dataset with filled values
z <- z$y.filled
z_final <- z * z - 1
z_final[z_final < 0] <- 0  # clip negative values produced by the back-transformation