ImputeEOF | R Documentation |
Imputes missing values via Data Interpolating Empirical Orthogonal Functions (DINEOF).
ImputeEOF(
formula,
max.eof = NULL,
data = NULL,
min.eof = 1,
tol = 0.01,
max.iter = 10000,
validation = NULL,
verbose = interactive()
)
formula |
a formula to build the matrix that will be used in the SVD decomposition (see Details) |
max.eof , min.eof |
maximum and minimum number of singular values used for imputation |
data |
a data.frame |
tol |
tolerance used for determining convergence |
max.iter |
maximum iterations allowed for the algorithm |
validation |
number of points to use in cross-validation (defaults to the maximum of 30 or 10% of the non NA points) |
verbose |
logical indicating whether to print progress |
Singular values can be computed over matrices so formula
denotes how
to build a matrix from the data. It is a formula of the form VAR ~ LEFT | RIGHT
(see Formula::Formula) in which VAR is the variable whose values will
populate the matrix, and LEFT represent the variables used to make the rows
and RIGHT, the columns of the matrix.
Think it like "VAR as a function of LEFT and RIGHT".
Alternatively, if value.var
is not NULL
, it's possible to use the
(probably) more familiar data.table::dcast formula interface. In that case,
data
must be provided.
If data
is a matrix, the formula
argument is ignored and the function
returns a matrix.
A vector of imputed values with attributes eof
, which is the number of
singular values used in the final imputation; and rmse
, which is the Root
Mean Square Error estimated from cross-validation.
Beckers, J.-M., Barth, A., and Alvera-Azcárate, A.: DINEOF reconstruction of clouded images including error maps – application to the Sea-Surface Temperature around Corsican Island, Ocean Sci., 2, 183-199, \Sexpr[results=rd]{tools:::Rd_expr_doi("10.5194/os-2-183-2006")}, 2006.
library(data.table)
data(geopotential)
geopotential <- copy(geopotential)
geopotential[, gh.t := Anomaly(gh), by = .(lat, lon, month(date))]
# Add gaps to field
geopotential[, gh.gap := gh.t]
set.seed(42)
geopotential[sample(1:.N, .N*0.3), gh.gap := NA]
max.eof <- 5 # change to a higher value
geopotential[, gh.impute := ImputeEOF(gh.gap ~ lat + lon | date, max.eof,
verbose = TRUE, max.iter = 2000)]
library(ggplot2)
ggplot(geopotential[date == date[1]], aes(lon, lat)) +
geom_contour(aes(z = gh.t), color = "black") +
geom_contour(aes(z = gh.impute))
# Scatterplot with a sample.
na.sample <- geopotential[is.na(gh.gap)][sample(1:.N, .N*0.1)]
ggplot(na.sample, aes(gh.t, gh.impute)) +
geom_point()
# Estimated RMSE
attr(geopotential$gh.impute, "rmse")
# Real RMSE
geopotential[is.na(gh.gap), sqrt(mean((gh.t - gh.impute)^2))]
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.