smooth.FEM  R Documentation 
This function implements a spatial regression model with differential regularization. The regularizing term involves a Partial Differential Equation (PDE). In the simplest case the PDE involves only the Laplacian of the spatial field, that induces an isotropic smoothing. When prior information about the anisotropy or nonstationarity is available the PDE involves a general second order linear differential operator with possibly spacevarying coefficients. The technique accurately handle data distributed over irregularly shaped domains. Moreover, various conditions can be imposed at the domain boundaries
smooth.FEM(locations = NULL, observations, FEMbasis, covariates = NULL, PDE_parameters = NULL, BC = NULL, incidence_matrix = NULL, areal.data.avg = TRUE, search = "tree", bary.locations = NULL, family = "gaussian", mu0 = NULL, scale.param = NULL, threshold.FPIRLS = 0.0002020, max.steps.FPIRLS = 15, lambda.selection.criterion = "grid", DOF.evaluation = NULL, lambda.selection.lossfunction = NULL, lambda = NULL, DOF.stochastic.realizations = 100, DOF.stochastic.seed = 0, DOF.matrix = NULL, GCV.inflation.factor = 1, lambda.optimization.tolerance = 0.05, inference.data.object=NULL)
locations 
A #observationsby2 matrix in the 2D case and #observationsby3 matrix in the 2.5D and 3D case, where
each row specifies the spatial coordinates 
observations 
A vector of length #observations with the observed data values over the domain.
If the 
FEMbasis 
A 
covariates 
A #observationsby#covariates matrix where each row represents the covariates associated with
the corresponding observed data value in 
PDE_parameters 
A list specifying the parameters of the PDE in the regularizing term. Default is NULL, i.e.
regularization is by means of the Laplacian (stationary, isotropic case).
If the coefficients of the PDE are constant over the domain
If the coefficients of the PDE are spacevarying
For 2.5D and 3D, only the Laplacian is available ( 
BC 
A list with two vectors:

incidence_matrix 
A #regionsby#triangles/tetrahedrons matrix where the element (i,j) equals 1 if the jth
triangle/tetrahedron is in the ith region and 0 otherwise.
This is needed only for areal data. In case of pointwise data, this parameter is set to 
areal.data.avg 
Boolean. It involves the computation of Areal Data. If 
search 
a flag to decide the search algorithm type (tree or naive or walking search algorithm). 
bary.locations 
A list with three vectors:

family 
This parameter specify the distibution within exponential family used for GLM model.
The following distribution are implemented: "binomial", "exponential", "gamma", "poisson", "gaussian", "invgaussian".
The default link function for binomial is 
mu0 
This parameter is a vector that set the starting point for FPIRLS algorithm. It represent an initial guess of the location parameter.
Default is set to observation for non binary distribution while equal to 
scale.param 
Dispersion parameter of the chosen distribution. This is only required for "gamma", "gaussian", "invgaussian". User may specify the parameter as a positive real number. If the parameter is not supplied, it is estimated from data according to Wilhelm Sangalli 2016. 
threshold.FPIRLS 
This parameter is used for arresting algorithm iterations. Algorithm stops when two successive iterations lead to improvement in penalized loglikelihood smaller than threshold.FPIRLS.
Default value 
max.steps.FPIRLS 
This parameter is used to limit the maximum number of iteration.
Default value 
lambda.selection.criterion 
This parameter is used to select the optimization method for the smoothing parameter 
DOF.evaluation 
This parameter is used to identify if and how to perform degrees of freedom computation.
The following possibilities are allowed: NULL, 'exact' and 'stochastic'
In the former case no degree of freedom is computed, while the other two methods enable computation.
Stochastic computation of DOFs may be slightly less accurate than its deterministic counterpart, but it is fairly less time consuming. Stochastic evaluation is highly suggested for meshes with more than 5000 nodes.
Default value 
lambda.selection.lossfunction 
This parameter is used to determine if some loss function has to be evaluated.
The following possibilities are allowed: NULL and 'GCV' (generalized cross validation)
If NULL is selected, 
lambda 
a vector of spatial smoothing parameters provided if 
DOF.stochastic.realizations 
This positive integer is considered only when 
DOF.stochastic.seed 
This positive integer is considered only when 
DOF.matrix 
Matrix of degrees of freedom. This parameter can be used if the DOF.matrix corresponding to 
GCV.inflation.factor 
Tuning parameter used for the estimation of GCV. Default value 
lambda.optimization.tolerance 
Tolerance parameter, a double between 0 and 1 that fixes how much precision is required by the optimization method: the smaller the parameter, the higher the accuracy.
Used only if 
inference.data.object 
An 
A list with the following variables:
fit.FEM
A FEM
object that represents the fitted spatial field.
PDEmisfit.FEM
A FEM
object that represents the Laplacian of the estimated spatial field.
solution
A list, note that all terms are matrices or row vectors: the j
th column represents the vector related to lambda[j]
if lambda.selection.criterion="grid"
and lambda.selection.lossfunction=NULL
.
In all the other cases, only the column related to the best smoothing parameter is returned.
f
Matrix, estimate of function f, first half of solution vector.
g
Matrix, second half of solution vector.
z_hat
Matrix, prediction of the output in the locations.
beta
If covariates
is not NULL
, a matrix with number of rows equal to the number of covariates and number of columns equal to length of lambda. It is the regression coefficients estimate.
rmse
Estimate of the root mean square error in the locations.
estimated_sd
Estimate of the standard deviation of the error.
optimization
A detailed list of optimization related data:
lambda_solution
numerical value of best lambda according to lambda.selection.lossfunction
, 1 if lambda.selection.lossfunction=NULL
.
lambda_position
integer, position in lambda_vector
of best lambda according to lambda.selection.lossfunction
, 1 if lambda.selection.lossfunction=NULL
.
GCV
numeric value of GCV in correspondence of the optimum.
optimization_details
list containing further information about the optimization method used and the nature of its termination, eventual number of iterations.
dof
vector of positive numbers, DOFs for all the lambdas in lambda_vector
, empty or invalid if not computed.
lambda_vector
vector of positive numbers: penalizations either passed by the user or found in the iterations of the optimization method.
GCV_vector
vector of positive numbers, GCV values for all the lambdas in lambda_vector
time
Duration of the entire optimization computation.
bary.locations
Barycenter information of the given locations, if the locations are not mesh nodes.
GAM_output
A list of GAM related data:
fn_hat
A matrix with number of rows equal to number of locations and number of columns equal to length of lambda. Each column contains the evaluaton of the spatial field in the location points.
J_minima
A vector of the same length of lambda, containing the reached minima for each value of the smoothing parameter.
variance.est
A vector which returns the variance estimates for the Generative Additive Models.
inference
A list set only if a well defined [inferenceDataObject] is passed as parameter to the function; contains all inference outputs required:
p_values
list of lists set only if at least one pvalue is required; contains the pvalues divided by implementation:
wald
list containing all the Wald pvalues required, in the same order of the type
list in inference.data.object
. If oneatthetime tests are required, the corresponding item is a vector of p values ordered as the rows of coeff
matrix in inference.data.object
.
speckman
list containing all the Speckman pvalues required, in the same order of the type
list in inference.data.object
. If oneatthetime tests are required, the corresponding item is a vector of p values ordered as the rows of coeff
matrix in inference.data.object
.
eigen_sign_flip
list containing all the EigenSignFlip pvalues required, in the same order of the type
list in inference.data.object
. If oneatthetime tests are required, the corresponding item is a vector of p values ordered as the rows of coeff
matrix in inference.data.object
.
CI
list of lists set only if at least one confidence interval is required; contains the confidence intervals divided by implementation:
wald
list containing all the Wald confidence intervals required, in the same order of the type
list in inference.data.object
. Each item is a matrix with 3 columns and p rows, p being the number of rows of coeff
matrix in inference.data.object
; each row is the CI for the corresponding row of coeff
matrix.
speckman
list containing all the Speckman confidence intervals required, in the same order of the type
list in inference.data.object
. Each item is a matrix with 3 columns and p rows, p being the number of rows of coeff
matrix in inference.data.object
; each row is the CI for the corresponding row of coeff
matrix.
Sangalli, L. M., Ramsay, J. O., Ramsay, T. O. (2013). Spatial spline regression models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75(4), 681703.
Azzimonti, L., Sangalli, L. M., Secchi, P., Domanin, M., Nobile, F. (2015). Blood flow velocity field estimation via spatial regression with PDE penalization. Journal of the American Statistical Association, 110(511), 10571071.
Matthieu Wilhelm & Laura M. Sangalli (2016). Generalized spatial regression with differential regularization. Journal of Statistical Computation and Simulation, 86:13, 24972518.
Federico Ferraccioli, Laura M. Sangalli & Livio Finos (2021). Some first inferential tools for spatial regression with differential regularization. Journal of Multivariate Analysis, to appear
library(fdaPDE) #### No prior information about anysotropy/nonstationarity (laplacian smoothing) #### data(horseshoe2D) boundary_nodes = horseshoe2D$boundary_nodes boundary_segments = horseshoe2D$boundary_segments locations = horseshoe2D$locations mesh = create.mesh.2D(nodes = rbind(boundary_nodes, locations), segments = boundary_segments) FEMbasis = create.FEM.basis(mesh) lambda = 10^1 # no covariate data = fs.test(mesh$nodes[,1], mesh$nodes[,2]) + rnorm(nrow(mesh$nodes), sd = 0.5) solution = smooth.FEM(observations = data, FEMbasis = FEMbasis, lambda = lambda) plot(solution$fit.FEM) # with covariates covariate = covs.test(mesh$nodes[,1], mesh$nodes[,2]) data = fs.test(mesh$nodes[,1], mesh$nodes[,2]) + 2*covariate + rnorm(nrow(mesh$nodes), sd = 0.5) #Inferential tests and confidence intervals inference.data.object = inferenceDataObjectBuilder(test = 'oat', type = 'w', dim = 2, n_cov = 1) solution = smooth.FEM(observations = data, covariates = covariate, FEMbasis = FEMbasis, lambda = lambda, inference.data.object=inference.data.object) # beta estimate: solution$solution$beta # tests over beta estimates pvalues: solution$inference$beta$p_values # confidence intervals for beta estimates: solution$inference$beta$CI # nonparametric estimate: plot(solution$fit.FEM) # Choose lambda with GCV  stochastic grid evaluation: lambda = 10^(2:0) solution = smooth.FEM(observations = data, covariates = covariate, FEMbasis = FEMbasis, lambda = lambda, DOF.evaluation = 'stochastic', lambda.selection.lossfunction = 'GCV') bestLambda = solution$optimization$lambda_solution # Choose lambda with GCV  Newton finite differences stochastic evaluation : solution = smooth.FEM(observations = data, covariates = covariate, FEMbasis = FEMbasis, DOF.evaluation = 'stochastic', lambda.selection.lossfunction = 'GCV') bestLambda = solution$optimization$lambda_solution #### Smoothing with prior information about anysotropy/nonstationarity and boundary conditions #### # See Azzimonti et al. for reference to the current example data(quasicircle2D) boundary_nodes = quasicircle2D$boundary_nodes boundary_segments = quasicircle2D$boundary_segments locations = quasicircle2D$locations data = quasicircle2D$data mesh = create.mesh.2D(nodes = rbind(boundary_nodes, locations), segments = boundary_segments) FEMbasis = create.FEM.basis(mesh) lambda = 10^2 # Set the PDE parameters R = 2.8 K1 = 0.1 K2 = 0.2 beta = 0.5 K_func<function(points) { output = array(0, c(2, 2, nrow(points))) for (i in 1:nrow(points)) output[,,i]=10*rbind(c(points[i,2]^2+K1*points[i,1]^2+K2*(R^2points[i,1]^2points[i,2]^2), (K11)*points[i,1]*points[i,2]), c((K11)*points[i,1]*points[i,2], points[i,1]^2+K1*points[i,2]^2+K2*(R^2points[i,1]^2points[i,2]^2))) output } b_func<function(points) { output = array(0, c(2, nrow(points))) for (i in 1:nrow(points)) output[,i] = 10*beta*c(points[i,1],points[i,2]) output } c_func<function(points) { rep(c(0), nrow(points)) } u_func<function(points) { rep(c(0), nrow(points)) } PDE_parameters = list(K = K_func, b = b_func, c = c_func, u = u_func) # Set the boundary conditions BC = NULL BC$BC_indices = which(mesh$nodesmarkers == 1) # b.c. on the complete boundary BC$BC_values = rep(0,length(BC$BC_indices)) # homogeneus b.c. # Since the data locations are a subset of the mesh nodes for a faster solution use: dataNA = rep(NA, FEMbasis$nbasis) dataNA[mesh$nodesmarkers == 0] = data #grid evaluation solution = smooth.FEM(observations = dataNA, FEMbasis = FEMbasis, lambda = lambda, PDE_parameters = PDE_parameters, BC = BC) plot(solution$fit.FEM) image(solution$fit.FEM) # Newton's method solution = smooth.FEM(observations = dataNA, FEMbasis = FEMbasis, PDE_parameters = PDE_parameters, BC = BC) plot(solution$fit.FEM) image(solution$fit.FEM) #### Smoothing with areal data #### # See Azzimonti et al. for reference to the current exemple data(quasicircle2Dareal) incidence_matrix = quasicircle2Dareal$incidence_matrix data = quasicircle2Dareal$data mesh = quasicircle2Dareal$mesh FEMbasis = create.FEM.basis(mesh) lambda = 10^4 # Set the PDE parameters R = 2.8 K1 = 0.1 K2 = 0.2 beta = 0.5 K_func<function(points) { output = array(0, c(2, 2, nrow(points))) for (i in 1:nrow(points)) output[,,i]=10*rbind(c(points[i,2]^2+K1*points[i,1]^2+K2*(R^2points[i,1]^2points[i,2]^2), (K11)*points[i,1]*points[i,2]), c((K11)*points[i,1]*points[i,2], points[i,1]^2+K1*points[i,2]^2+K2*(R^2points[i,1]^2points[i,2]^2))) output } b_func<function(points) { output = array(0, c(2, nrow(points))) for (i in 1:nrow(points)) output[,i] = 10*beta*c(points[i,1],points[i,2]) output } c_func<function(points) { rep(c(0), nrow(points)) } u_func<function(points) { rep(c(0), nrow(points)) } PDE_parameters = list(K = K_func, b = b_func, c = c_func, u = u_func) # Set the boundary conditions BC = NULL BC$BC_indices = which(mesh$nodesmarkers == 1) # b.c. on the complete boundary BC$BC_values = rep(0,length(BC$BC_indices)) # homogeneus b.c. #grid evaluation solution = smooth.FEM(observations = data, incidence_matrix = incidence_matrix, FEMbasis = FEMbasis, lambda = lambda, PDE_parameters = PDE_parameters, BC = BC) plot(solution$fit.FEM) image(solution$fit.FEM) #Newton's method solution = smooth.FEM(observations = data, incidence_matrix = incidence_matrix, FEMbasis = FEMbasis, PDE_parameters = PDE_parameters, BC = BC) plot(solution$fit.FEM) image(solution$fit.FEM)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.