Description Usage Arguments 1. Semiparametric model 2. Mean model 3. Covariance model and nugget effect 4. Penalized maximum likelihood estimation 5. Penalties 6. Mixed model representation Author(s)
Fit a spatial semiparametric model based on splines including unknown changes in the pattern of response.
1 2 3 4 |
formula |
formula. An expression to specify the model to fit. See 2. Mean model. |
data |
sss class object. A dataset object generated by any of the commands as.sss or create.sss. |
initial |
named list. The starting values for the covariance parameters of the model. If |
contrasts |
character. A contrast method for factor covariates. Default to contr.treatment. |
model |
character. Name of the semivariogram model to estimate for the spatial dependence. See Semivariogram Model. |
fix.nugget |
logical or numeric. If |
fix.kappa |
logical or numeric vector. If |
nugget.tol |
numeric. Threshold for microscale spatial variations to define the nugget effect. Default to |
angle |
numeric. Angle for geometric anisotropy. Note that this overwrites any specification for |
ratio |
numeric. Ratio between [0,1] for geometric anisotropy. Note that this overwrites any specification for |
use.reml |
logical. For using REML estimation set to |
use.profile |
logical. For profiling set to |
chMaxiter |
integer. Maximum number of iterations for the loop estimating changes in the pattern of response. |
control |
named list. Options to control the optimization. See argument |
Assume that the response variable admit the trend surface model
Y(s) = a^T b + g(s) + ε(s)
where a is a known vector of covariates and b their coefficients; g(s) is a deterministic bivariate spline and ε(s) is a Gaussian spatial process (GSP) with mean zero and covariance depending only on the distance h and given by Cov(ε(s+h),ε(s)). This model is also called a trend surface model. Given n observed locations s_1,…,s_n\in S\subset\Re^2 in a two-dimensional space, then the model is
Y = Ab + g + ε
where Y = (Y(s_1), …, Y(s_n))^T, A is the known matrix of covariates, g = (g(s_1),…,g(s_n))^T and ε = (ε(s_1),…,ε(s_n))^T. The covariance matrix is given by Cov(ε,ε) = Σ = σ^2 R with R a valid correlation matrix. Thus Y\sim N_n(μ,Σ) where μ = Ab + g and the likelihood function is L(b,g,σ^2,θ;Y) = (2π)^{-n/2}|Σ|^{-1/2}\exp\{-\frac{1}{2}(Y-μ)^TΣ^{-1}(Y-μ)\} with θ the parameters that define the correlation matrix R.
It can be defined by the commands:
linear
that defines the covariates in the matrix A. Note that more than one linear
command can be defined. See linear.
cp
defines changes in the pattern of response by including the covariates (z_d - ψ^{(0)}_d)\times 1\{z_d>ψ^{(0)}_d\} and -1\{z_d>ψ^{(0)}_d\} for d=1,…,G into the matrix A. Note that more than one cp
command can be defined. See cp.
s2D
define the bivariate spline g. Note that only one s2D
command can be defined. See s2D.
Given a distance h define u = ||T^{1/2}_{\texttt{\tiny angle, ratio}}h|| = (h^T T_{\texttt{\tiny angle,ratio}} h)^{1/2}\in\Re where T_{\texttt{\tiny angle, ratio}} is a rotation matrix for geometric anisotropy. The errors are given by the process ε(s) = η(s) + ξ(s) where ξ is a GSP with mean zero and covariance
\begin{array}{ll}Cov(ξ(s),ξ(s+h)) & = C_ξ(u;φ,κ) \\ & = σ^2_0ρ_ξ(u;φ,κ)\end{array}
with ρ_ξ(u;φ,κ) the correlation function; and η is a nugget effect with covariance
\begin{array}{ll}Cov(η(s),η(s+h)) & = C_η(u;τ^2,\texttt{tol.nugget}) \\ & = τ^2ρ_η(u;\texttt{tol.nugget})\end{array}
with correlation function ρ_η(u;\texttt{tol.nugget}) = 1\{u<\texttt{tol.nugget}\}. Therefore the covariance of the process ε is given by
\begin{array}{ll} Cov(ε(s),ε(s+h)) & = C_ε(u;σ^2,θ,\texttt{tol.nugget}) \\ & = σ^2 ρ_ε(u;θ,\texttt{tol.nugget})\end{array}
with correlation function given by
ρ_ε(u;θ,\texttt{tol.nugget}) = (1-ρ_*)ρ_η(u;\texttt{tol.nugget}) + ρ_*ρ_ξ(u;φ,κ)
where θ = (ρ_*,φ,κ)^T are the parameters with ρ_* = σ^2_0/σ^2, σ^2 = τ^2 + σ^2_0, and tol.nugget
is the argument that controls the largest distance at which micro-scale variations can affect the observed outcome. By default tol.nugget
is set to 1.0e-15
. The parameters φ,κ define the correlation function of the process ξ with φ usually called the range parameter and κ depending on the model selected. The semivariogram can be expressed as
γ_ε(u;σ^2,θ,\texttt{tol.nugget}) = σ^2(1-ρ_ε(u;θ,\texttt{tol.nugget}))
where τ^2 is the nugget effect, σ^2 is the sill, and σ^2_0 is the partial sill.
Note that when angle = 0
and ratio = 1
the matrix T_{\texttt{\tiny angle,ratio}} is an identity matrix and u = h so the correlation ρ_ε(u;θ,\texttt{tol.nugget}) is isotropic. Use different values for angle
and ratio
to define a geometric anisotropic correlation function. Then the covariance matrix Σ = σ^2 R where R is the correlation matrix originated from ρ_ε(u;θ,\texttt{tol.nugget}).
It is possible to define the argument model=name
where name
is one of the following: matern, powered.exponential, spherical, wave, exponential, gaussian, cubic, circular, gencauchy, cauchy, RMmatern, RMwhittle, RMgneiting, and RMnugget. For .semiVar
one of matern, gaussian, exponential, power, cubic, penta.spherical, spherical, wave, sin.hole, pure.nugget and identity. By default the covariance model is set to exponential with angle=0
and ratio=1
.
Estimation can be performed by maximisation with respect to b,g,σ^2,θ, and α of the penalized log likelihood
\ell_p(b,g,σ^2,θ,α) = \log(L(b,g,σ^2,θ;Y)) - \frac{1}{2σ^2} J_α(g)
where J_α(g) = g^T Q_α g is the penalty and Q_α is the roughness matrix.
Depending on the type of spline assumed for g the penalty is defined differently depending on the roughness matrix Q_α which is given by:
Given τ_{1,1},…,τ_{1,K_1} and τ_{2,1},…,τ_{2,K_2} the design points in each coordinate then
Q_α = α_1 I_{K_2}\otimes Q_1 + α_2 Q_2\otimes I_{K_1}
where Q_1, Q_2 are unidimensional roughness matrices from the design points in each coordinate and α_1, α_2 are smoothing parameters in each direction.
Given the n locations, Q_α = α E where α is the smoothing parameter and the n\times n matrix E has elements E_{i,j} = \vartheta(||s_i - s_j||) for i,j=1,…,n where
\vartheta(u) = ≤ft\{\begin{array}{ll}\frac{1}{16π}\times u^2\log(u^2) & \textrm{ , $u>0$} \\ 0 & \textrm{ , otherwise.}\end{array}\right.
The spline can be written as g = Xβ + Zr with β and r the coefficients and X and Z design matrices conveniently defined. Then for the observed responses the model can be expressed as a the mixed model
Y = Ab + Xβ + Zr + ε
where r \sim Normal(0,I_V) with V the number of columns in Z. Then, Y\sim N_n(μ_m, Σ) where μ_m = Ab + Xβ and Σ = σ^2 R; and Y|r \sim N_n(μ,V) where μ = Ab + Xβ + Zr and V = ZZ^T + Σ. Let us denote \vartheta = (b,β,σ^2,θ,α)^T, then the conditional log-likelihood of the model is given by
\ell(\vartheta|r) \propto -\frac{1}{2}≤ft\{\log|Σ|+(Y-μ)^TΣ^{-1}(Y-μ)\right\}
and the marginal log-likelihood is given by
\ell(\vartheta) \propto -\frac{1}{2}≤ft\{\log|V|+(Y-Ab-Xβ)^T V^{-1}(Y-Ab-Xβ)\right\}.
Mario A. Martinez Araya, r@marioma.me
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.