pkern_var: Generate a covariance matrix or its factorization
In deankoch/pkern: Fast Kriging with Product Kernels

pkern_var

R Documentation

Generate a covariance matrix or its factorization

Description

Computes the covariance matrix V (or one of its factorizations) for the non-NA points in grid g_obs, given the model parameters list pars

Usage

pkern_var(
  g_obs,
  pars = NULL,
  scaled = FALSE,
  fac_method = "none",
  X = NULL,
  fac = NULL,
  sep = FALSE
)

Arguments

`g_obs`	list of form returned by `pkern_grid` (with entries 'gdim', 'gres', 'gval')
`pars`	list of form returned by `pkern_pars` (with entries 'y', 'x', 'eps', 'psill')
`scaled`	logical, whether to scale by `1/pars$psill`
`fac_method`	character, the factorization to return, one of 'none', 'chol', 'eigen'
`X`	numeric matrix, the `X` in `t(X) %% V %% X` (default is identity)
`fac`	matrix or list of matrices, the variance factorization (only used with X)
`sep`	logical, indicating to return correlation components instead of full covariance matrix

Details

By default the output matrix is V. Alternatively, if X is supplied, the function returns the quadratic form X^T V^-1 X.

When fac_method=='eigen' the function instead returns the eigen-decomposition of the output matrix, and when fac_method=='chol' its lower triangular Cholesky factor is returned. Supplying this factorization in argument fac in a subsequent call with X can speed up calculations. fac is ignored when X is not supplied.

scaled=TRUE returns the matrix scaled by the reciprocal of the partial sill, 1/pars$psill, before factorization. This is the form expected by functions pkern_var_mult and pkern_LL in argument fac.

If all grid points are observed, then the output V becomes separable. Setting sep=TRUE in this case causes the function to returns the x and y component correlation matrices (or their factorizations, as requested in fac_method) separately, in a list. scaled has no effect in this output mode. Note also that sep has no effect when X is supplied, as the quadratic form is not generally separable (regardless of separability in V^-1).

Missing data are identified by looking for NAs in the data vector g_obs$gval. If all are NA (or if 'gval' is missing from g_obs), the function behaves as though all grid points are observed. For multi-layer input, NAs are instead determined from g_obs$idx_grid and 'gval' is ignored (see ?pkern_grid).

Note: when pars$eps>0, the 'eigen' factorization method will be more robust than 'chol' in handling numerical precision issues with poorly conditioned covariance matrices.

Value

either matrix V, or X^T V^-1 X, or a factorization ('chol' or 'eigen')

Examples

# define example grid with NAs and example predictors matrix
gdim = c(12, 13)
n = prod(gdim)
n_obs = floor(n/3)
idx_obs = sort(sample.int(n, n_obs))
g = g_obs = pkern_grid(gdim)
g_obs$gval[idx_obs] = rnorm(n_obs)

# example kernel
psill = 0.3
pars = pkern_pars(g_obs) |> modifyList(list(psill=psill))

# plot the covariance matrix for observed data, its cholesky factor and eigen-decomposition
V_obs = pkern_var(g_obs, pars)
V_obs_chol = pkern_var(g_obs, pars, fac_method='chol')
V_obs_eigen = pkern_var(g_obs, pars, fac_method='eigen')
pkern_plot(V_obs)
pkern_plot(V_obs_chol)
pkern_plot(V_obs_eigen$vectors)

# case when there are no NAs (or no data at all)
g_nodata = modifyList(g_obs, list(gval=NULL))

# get the full covariance matrix with sep=FALSE (default)
V_full = pkern_var(g_nodata, pars)
max(abs( V_obs - V_full[idx_obs, idx_obs] ))

# get 1d correlation matrices with sep=TRUE...
corr_components = pkern_var(g_nodata, pars, sep=TRUE)
str(corr_components)

# ... these are related to the full covariance matrix by psill and eps
corr_mat = kronecker(corr_components[['x']], corr_components[['y']])
V_full_compare = pars$psill * corr_mat + diag(pars$eps, n)
max(abs(V_full - V_full_compare))

# ... their factorizations can be returned as (nested) lists
str(pkern_var(g_nodata, pars, fac_method='chol', sep=TRUE))
str(pkern_var(g_nodata, pars, fac_method='eigen', sep=TRUE))

# compare to the full covariance matrix factorizations (default sep=FALSE)
str(pkern_var(g_nodata, pars, fac_method='chol'))
str(pkern_var(g_nodata, pars, fac_method='eigen'))

# test quadratic form with X
nX = 3
X_all = cbind(1, matrix(rnorm(nX * n), ncol=nX))
cprod_all = crossprod(X_all, chol2inv(chol(V_full))) %*% X_all
abs(max(pkern_var(g, pars, X=X_all) - cprod_all ))

# test products with inverse of quadratic form with X
mult_test = rnorm(nX+1)
cprod_all_inv = chol2inv(chol(cprod_all))
cprod_all_inv_chol = pkern_var(g, pars, X=X_all, scaled=TRUE, fac_method='eigen')
pkern_var_mult(mult_test, pars, fac=cprod_all_inv_chol) - cprod_all_inv %*% mult_test

# repeat with missing data
X_obs = X_all[idx_obs,]
cprod_obs = crossprod(X_obs, chol2inv(chol(V_obs))) %*% X_obs
abs(max(pkern_var(g_obs, pars, X=X_obs) - cprod_obs ))
cprod_obs_inv = chol2inv(chol(cprod_obs))
cprod_obs_inv_chol = pkern_var(g_obs, pars, X=X_obs, scaled=T, fac_method='eigen')
pkern_var_mult(mult_test, pars, fac=cprod_obs_inv_chol) - cprod_obs_inv %*% mult_test

# `scaled` indicates to divide matrix by psill
print( pars[['eps']]/pars[['psill']] )
diag(pkern_var(g_obs, pars, scaled=TRUE)) # diagonal elements equal to 1 + eps/psill
( pkern_var(g_obs, pars) - psill * pkern_var(g_obs, pars, scaled=TRUE) ) |> abs() |> max()
( pkern_var(g_obs, pars, X=X_obs, scaled=TRUE) - ( cprod_obs/psill ) ) |> abs() |> max()

# in cholesky factor this produces a scaling by square root of psill
max(abs( V_obs_chol - sqrt(psill) * pkern_var(g_obs, pars, fac_method='chol', scaled=TRUE) ))

# and in the eigendecomposition, a scaling of the eigenvalues
vals_scaled = pkern_var(g_obs, pars, fac_method='eigen', scaled=TRUE)$values
max(abs( pkern_var(g_obs, pars, fac_method='eigen')$values - psill*vals_scaled ))

deankoch/pkern documentation built on Oct. 26, 2023, 8:54 p.m.