pkern_var: Generate a covariance matrix or its factorization

View source: R/pkern_var.R

pkern_varR Documentation

Generate a covariance matrix or its factorization

Description

Computes the covariance matrix V (or one of its factorizations) for the non-NA points in grid g_obs, given the model parameters list pars

Usage

pkern_var(
  g_obs,
  pars = NULL,
  scaled = FALSE,
  fac_method = "none",
  X = NULL,
  fac = NULL,
  sep = FALSE
)

Arguments

g_obs

list of form returned by pkern_grid (with entries 'gdim', 'gres', 'gval')

pars

list of form returned by pkern_pars (with entries 'y', 'x', 'eps', 'psill')

scaled

logical, whether to scale by 1/pars$psill

fac_method

character, the factorization to return, one of 'none', 'chol', 'eigen'

X

numeric matrix, the X in t(X) %*% V %*% X (default is identity)

fac

matrix or list of matrices, the variance factorization (only used with X)

sep

logical, indicating to return correlation components instead of full covariance matrix

Details

By default the output matrix is V. Alternatively, if X is supplied, the function returns the quadratic form X^T V^-1 X.

When fac_method=='eigen' the function instead returns the eigen-decomposition of the output matrix, and when fac_method=='chol' its lower triangular Cholesky factor is returned. Supplying this factorization in argument fac in a subsequent call with X can speed up calculations. fac is ignored when X is not supplied.

scaled=TRUE returns the matrix scaled by the reciprocal of the partial sill, 1/pars$psill, before factorization. This is the form expected by functions pkern_var_mult and pkern_LL in argument fac.

If all grid points are observed, then the output V becomes separable. Setting sep=TRUE in this case causes the function to returns the x and y component correlation matrices (or their factorizations, as requested in fac_method) separately, in a list. scaled has no effect in this output mode. Note also that sep has no effect when X is supplied, as the quadratic form is not generally separable (regardless of separability in V^-1).

Missing data are identified by looking for NAs in the data vector g_obs$gval. If all are NA (or if 'gval' is missing from g_obs), the function behaves as though all grid points are observed. For multi-layer input, NAs are instead determined from g_obs$idx_grid and 'gval' is ignored (see ?pkern_grid).

Note: when pars$eps>0, the 'eigen' factorization method will be more robust than 'chol' in handling numerical precision issues with poorly conditioned covariance matrices.

Value

either matrix V, or X^T V^-1 X, or a factorization ('chol' or 'eigen')

Examples

# define example grid with NAs and example predictors matrix
gdim = c(12, 13)
n = prod(gdim)
n_obs = floor(n/3)
idx_obs = sort(sample.int(n, n_obs))
g = g_obs = pkern_grid(gdim)
g_obs$gval[idx_obs] = rnorm(n_obs)

# example kernel
psill = 0.3
pars = pkern_pars(g_obs) |> modifyList(list(psill=psill))

# plot the covariance matrix for observed data, its cholesky factor and eigen-decomposition
V_obs = pkern_var(g_obs, pars)
V_obs_chol = pkern_var(g_obs, pars, fac_method='chol')
V_obs_eigen = pkern_var(g_obs, pars, fac_method='eigen')
pkern_plot(V_obs)
pkern_plot(V_obs_chol)
pkern_plot(V_obs_eigen$vectors)

# case when there are no NAs (or no data at all)
g_nodata = modifyList(g_obs, list(gval=NULL))

# get the full covariance matrix with sep=FALSE (default)
V_full = pkern_var(g_nodata, pars)
max(abs( V_obs - V_full[idx_obs, idx_obs] ))

# get 1d correlation matrices with sep=TRUE...
corr_components = pkern_var(g_nodata, pars, sep=TRUE)
str(corr_components)

# ... these are related to the full covariance matrix by psill and eps
corr_mat = kronecker(corr_components[['x']], corr_components[['y']])
V_full_compare = pars$psill * corr_mat + diag(pars$eps, n)
max(abs(V_full - V_full_compare))

# ... their factorizations can be returned as (nested) lists
str(pkern_var(g_nodata, pars, fac_method='chol', sep=TRUE))
str(pkern_var(g_nodata, pars, fac_method='eigen', sep=TRUE))

# compare to the full covariance matrix factorizations (default sep=FALSE)
str(pkern_var(g_nodata, pars, fac_method='chol'))
str(pkern_var(g_nodata, pars, fac_method='eigen'))

# test quadratic form with X
nX = 3
X_all = cbind(1, matrix(rnorm(nX * n), ncol=nX))
cprod_all = crossprod(X_all, chol2inv(chol(V_full))) %*% X_all
abs(max(pkern_var(g, pars, X=X_all) - cprod_all ))

# test products with inverse of quadratic form with X
mult_test = rnorm(nX+1)
cprod_all_inv = chol2inv(chol(cprod_all))
cprod_all_inv_chol = pkern_var(g, pars, X=X_all, scaled=TRUE, fac_method='eigen')
pkern_var_mult(mult_test, pars, fac=cprod_all_inv_chol) - cprod_all_inv %*% mult_test

# repeat with missing data
X_obs = X_all[idx_obs,]
cprod_obs = crossprod(X_obs, chol2inv(chol(V_obs))) %*% X_obs
abs(max(pkern_var(g_obs, pars, X=X_obs) - cprod_obs ))
cprod_obs_inv = chol2inv(chol(cprod_obs))
cprod_obs_inv_chol = pkern_var(g_obs, pars, X=X_obs, scaled=T, fac_method='eigen')
pkern_var_mult(mult_test, pars, fac=cprod_obs_inv_chol) - cprod_obs_inv %*% mult_test

# `scaled` indicates to divide matrix by psill
print( pars[['eps']]/pars[['psill']] )
diag(pkern_var(g_obs, pars, scaled=TRUE)) # diagonal elements equal to 1 + eps/psill
( pkern_var(g_obs, pars) - psill * pkern_var(g_obs, pars, scaled=TRUE) ) |> abs() |> max()
( pkern_var(g_obs, pars, X=X_obs, scaled=TRUE) - ( cprod_obs/psill ) ) |> abs() |> max()

# in cholesky factor this produces a scaling by square root of psill
max(abs( V_obs_chol - sqrt(psill) * pkern_var(g_obs, pars, fac_method='chol', scaled=TRUE) ))

# and in the eigendecomposition, a scaling of the eigenvalues
vals_scaled = pkern_var(g_obs, pars, fac_method='eigen', scaled=TRUE)$values
max(abs( pkern_var(g_obs, pars, fac_method='eigen')$values - psill*vals_scaled ))


deankoch/pkern documentation built on Oct. 26, 2023, 8:54 p.m.