pkern_LL: Likelihood of covariance model 'pars' given data 'g_obs'

View source: R/pkern_model.R

Likelihood of covariance model pars given data g_obs

Description

Computes the log-likelihood for the Gaussian process covariance model pars, given 2-dimensional grid data g_obs, and, optionally, linear trend data in X.

Usage

pkern_LL(
  pars,
  g_obs,
  X = 0,
  fac_method = "chol",
  fac = NULL,
  quiet = TRUE,
  more = FALSE
)

Arguments

pars

list of form returned by pkern_pars (with entries 'y', 'x', 'eps', 'psill')

g_obs

list of form returned by pkern_grid (with entries 'gdim', 'gres', 'gval')

X

numeric scalar or vector, matrix, or NA: a fixed (known) mean value, a matrix of linear predictors, or NA to estimate a constant mean by GLS (see Details)

fac_method

character, the factorization to use: 'chol' (default) or 'eigen'

fac

matrix or list, (optional) pre-computed covariance factorization

quiet

logical, indicating whether to suppress console output

more

logical, indicating whether to return a list of likelihood components

Details

The function evaluates:

-( n/2 ) * log( 2 * pi ) - ( 1/2 ) * ( log_det + quad_form ),

where n is the number of observed (non-NA) values, log_det is the logarithm of the determinant of the covariance matrix V, and quad_form is z^T V^-1 z for the observed response vector z. This z is constructed by subtracting the trend specified in X (if any) from the non-NA values in g_obs$gval.
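
As a minimal illustration of this formula using base R only (a toy 2-by-2 covariance matrix and response vector, not output from pkern functions):

V_toy = matrix(c(1, 0.5, 0.5, 1), 2, 2)
z_toy = c(0.3, -0.1)
n_toy = length(z_toy)
log_det_toy = as.numeric( determinant(V_toy, logarithm=TRUE)$modulus )
quad_form_toy = as.numeric( t(z_toy) %*% solve(V_toy, z_toy) )
(-1/2) * ( n_toy * log(2*pi) + log_det_toy + quad_form_toy )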

If the trend is known, it can be supplied in argument X as a numeric scalar or vector of length equal to the number of non-NA values in g_obs$gval, in matching order. Equivalently, users can simply subtract the trend from g_obs beforehand and set X=0 (the default). If the trend is unknown, the function optionally estimates it by GLS using the model pars. To estimate a spatially constant mean, set X=NA. To estimate a spatially variable mean, supply linear predictors as columns of a matrix argument to X (see pkern_GLS).
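
As a sketch, assuming pars and g_obs are defined as in the Examples below (and with X_covariates standing in for a hypothetical matrix of predictors at the observed points), the trend can be handled in any of these ways:

# known trend already subtracted from g_obs: leave X at its default of 0
pkern_LL(pars, g_obs)

# unknown, spatially constant mean: estimated internally by GLS
pkern_LL(pars, g_obs, X=NA)

# unknown, spatially variable mean: supply covariates as matrix columns
pkern_LL(pars, g_obs, X=X_covariates)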

fac_method specifies how to factorize V, either using the Cholesky factor ('chol') or eigen-decomposition ('eigen'). A pre-computed factorization fac can be supplied by first calling pkern_var(..., scaled=TRUE) (in which case fac_method is ignored).
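
For example (a sketch, assuming pars and g_obs as in the Examples below), a factorization computed once can be reused across repeated likelihood calls:

fac_pre = pkern_var(g_obs, pars, scaled=TRUE, fac_method='chol')
pkern_LL(pars, g_obs, fac=fac_pre) # fac_method is ignored when fac is supplied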

When more=TRUE, the function returns a list of diagnostics: the number of observations (n), the log-likelihood value (LL), and its two major components, the log-determinant (d) and the quadratic form (q).
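
A sketch of this output, assuming pars and g_obs as in the Examples below:

LL_parts = pkern_LL(pars, g_obs, more=TRUE)
LL_parts$n  # number of observations
LL_parts$LL # the log-likelihood value
LL_parts$d  # log-determinant component
LL_parts$q  # quadratic form component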

Value

numeric, the log-likelihood of pars given g_obs and X, or a list of components (if more=TRUE)

Examples

# set up example grid, covariance parameters
gdim = c(25, 12)
n = prod(gdim)
g_all = pkern_grid(gdim)
pars = modifyList(pkern_pars(g_all, 'gau'), list(psill=0.7, eps=5e-2))

# generate some covariates and complete data
n_betas = 3
betas = rnorm(n_betas)
X_all = cbind(1, matrix(rnorm(n*(n_betas-1)), n))
z = as.vector( pkern_sim(g_all) + (X_all %*% betas) )
g_all[['gval']] = z

# two methods for likelihood
LL_chol = pkern_LL(pars, g_all, fac_method='chol')
LL_eigen = pkern_LL(pars, g_all, fac_method='eigen')

# compare to working directly with matrix inverse
V = pkern_var(g_all, pars, fac_method='none', sep=FALSE)
V_inv = chol2inv(chol(V))
quad_form = as.numeric( t(z) %*% crossprod(V_inv, z) )
log_det = as.numeric( determinant(V, logarithm=TRUE) )[1]
LL_direct = (-1/2) * ( n * log( 2 * pi ) + log_det + quad_form )

# relative errors
abs( LL_direct - LL_chol ) / max(LL_direct, LL_chol)
abs( LL_direct - LL_eigen ) / max(LL_direct, LL_eigen)

# repeat with pre-computed variance factorization
fac_eigen = pkern_var(g_all, pars, fac_method='eigen', sep=TRUE)
pkern_LL(pars, g_all, fac=fac_eigen) - LL_eigen

# repeat with most data missing
n_obs = 50
idx_obs = sort(sample.int(n, n_obs))
z_obs = g_all$gval[idx_obs]
g_obs = modifyList(g_all, list(gval=rep(NA, n)))
g_obs[['gval']][idx_obs] = z_obs
LL_chol_obs = pkern_LL(pars, g_obs, fac_method='chol')
LL_eigen_obs = pkern_LL(pars, g_obs, fac_method='eigen')

# working directly with matrix inverse
V_obs = pkern_var(g_obs, pars, fac_method='none')
V_obs_inv = chol2inv(chol(V_obs))
quad_form_obs = as.numeric( t(z_obs) %*% crossprod(V_obs_inv, z_obs) )
log_det_obs = as.numeric( determinant(V_obs, logarithm=TRUE) )[1]
LL_direct_obs = (-1/2) * ( n_obs * log( 2 * pi ) + log_det_obs + quad_form_obs )
abs( LL_direct_obs - LL_chol_obs ) / max(LL_direct_obs, LL_chol_obs)
abs( LL_direct_obs - LL_eigen_obs ) / max(LL_direct_obs, LL_eigen_obs)

# again using a pre-computed variance factorization
fac_chol_obs = pkern_var(g_obs, pars, fac_method='chol', scaled=TRUE)
fac_eigen_obs = pkern_var(g_obs, pars, fac_method='eigen', scaled=TRUE)
pkern_LL(pars, g_obs, fac=fac_chol_obs) - LL_chol_obs
pkern_LL(pars, g_obs, fac=fac_eigen_obs) - LL_eigen_obs

# copy covariates (don't pass the intercept column in X)
X = X_all[idx_obs, -1]

# use GLS to de-trend, with and without covariates
g_detrend_obs = g_detrend_obs_X = g_obs
g_detrend_obs[['gval']][idx_obs] = z_obs - pkern_GLS(g_obs, pars)
g_detrend_obs_X[['gval']][idx_obs] = z_obs - pkern_GLS(g_obs, pars, X, out='z')

# pass X (or NA) to pkern_LL to do this automatically
LL_detrend_obs = pkern_LL(pars, g_detrend_obs)
LL_detrend_obs_X = pkern_LL(pars, g_detrend_obs_X)
LL_detrend_obs - pkern_LL(pars, g_obs, X=NA)
LL_detrend_obs_X - pkern_LL(pars, g_obs, X=X)

# equivalent sparse input specification
idx_grid = match(seq(n), idx_obs)
g_sparse = modifyList(g_all, list(gval=matrix(z_obs, ncol=1), idx_grid=idx_grid))
LL_chol_obs - pkern_LL(pars, g_sparse)
LL_eigen_obs - pkern_LL(pars, g_sparse)
LL_detrend_obs - pkern_LL(pars, g_sparse, X=NA)
LL_detrend_obs_X - pkern_LL(pars, g_sparse, X=X)

# repeat with complete data

# (don't pass the intercept column in X)
X = X_all[,-1]
LL_X_chol = pkern_LL(pars, g_all, X=X)
LL_X_eigen = pkern_LL(pars, g_all, fac_method='eigen', X=X)
z_obs = g_all$gval[!is.na(g_all$gval)]
V = pkern_var(g_all, pars, sep=FALSE)
V_inv = chol2inv(chol(V))
X_tilde_inv = chol2inv(chol( crossprod(crossprod(V_inv, X_all), X_all) ))
betas_gls = X_tilde_inv %*% crossprod(X_all, (V_inv %*% z_obs))
z_gls = z_obs - (X_all %*% betas_gls)
z_gls_trans = crossprod(V_inv, z_gls)
quad_form = as.numeric( t(z_gls) %*% z_gls_trans )
log_det = as.numeric( determinant(V, logarithm=TRUE) )[1]
LL_direct = (-1/2) * ( n * log( 2 * pi ) + log_det + quad_form )
abs( LL_direct - LL_X_chol ) / max(LL_direct, LL_X_chol)
abs( LL_direct - LL_X_eigen ) / max(LL_direct, LL_X_eigen)

# return components of likelihood with more=TRUE
LL_result = pkern_LL(pars, g_all, X=X, more=TRUE)
LL_result$LL - LL_X_chol
LL_result$q - quad_form
LL_result$d - log_det
LL_result$n - n

