View source: R/kernel_methods.R
CF | R Documentation |
This function performs control functionals as described in Oates et al (2017).
To choose between different kernels using cross-validation, use CF_crossval
.
CF( integrands, samples, derivatives, steinOrder = NULL, kernel_function = NULL, sigma = NULL, K0 = NULL, est_inds = NULL, one_in_denom = FALSE, diagnostics = FALSE )
integrands |
An N by k matrix of integrands (evaluations of the function of interest) |
samples |
An N by d matrix of samples from the target |
derivatives |
An N by d matrix of derivatives of the log target with respect to the parameters |
steinOrder |
(optional) This is the order of the Stein operator. The default is |
kernel_function |
(optional) Choose between "gaussian", "matern", "RQ", "product" or "prodsim". See below for further details. |
sigma |
(optional) The tuning parameters of the specified kernel. This involves a single length-scale parameter in "gaussian" and "RQ", a length-scale and a smoothness parameter in "matern" and two parameters in "product" and "prodsim". See below for further details. |
K0 |
(optional) The kernel matrix. One can specify either this or all of |
est_inds |
(optional) A vector of indices for the estimation-only samples. The default when |
one_in_denom |
(optional) Whether or not to include a 1 + in the denominator of the control functionals estimator, as in equation 2 on p703 of Oates et al (2017). The 1 + in the denominator is an arbitrary choice so we set it to zero by default. |
diagnostics |
(optional) A flag for whether to return the necessary outputs for plotting or estimating using the fitted model. The default is |
A list with the following elements:
expectation
: The estimate(s) of the (k) expectation(s).
f_true
: (Only if est_inds
is not NULL
) The integrands for the evaluation set. This should be the same as integrands[setdiff(1:N,est_inds),].
f_hat
: (Only if est_inds
is not NULL
) The fitted values for the integrands in the evaluation set. This can be used to help assess the performance of the Gaussian process model.
a
: (Only if diagnostics
= TRUE
) The value of a as described in South et al (2020), where predictions are of the form f_hat = K0*a + 1*b for heldout K0 and estimators using heldout samples are of the form mean(f - f_hat) + b.
b
: (Only if diagnostics
= TRUE
) The value of b as described in South et al (2020), where predictions are of the form f_hat = K0*a + 1*b for heldout K0 and estimators using heldout samples are of the form mean(f - f_hat) + b.
ksd
: (Only if diagnostics
= TRUE
) An estimated kernel Stein discrepancy based on the fitted model that can be used for diagnostic purposes. See South et al (2020) for further details.
bound_const
: (Only if diagnostics
= TRUE
and est_inds
=NULL
) This is such that the absolute error for the estimator should be less than ksd \times bound_const.
Solving the linear system in CF has O(N^3) complexity and is therefore not suited to large N. Using est_inds will instead have an O(N_0^3) cost in solving the linear system and an O((N-N_0)^2) cost in handling the remaining samples, where N_0 is the length of est_inds. This can be much cheaper for large N.
The kernel in Stein-based kernel methods is L_x L_y k(x,y) where L_x is a first or second order Stein operator in x and k(x,y) is some generic kernel to be specified.
The Stein operators for distribution p(x) are defined as:
steinOrder=1
: L_x g(x) = \nabla_x^T g(x) + \nabla_x \log p(x)^T g(x) (see e.g. Oates el al (2017))
steinOrder=2
: L_x g(x) = Δ_x g(x) + \nabla_x log p(x)^T \nabla_x g(x) (see e.g. South el al (2020))
Here \nabla_x is the first order derivative wrt x and Δ_x = \nabla_x^T \nabla_x is the Laplacian operator.
The generic kernels which are implemented in this package are listed below. Note that the input parameter sigma
defines the kernel parameters σ.
"gaussian"
: A Gaussian kernel,
k(x,y) = exp(-z(x,y)/σ^2)
"matern"
: A Matern kernel with σ = (λ,ν),
k(x,y) = bc^{ν}z(x,y)^{ν/2}K_{ν}(c z(x,y)^{0.5})
where b=2^{1-ν}(Γ(ν))^{-1}, c=(2ν)^{0.5}λ^{-1} and K_{ν}(x) is the modified Bessel function of the second kind. Note that λ is the length-scale parameter and ν is the smoothness parameter (which defaults to 2.5 for steinOrder=1 and 4.5 for steinOrder=2).
"RQ"
: A rational quadratic kernel,
k(x,y) = (1+σ^{-2}z(x,y))^{-1}
"product"
: The product kernel that appears in Oates et al (2017) with σ = (a,b)
k(x,y) = (1+a z(x) + a z(y))^{-1} exp(-0.5 b^{-2} z(x,y))
"prodsim"
: A slightly different product kernel with σ = (a,b) (see e.g. https://www.imperial.ac.uk/inference-group/projects/monte-carlo-methods/control-functionals/),
k(x,y) = (1+a z(x))^{-1}(1 + a z(y))^{-1} exp(-0.5 b^{-2} z(x,y))
In the above equations, z(x) = ∑_j x[j]^2 and z(x,y) = ∑_j (x[j] - y[j])^2. For the last two kernels, the code only has implementations for steinOrder
=1
. Each combination of steinOrder
and kernel_function
above is currently hard-coded but it may be possible to extend this to other kernels in future versions using autodiff. The calculations for the first three kernels above are detailed in South et al (2020).
Leah F. South
Oates, C. J., Girolami, M. & Chopin, N. (2017). Control functionals for Monte Carlo integration. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79(3), 695-718.
South, L. F., Karvonen, T., Nemeth, C., Girolami, M. and Oates, C. J. (2020). Semi-Exact Control Functionals From Sard's Method. https://arxiv.org/abs/2002.00033
See ZVCV for examples and related functions. See CF_crossval
for a function to choose between different kernels for this estimator.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.