# VDP: Example of estimation using SMC In ZVCV: Zero-Variance Control Variates

## Description

This example illustrates how ZV-CV can be used for post-processing of results from likelihood-annealing SMC. In particular, we use ZV-CV to estimate posterior expectations and the evidence for a single SMC run of this example based on the Van der Pol oscillatory differential equations (Van der Pol, 1926). Further details about this example and applications to ZV-CV can be found in Oates et al. (2017) and South et al. (2019).

Given that the focus of this R package is on ZV-CV, we assume that samples have already been obtained from SMC and put into the correct format. One could use the R package `RcppSMC` or implement their own sampler in order to obtain results like this. The key is to make sure the derivatives of the log likelihood and log prior are stored, along with the inverse temperatures.

## Usage

 `1` ```data(VDP) ```

## Format

A list containing the following :

N

The size of the SMC population

rho

The tolerance for the new temperatures, which are selected so that the CESS at each temperature is ρ*N where N is the population size.

temperatures

A vector of length T of inverse power posterior temperatures

samples

An N by d by T matrix of samples from the T power posteriors, where d is the dimension of the target distribution. The samples are transformed to be on the log scale and all derivatives are with respect to log samples.

loglike

An N by T matrix of log likelihood values corresponding to `samples`

logprior

An N by T matrix of log prior values corresponding to `samples`

der_loglike

An N by d by T matrix of the derivatives of the log likelihood with respect to the parameters, with parameter values corresponding to `samples`

der_logprior

An N by d by T matrix of the derivatives of the log prior with respect to the parameters, with parameter values corresponding to `samples`

## References

Oates, C. J., Girolami, M. & Chopin, N. (2017). Control functionals for Monte Carlo integration. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79(3), 695-718.

South, L. F., Oates, C. J., Mira, A., & Drovandi, C. (2019). Regularised zero-variance control variates for high-dimensional variance reduction.

Van der Pol, B. (1926). On relaxation-oscillations. The London, Edinburgh and Dublin Philosophical Magazine and Journal of Science, 2(11), 978-992.

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57``` ```set.seed(1) # Load the SMC results data(VDP) # Set up the list of control variates to choose from options <- list() # Vanilla Monte Carlo options[] <- list(polyorder = 0) # Standard ZV-CV with polynomial order selected through cross-validation options[] <- list(polyorder = Inf, regul_reg = FALSE) ############################## # Posterior expectation - The true expectation is 0.9852 to 4 decimal places ############################## # Note the exp() because samples and derivatives were stored on the log scale # but we are interested in the expectation on the original scale posterior <- zvcv(exp(VDP\$samples[,,8]), VDP\$samples[,,8], VDP\$der_loglike[,,8] + VDP\$der_logprior[,,8], options = options) posterior\$expectation # The posterior expectation estimate posterior\$polyorder # The selected polynomial order ############################## # Evidence estimation - The true logged evidence is 10.36 to 2 decimal places ############################## # Getting additional temperatures based on maintaing a CESS of 0.91N rather than 0.9N. # The value 0.91 is used for speed but South et al. (2019) use 0.99. temp <- Expand_Temperatures(VDP\$temperatures, VDP\$loglike, 0.91) VDP\$temperatures_new <- temp\$temperatures_all # the new temperatures VDP\$most_recent <- temp\$relevant_samples # the samples associated with the new temperatures n_sigma <- 3 # For speed, South et al. (2019) uses 15 sigma_list <- as.list( 10^(0.5*seq(-3,4,length.out=n_sigma)) ) # Evidence estimation using the SMC identity Z_SMC <- evidence_SMC(VDP\$samples, VDP\$loglike, VDP\$der_loglike, VDP\$der_logprior, VDP\$temperatures, VDP\$temperatures_new, VDP\$most_recent, options = options) Z_SMC\$log_evidence # Evidence estimation using the SMC identity Z_SMC_CF <- evidence_SMC_CF(VDP\$samples, VDP\$loglike, VDP\$der_loglike, VDP\$der_logprior, VDP\$temperatures, VDP\$temperatures_new, VDP\$most_recent, steinOrder = 2, kernel_function = "gaussian", sigma_list = sigma_list, folds = 2) Z_SMC_CF\$log_evidence # Evidence estimation using the CTI identity Z_CTI <- evidence_CTI(VDP\$samples, VDP\$loglike, VDP\$der_loglike, VDP\$der_logprior, VDP\$temperatures, VDP\$temperatures_new, VDP\$most_recent, options = options) Z_CTI\$log_evidence_PS2 # Evidence estimation using the CTI identity Z_CTI_CF <- evidence_CTI_CF(VDP\$samples, VDP\$loglike, VDP\$der_loglike, VDP\$der_logprior, VDP\$temperatures, VDP\$temperatures_new, VDP\$most_recent, steinOrder = 2, kernel_function = "gaussian", sigma_list = sigma_list, folds = 2) Z_CTI_CF\$log_evidence_PS2 ```