ZVCV_package: Zero-Variance Control Variates


Description

This package can be used to perform post-hoc variance reduction of Monte Carlo estimators when derivatives of the log target are available. The main functionality is provided through the functions zvcv, CF, SECF and aSECF, along with their cross-validation counterparts CF_crossval, SECF_crossval and aSECF_crossval. All of these take a set of N d-dimensional samples, the associated derivatives of the log target, and the k functions of interest, and they return estimates of the posterior expectations of those functions.
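For example, a minimal sketch of the basic workflow on a standard Gaussian toy target (for which the derivative of the log density at x is simply -x) might look as follows; the default zvcv options are assumed here.

library(ZVCV)

N <- 100; d <- 2
x <- matrix(rnorm(N*d), ncol = d)  # N samples from the N(0, I_d) target
u <- -x                            # derivatives of the log target at the samples
f <- x[,1]                         # a single integrand evaluated at the samples

mean(f)                    # vanilla Monte Carlo estimate
zvcv(f, x, u)$expectation  # ZV-CV estimate (default options assumed)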

ZV-CV is exact for polynomials of order at most polyorder under Gaussian targets and is fast for large N (although setting a limit on the order through polyorder_max is recommended when N is large). CF is a non-parametric approach that offers convergence rates which are better than the standard Monte Carlo rate. SECF has both a parametric and a non-parametric component, offering the advantages of both at an additional computational cost. This cost is reduced in aSECF by using Nyström approximations and conjugate gradient methods.
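Continuing the sketch above, the kernel-based estimators can be run with a pre-computed kernel matrix; the median heuristic, rational quadratic kernel and steinOrder = 2 below simply mirror the Examples section and are not the only possible choices.

sigma <- medianTune(x)  # kernel tuning parameter via the median heuristic
K0 <- K0_fn(x, u, sigma, steinOrder = 2, kernel_function = "RQ")
CF(f, x, u, K0 = K0)$expectation    # control functionals
SECF(f, x, u, K0 = K0)$expectation  # semi-exact control functionals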

Helper functions

The package also exports helper functions such as medianTune, which selects a kernel tuning parameter via the median heuristic, and K0_fn, which pre-computes the kernel matrix used by CF, SECF and aSECF; both are demonstrated in the sketch above and in the Examples below.

Evidence estimation

This package also provides functions for estimating the evidence (the normalising constant of the posterior) as described in South et al. (2018). These are relevant when sequential Monte Carlo with an annealing schedule has been used to collect the samples, and are therefore not of interest to users performing variance reduction on samples from vanilla MCMC.

The function Expand_Temperatures can be used to adjust the temperature schedule so that it is more (or less) strict than the original schedule of T temperatures.

Author(s)

Leah F. South

References

Mira, A., Solgi, R. and Imparato, D. (2013). Zero variance Markov chain Monte Carlo for Bayesian estimators. Statistics and Computing, 23(5), 653-662.

South, L. F., Karvonen, T., Nemeth, C., Girolami, M. and Oates, C. J. (2020). Semi-Exact Control Functionals From Sard's Method. https://arxiv.org/abs/2002.00033

South, L. F., Oates, C. J., Mira, A. and Drovandi, C. (2018). Regularised zero-variance control variates for high-dimensional variance reduction. https://arxiv.org/abs/1811.05073

Examples

# A real data example using ZV-CV is available in the VDP helpfile (see ?VDP).
# This involves estimating posterior expectations and the evidence from SMC samples.

# The remainder of this section reproduces (albeit with a different random
# seed) Figure 2a of South et al. (2020).

N_repeats <- 2 # For speed, the actual code uses 100 
N_all <- 25 # For speed, the actual code uses c(10,25,50,100,250,500,1000) 
sigma_list <- list(10^(-1.5),10^(-1),10^(-0.5),1,10^(0.5),10)
nfolds <- 4 # For speed, the actual code uses 10
folds <- 2 # For speed, the actual code uses 5
d <- 4

integrand_fn <- function(x){
  return (1 + x[,2] + 0.1*x[,1]*x[,2]*x[,3] + sin(x[,1])*exp(-(x[,2]*x[,3])^2))
}

results <- data.frame()
for (N in N_all){

  # identify the largest polynomial order that can be fit without regularisation for auto ZV-CV
  max_r <- 0
  while (choose(d + max_r + 1,d)<((folds-1)/folds*N)){
    max_r <- max_r + 1
  }

  MC <- ZV1 <- ZV2 <- ZVchoose <- rep(NaN,N_repeats)
  CF <- SECF1 <- aSECF1 <- SECF2 <- aSECF2 <- rep(NaN,N_repeats)
  CF_medHeur <- SECF1_medHeur <- aSECF1_medHeur <- rep(NaN,N_repeats)
  SECF2_medHeur <- aSECF2_medHeur <- rep(NaN,N_repeats)
  for (i in 1:N_repeats){     
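    # Simulating from the d-dimensional standard Gaussian target; the derivative
    # of its log density at x is simply -x.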
    x <- matrix(rnorm(N*d),ncol=d)
    u <- -x
    f <- integrand_fn(x)
    
    MC[i] <- mean(f)
    ZV1[i] <- zvcv(f,x,u,options=list(polyorder=1,regul_reg=FALSE))$expectation
    # Checking if the sample size is large enough to accommodate a second order polynomial
    if (N > choose(d+2,d)){
      ZV2[i] <- zvcv(f,x,u,options=list(polyorder=2,regul_reg=FALSE))$expectation
    }
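    # ZVchoose: zvcv with a list of candidate option sets, with the best set
    # selected by cross-validation over `folds` folds.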
    myopts <- list(list(polyorder=Inf,regul_reg=FALSE,polyorder_max=max_r),
        list(polyorder=Inf,nfolds=nfolds))
    ZVchoose[i] <- zvcv(f,x,u,options=myopts,folds = folds)$expectation
    
    # Calculating the kernel matrix in advance for CF and SECF
    K0_list <- list()
    for (j in 1:length(sigma_list)){
      K0_list[[j]] <- K0_fn(x,u,sigma_list[[j]],steinOrder=2,kernel_function="RQ")
    }
    
    CF[i] <- CF_crossval(f,x,u,K0_list=K0_list,folds = folds)$expectation
    SECF1[i] <- SECF_crossval(f,x,u,K0_list=K0_list,folds = folds)$expectation
    aSECF1[i] <- aSECF_crossval(f,x,u,steinOrder=2,kernel_function="RQ",
        sigma_list=sigma_list,reltol=1e-05,folds = folds)$expectation
    if (max_r>=2){
      SECF2[i] <- SECF_crossval(f,x,u,polyorder=2,K0_list=K0_list,folds = folds)$expectation
      aSECF2[i] <- aSECF_crossval(f,x,u,polyorder=2,steinOrder=2,kernel_function="RQ",
          sigma_list=sigma_list,reltol=1e-05,folds = folds)$expectation
    }

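    # Repeating the kernel-based estimators with the median heuristic in place
    # of cross-validation for the kernel tuning parameter.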
    medHeur <- medianTune(x)
    K0_medHeur <- K0_fn(x,u,medHeur,steinOrder=2,kernel_function="RQ")
    CF_medHeur[i] <- CF(f,x,u,K0=K0_medHeur)$expectation
    SECF1_medHeur[i] <- SECF(f,x,u,K0=K0_medHeur)$expectation
    aSECF1_medHeur[i] <- aSECF(f,x,u,steinOrder=2,kernel_function="RQ",
          sigma=medHeur,reltol=1e-05)$expectation
    if (max_r>=2){
      SECF2_medHeur[i] <- SECF(f,x,u,polyorder=2,K0=K0_medHeur)$expectation
      aSECF2_medHeur[i] <- aSECF(f,x,u,polyorder=2,steinOrder=2,kernel_function="RQ",
          sigma=medHeur,reltol=1e-05)$expectation
    }
    
    # print(sprintf("--%d",i))
  }
  # Adding the results to a data frame
  MSE_crude <- mean((MC - 1)^2)
  results <- rbind(results,data.frame(N=N, order = "1 or NA",
      efficiency = 1, type = "MC")) 
  results <- rbind(results,data.frame(N=N, order = "1 or NA",
      efficiency = MSE_crude/mean((ZV1 - 1)^2), type = "ZV")) 
  results <- rbind(results,data.frame(N=N, order = "2",
      efficiency = MSE_crude/mean((ZV2 - 1)^2), type = "ZV")) 
  results <- rbind(results,data.frame(N=N, order = "1 or NA",
      efficiency = MSE_crude/mean((ZVchoose - 1)^2), type = "ZVchoose")) 
  results <- rbind(results,data.frame(N=N, order = "1 or NA",
      efficiency = MSE_crude/mean((CF - 1)^2), type = "CF")) 
  results <- rbind(results,data.frame(N=N, order = "1 or NA",
      efficiency = MSE_crude/mean((SECF1 - 1)^2), type = "SECF")) 
  results <- rbind(results,data.frame(N=N, order = "1 or NA",
      efficiency = MSE_crude/mean((aSECF1 - 1)^2), type = "aSECF")) 
  if (((folds-1)/folds*N) > choose(d+2,d)){
    results <- rbind(results,data.frame(N=N, order = "2",
      efficiency = MSE_crude/mean((SECF2 - 1)^2), type = "SECF")) 
    results <- rbind(results,data.frame(N=N, order = "2",
      efficiency = MSE_crude/mean((aSECF2 - 1)^2), type = "aSECF")) 
  }

  results <- rbind(results,data.frame(N=N, order = "1 or NA",
      efficiency = MSE_crude/mean((CF_medHeur - 1)^2), type = "CF_medHeur")) 
  results <- rbind(results,data.frame(N=N, order = "1 or NA",
      efficiency = MSE_crude/mean((SECF1_medHeur - 1)^2), type = "SECF_medHeur")) 
  results <- rbind(results,data.frame(N=N, order = "1 or NA",
      efficiency = MSE_crude/mean((aSECF1_medHeur - 1)^2), type = "aSECF_medHeur")) 
  if (((folds-1)/folds*N) > choose(d+2,d)){
    results <- rbind(results,data.frame(N=N, order = "2",
      efficiency = MSE_crude/mean((SECF2_medHeur - 1)^2), type = "SECF_medHeur")) 
    results <- rbind(results,data.frame(N=N, order = "2",
      efficiency = MSE_crude/mean((aSECF2_medHeur - 1)^2), type = "aSECF_medHeur")) 
  }
  # print(N)
}


## Not run: 
# Plotting results where cross-validation is used for kernel methods
require(ggplot2)
require(ggthemes)
a <- ggplot(data=subset(results,!(type %in% c("CF_medHeur","SECF_medHeur",
  "aSECF_medHeur"))),
  aes(x=N,y=efficiency,col=type,linetype=order)) + scale_color_pander() + 
  ggtitle("") + geom_line(size=1.5) + scale_x_log10() + scale_y_log10() + 
  annotation_logticks(base=10) + labs(x="N",y="Efficiency",color="Method",
  linetype="Polynomial Order") + theme_minimal(base_size = 15) +
  theme(legend.key.size = unit(0.5, "cm"),legend.key.width =  unit(1, "cm")) +
  guides(linetype = guide_legend(override.aes = list(size=1),title.position = "top"),
  color = guide_legend(override.aes = list(size=1),title.position = "top"))
print(a)


# Plotting results where the median heuristic is used for kernel methods
b <- ggplot(data=subset(results,!(type %in% c("CF","SECF","aSECF"))),
            aes(x=N,y=efficiency,col=type,linetype=order)) + scale_color_pander() + 
  ggtitle("") + geom_line(size=1.5) + scale_x_log10() + scale_y_log10() + 
  annotation_logticks(base=10) + labs(x="N",y="Efficiency",color="Method",
  linetype="Polynomial Order") + theme_minimal(base_size = 15) +
  theme(legend.key.size = unit(0.5, "cm"),legend.key.width =  unit(1, "cm")) +
  guides(linetype = guide_legend(override.aes = list(size=1),title.position = "top"),
  color = guide_legend(override.aes = list(size=1),title.position = "top"))
print(b)

## End(Not run)
