Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/marginalLikelihood.R

Calcluated the marginal likelihood from a set of MCMC samples

1 | ```
marginalLikelihood(sampler, numSamples = 1000, method = "Chib", ...)
``` |

`sampler` |
an MCMC or SMC sampler or list, or for method "Prior" also a BayesianSetup |

`numSamples` |
number of samples to use. How this works, and if it requires recalculating the likelihood, depends on the method |

`method` |
method to choose. Currently available are "Chib" (default), the harmonic mean "HM", sampling from the prior "Prior", and bridge sampling "Bridge". See details |

`...` |
further arguments passed to |

The marginal likelihood is the average likelihood across the prior space. It is used, for example, for Bayesian model selection and model averaging.

It is defined as

*ML = \int L(Θ) p(Θ) dΘ*

Given that MLs are calculated for each model, you can get posterior weights (for model selection and/or model averaging) on the model by

*P(M_i|D) = ML_i * p(M_i) / (∑_i ML_i * p(M_i) )*

In BT, we return the log ML, so you will have to exp all values for this formula.

It is well-known that the ML is VERY dependent on the prior, and in particular the choice of the width of uninformative priors may have major impacts on the relative weights of the models. It has therefore been suggested to not use the ML for model averaging / selection on uninformative priors. If you have no informative priors, and option is to split the data into two parts, use one part to generate informative priors for the model, and the second part for the model selection. See help for an example.

The marginalLikelihood function currently implements four ways to calculate the marginal likelihood. Be aware that marginal likelihood calculations are notoriously prone to numerical stability issues. Especially in high-dimensional parameter spaces, there is no guarantee that any of the implemented algorithms will converge reasonably fast. The recommended (and default) method is the method "Chib" (Chib and Jeliazkov, 2001), which is based on MCMC samples, with a limited number of additional calculations. Despite being the current recommendation, note there are some numeric issues with this algorithm that may limit reliability for larger dimensions.

The harmonic mean approximation, is implemented only for comparison. Note that the method is numerically unrealiable and usually should not be used.

The third method is simply sampling from the prior. While in principle unbiased, it will only converge for a large number of samples, and is therefore numerically inefficient.

The Bridge method uses bridge sampling as implemented in the R package "bridgesampling". It is potentially more exact than the Chib method, but might require more computation time. However, this may be very dependent on the sampler.

A list with log of the marginal likelihood, as well as other diagnostics depending on the chose method

Florian Hartig

Chib, Siddhartha, and Ivan Jeliazkov. "Marginal likelihood from the Metropolis-Hastings output." Journal of the American Statistical Association 96.453 (2001): 270-281.

Dormann et al. 2018. Model averaging in ecology: a review of Bayesian, information-theoretic, and tactical approaches for predictive inference. Ecological Monographs

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 | ```
##############################################################
# Comparison of ML for two regression models
# Creating test data with quadratic relationship
sampleSize = 30
x <- (-(sampleSize-1)/2):((sampleSize-1)/2)
y <- 1 * x + 1*x^2 + rnorm(n=sampleSize,mean=0,sd=10)
# plot(x,y, main="Test Data")
# likelihoods for linear and quadratic model
likelihood1 <- function(param){
pred = param[1] + param[2]*x + param[3] * x^2
singlelikelihoods = dnorm(y, mean = pred, sd = 1/(param[4]^2), log = TRUE)
return(sum(singlelikelihoods))
}
likelihood2 <- function(param){
pred = param[1] + param[2]*x
singlelikelihoods = dnorm(y, mean = pred, sd = 1/(param[3]^2), log = TRUE)
return(sum(singlelikelihoods))
}
setUp1 <- createBayesianSetup(likelihood1,
lower = c(-5,-5,-5,0.01),
upper = c(5,5,5,30))
setUp2 <- createBayesianSetup(likelihood2,
lower = c(-5,-5,0.01),
upper = c(5,5,30))
out1 <- runMCMC(bayesianSetup = setUp1)
M1 = marginalLikelihood(out1, start = 1000)
out2 <- runMCMC(bayesianSetup = setUp2)
M2 = marginalLikelihood(out2, start = 1000)
### Calculating Bayes factor
exp(M1$ln.ML - M2$ln.ML)
# BF > 1 means the evidence is in favor of M1. See Kass, R. E. & Raftery, A. E.
# (1995) Bayes Factors. J. Am. Stat. Assoc., Amer Statist Assn, 90, 773-795.
### Calculating Posterior weights
exp(M1$ln.ML) / ( exp(M1$ln.ML) + exp(M2$ln.ML))
# If models have different model priors, multiply with the prior probabilities of each model.
## Not run:
#############################################################
# Fractional Bayes factor
# Motivation: ML is very dependent on the prior, which is a problem if you
# have uninformative priors. you can see this via rerunning the upper
# example with changed priors - suddenly, support for M1 is gone
setUp1 <- createBayesianSetup(likelihood1,
lower = c(-500,-500,-500,0.01),
upper = c(500,500,500,3000))
setUp2 <- createBayesianSetup(likelihood2,
lower = c(-500,-500,0.01),
upper = c(500,500,3000))
out1 <- runMCMC(bayesianSetup = setUp1)
M1 = marginalLikelihood(out1, start = 1000)
out2 <- runMCMC(bayesianSetup = setUp2)
M2 = marginalLikelihood(out2, start = 1000)
### Calculating Bayes factor
exp(M1$ln.ML - M2$ln.ML)
# it has therefore been suggested that ML should not be calculated on uninformative priors. But
# what to do if there are no informative priors?
# one option is to calculate the fractional BF, which means that one splites the data in half,
# uses the first half to fit the model, and then use the posterior as a new (now informative)
# prior for the ML - let's do this for the previous case
# likelihoods with half the data
likelihood1 <- function(param){
pred = param[1] + param[2]*x + param[3] * x^2
singlelikelihoods = dnorm(y, mean = pred, sd = 1/(param[4]^2), log = TRUE)
return(sum(singlelikelihoods[seq(1, 30, 2)]))
}
likelihood2 <- function(param){
pred = param[1] + param[2]*x
singlelikelihoods = dnorm(y, mean = pred, sd = 1/(param[3]^2), log = TRUE)
return(sum(singlelikelihoods[seq(1, 30, 2)]))
}
setUp1 <- createBayesianSetup(likelihood1,
lower = c(-500,-500,-500,0.01),
upper = c(500,500,500,3000))
setUp2 <- createBayesianSetup(likelihood2,
lower = c(-500,-500,0.01),
upper = c(500,500,3000))
out1 <- runMCMC(bayesianSetup = setUp1)
out2 <- runMCMC(bayesianSetup = setUp2)
newPrior1 = createPriorDensity(out1, start = 200,
lower = c(-500,-500,-500,0.01),
upper = c(500,500,500,3000))
newPrior2 = createPriorDensity(out2, start = 200,
lower = c(-500,-500,0.01),
upper = c(500,500,3000))
# now rerun this with likelihoods for the other half of the data and new prior
likelihood1 <- function(param){
pred = param[1] + param[2]*x + param[3] * x^2
singlelikelihoods = dnorm(y, mean = pred, sd = 1/(param[4]^2), log = TRUE)
return(sum(singlelikelihoods[seq(2, 30, 2)]))
}
likelihood2 <- function(param){
pred = param[1] + param[2]*x
singlelikelihoods = dnorm(y, mean = pred, sd = 1/(param[3]^2), log = TRUE)
return(sum(singlelikelihoods[seq(2, 30, 2)]))
}
setUp1 <- createBayesianSetup(likelihood1, prior = newPrior1)
setUp2 <- createBayesianSetup(likelihood2, prior = newPrior2)
out1 <- runMCMC(bayesianSetup = setUp1)
M1 = marginalLikelihood(out1, start = 1000)
out2 <- runMCMC(bayesianSetup = setUp2)
M2 = marginalLikelihood(out2, start = 1000)
### Calculating the fractional Bayes factor
exp(M1$ln.ML - M2$ln.ML)
## End(Not run)
############################################################
### Performance comparison ###
# Low dimensional case with narrow priors - all methods have low error
# we use a truncated normal for the likelihood to make sure that the density
# integrates to 1 - makes it easier to calcuate the theoretical ML
likelihood <- function(x) sum(msm::dtnorm(x, log = TRUE, lower = -1, upper = 1))
prior = createUniformPrior(lower = rep(-1,2), upper = rep(1,2))
bayesianSetup <- createBayesianSetup(likelihood = likelihood, prior = prior)
out = runMCMC(bayesianSetup = bayesianSetup, settings = list(iterations = 5000))
# plot(out)
# theoretical value
theory = log(1/(2^2))
marginalLikelihood(out)$ln.ML - theory
marginalLikelihood(out, method = "Prior", numSamples = 500)$ln.ML - theory
marginalLikelihood(out, method = "HM", numSamples = 500)$ln.ML - theory
marginalLikelihood(out, method = "Bridge", numSamples = 500)$ln.ML - theory
# higher dimensions - wide prior - HM and Prior don't work
likelihood <- function(x) sum(msm::dtnorm(x, log = TRUE, lower = -10, upper = 10))
prior = createUniformPrior(lower = rep(-10,3), upper = rep(10,3))
bayesianSetup <- createBayesianSetup(likelihood = likelihood, prior = prior)
out = runMCMC(bayesianSetup = bayesianSetup, settings = list(iterations = 5000))
# plot(out)
# theoretical value
theory = log(1/(20^3))
marginalLikelihood(out)$ln.ML - theory
marginalLikelihood(out, method = "Prior", numSamples = 500)$ln.ML - theory
marginalLikelihood(out, method = "HM", numSamples = 500)$ln.ML - theory
marginalLikelihood(out, method = "Bridge", numSamples = 500)$ln.ML - theory
``` |

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.