```r
knitr::opts_chunk$set(
  collapse   = TRUE,
  comment    = "#>",
  out.width  = "80%",
  warning    = FALSE,
  fig.width  = 7,
  fig.height = 5
)
```

One of the most significant benefits of the `fmcmc` package is that it allows
users to create personalized kernel functions. `fmcmc_kernel` objects are built
with the `kernel_new` function (a function factory) and require at least one
parameter, the `proposal` function. In what follows, we show an example in which
the user specifies a transition kernel to estimate an integer parameter,
the `n` parameter of a binomial distribution.

Imagine that we are interested in learning the size of a population given an observed proportion. In this case, we already know the prevalence of a disease: assume that 20% of the individuals acquire it, and that we have a random sample from the population, $y \sim \mbox{Binomial}(0.2, N)$. We don't know $N$.

Such a scenario, while perhaps a bit uncommon, needs special treatment in a Bayesian/MCMC framework. The parameter to estimate is not continuous, so we would like to draw samples from a discrete distribution. Using the "normal" (pun intended) transition kernel may still estimate something, but it does not provide the correct posterior distribution. In this case, a transition kernel that makes discrete proposals is desirable.
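To make the point concrete, here is a base-R sketch (not part of the original vignette) contrasting the two kinds of proposals: a Gaussian random walk produces non-integer candidates for $N$, while adding a discrete-uniform draw keeps the chain on the integers.

```r
set.seed(1)
theta0 <- 480 # current state: a candidate integer value of N

# A Gaussian random-walk proposal yields a non-integer candidate...
theta0 + rnorm(1, sd = 3)

# ...while adding a discrete-uniform draw keeps the proposal on the integers:
theta0 + sample(-3:3, 1)
```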

Let's simulate some data, say, 300 observations from this Binomial random variable with parameters $p = .2$ and $N = 500$:

```r
library(fmcmc)
set.seed(1) # Always set the seed!!!!

# Population parameters
p <- .2
N <- 500

y <- rbinom(300, size = N, prob = p)
```

Our goal is to be able to estimate the parameter $N$. As in any MCMC function, we need to define the log-likelihood function:

```r
ll <- function(n., p.) {
  sum(dbinom(y, n., prob = p., log = TRUE))
}
```
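Before running any MCMC, we can sanity-check this function (a quick sketch, not part of the original vignette; the data are re-simulated so the chunk stands alone): evaluated over a grid of candidate population sizes, the log-likelihood should peak near the true $N = 500$.

```r
# Re-create the simulated data and log-likelihood from above
set.seed(1)
p <- .2
y <- rbinom(300, size = 500, prob = p)
ll <- function(n., p.) sum(dbinom(y, n., prob = p., log = TRUE))

# Evaluate the log-likelihood over a grid of candidate sizes
# (n must be at least max(y) for the likelihood to be positive)
n_grid  <- max(y):600
ll_vals <- sapply(n_grid, ll, p. = p)

n_grid[which.max(ll_vals)] # close to the true value, 500
```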

Now comes the kernel object. In order to create an `fmcmc_kernel`, we can use
the helper function `kernel_new` as follows:

```r
kernel_unif_int <- kernel_new(
  proposal = function(env) env$theta0 + sample(-3:3, 1),
  logratio = function(env) env$f1 - env$f0 # We could have skipped this
)
```

Here, the kernel takes the form
$\theta_1 = \theta_0 + R$, with $R \sim \mbox{U}\{-3, \dots, 3\}$; that is,
proposals are made by adding a number $R$ drawn from a discrete uniform
distribution on the integers between -3 and 3. In this example we could have
skipped the `logratio` function (this transition kernel is symmetric), but we
define it so that the user can see an example of it.[^kernel_new] Let's take a
look at the object:

[^kernel_new]: For more details on what the `env` object contains, see the manual page of `kernel_new`.

```r
kernel_unif_int
```

The object itself is an R environment. If we had added more parameters to
`kernel_new`, we would see those as well. Now that we have our transition
kernel, let's give it a first try with the `MCMC` function.

```r
ans <- MCMC(
  ll,                        # The log-likelihood function
  initial = max(y),          # A fair initial guess
  kernel  = kernel_unif_int, # Our new kernel function
  nsteps  = 1000,            # 1,000 MCMC draws
  thin    = 10,              # We will sample every 10
  p.      = p                # Passing extra parameters to be used by `ll`
)
```

Notice that for the initial guess we are using the max of `y`, which is a
reasonable starting point (the $N$ parameter MUST be at least the max of `y`).
Since the returned object is of class `mcmc` from the `coda` R package, we can
use any method available for that class. Let's start by plotting the chain:

```r
plot(ans)
```

As you can see, the trace of the parameter went up right away and then stayed
around 500, the actual population parameter $N$. Since the first part of the
chain is useless (we are essentially moving away from the starting point), it
is wise (if not necessary) to restart the MCMC chain from the last point of
`ans`. We can easily do so by passing `ans` as the starting point, since `MCMC`
will automatically take the last value of the chain as the starting point of
the new one. This time, let's increase the sample size as well:

```r
ans <- MCMC(
  ll,
  initial = ans,             # MCMC uses the last value of `ans` automatically
  kernel  = kernel_unif_int, # same as before
  nsteps  = 10000,           # More steps this time
  thin    = 10,              # same as before
  p.      = p                # same as before
)
```

Let's take a look at the posterior distribution:

```r
plot(ans)
summary(ans)
table(ans)
```

Very lovely mixing (at least visually), and a posterior distribution from which we can safely sample parameters.
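Because `ans` is just an `mcmc` object, we can also compute point and interval summaries directly. The snippet below is a sketch, not part of the original vignette, and assumes the `ans` chain from the run above is in the workspace:

```r
# Flatten the mcmc object into a plain numeric vector of sampled N values
draws <- as.vector(ans)

quantile(draws, c(0.025, 0.975))           # 95% credible interval for N
mean(draws)                                # posterior mean
as.integer(names(which.max(table(draws)))) # posterior mode (N is discrete)
```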
