In giuliomorina/DiceEnterprise: Bernoulli Factory and Dice Enterprise for Rational Functions

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width=7, fig.height=4,
  cache=TRUE
)
require("DiceEnterprise")
set.seed(10, "L'Ecuyer-CMRG")

Introduction

The DiceEnterprise package provides an automatic way to construct a Bernoulli Factory/Dice Enterprise algorithm for rational functions between simplices $f: \Delta^m \rightarrow \Delta^v$. That is, given an $m$-sided die, a $v$ sided-one is produced where the probability of each of the $v$ faces is a rational function. More formally, $f(\boldsymbol{p})$ is of the following form: $$ f(\boldsymbol{p}) = f(p_1,\ldots,p_m) = \frac{1}{C(\boldsymbol{p})}\left(G_1(\boldsymbol{p}),\ldots,G_v(\boldsymbol{p})\right) $$ where each $G_i(\boldsymbol{p})$ and $C(\boldsymbol{p})$ are polynomials with real coefficients.

The Bernoulli Factory is a special case of this setting. In particular, assume a $p$-coin is given. Then, the package can construct an $f(p)$-coin where $f(p)$ is a rational function.

Bernoulli Factory

The user can define a function that tosses a $p$-coin and returns $1$ with probability $p$ and 2 with probability $1-p$ (notice that it must return 1 and 2). The function should have as input the number n of required tosses. For instance the following function tosses a $\frac{3}{4}$-coin

toss_coin <- function(n) {
  sample(1:2, size = n, replace = TRUE, prob = c(3/4,1/4))
}

Then, the user needs to define the rational function $f(p)$ in a specific format. For instance, assume that $$ f(p) = \frac{\sqrt{2}p^3}{(\sqrt{2}-5)p^3+11p^2-9p+3} $$ so that $$ 1-f(p) = \frac{-5p^3+11p^2-9p+3}{(\sqrt{2}-5)p^3+11p^2-9p+3} $$ Then, the user needs to specify the numerators of $f(p)$ and $1-f(p)$ in the following format:

f_1 <- list(coeff = c(sqrt(2)), power = c(3)) #f(p)
f_2 <- list(coeff = c(-5,11,-9,3), power = c(3,2,1,0)) #1-f(p)

Notice that the polynomials are represented as a list: the first element is a vector of coefficients and the second element is a vector describing the powers of $p$. Finally, the user can construct a new Bernoulli Factory running the following code:

bf <- BernoulliFactory$new(f_1 = f_1, f_2 = f_2) #f_1 = f(p), f_2 = 1-f(p)

and toss an $f(p)$-coin by running

fp_tosses <- bf$sample(n = 10, roll.fun = toss_coin) #Produces 10 tosses of the f(p)-coin
print(fp_tosses)

Diagnosis and extra options

Printing the BernoulliFactory object gives information on the dimension of the generated fine and connected ladder. When a BernoulliFactory object is created, if the option verbose is set equal to TRUE, then more information are printed on the different steps of the construction.

bf <- BernoulliFactory$new(f_1 = f_1, f_2 = f_2, verbose = TRUE)
print(bf)

The method evaluate allows to evaluate the value of $f(p)$ for a given $p$.

print(bf$evaluate(3/4))

When a sample is requested, Coupling From the Past is used. Notice that in the Bernoulli Factory case (or more in general when the given die has only 2 faces), a monotonic implementation is used. The method sample allows to set the option verbose = TRUE. In this case the output of sample is a list where the first argument is the obtained sample and the second argument is a vector containing the number of rolls required. The same method allows to specify the number of cores used (default 1). Notice that multicore is not supported on Windows. The method sample takes also double_time as an input. If double_time = FALSE (default), the time step of CFTP is increased by one at each iteration leading, otherwise it is doubled.

fp_tosses <- bf$sample(n = 1000, roll.fun = toss_coin, num_cores = 2, verbose = TRUE, double_time= TRUE) #Produces 1000 tosses of the f(p)-coin, using 2 cores and doubling the time step at each iteration of CFTP.
print(table(fp_tosses[[1]])/1000) #Empirical probabilities. Notice that the theoretical ones are given by print(bf$evaluate(3/4))
print(paste0("Average number of tosses required: ",mean(fp_tosses[[2]])))

The function plot.confidence.interval allows to print the estimated probabilities with 95% confidence interval. The stars indicate the true theoretical values.

plot.confidence.interval(fp_tosses[[1]],print(bf$evaluate(3/4)))

Notice that the package supports only functions defined from $(0,1)$ to $[0,1]$ and not from subset $S \subset (0,1)$. For instance, if the user defines a Bernoulli Factory for the function $f(p) = 2p$, an error is returned:

bf_amp <- BernoulliFactory$new(f_1 = list(2,1), f_2 = list(c(1,-2),c(0,1)))

When the function reaches close 1 or 0 within the interval $p \in (0,1)$, the construction of the ladder is computationally difficult and a runtime error may occur. Take for instance $f(p) = (x-1/2)^2 + 0.999$. The user can increase the timeout threshold by specifying the parameter threshold when constructing the BernoulliFactory object (default 100 iterations).

bf_error <- BernoulliFactory$new(f_1 = list(c(-1,1,0.749),c(2,1,0)), 
                                 f_2 = list(c(1,-1,0.251),c(2,1,0)))
bf_no_error <- BernoulliFactory$new(f_1 = list(c(-1,1,0.749),c(2,1,0)), 
                                 f_2 = list(c(1,-1,0.251),c(2,1,0)),
                                 threshold = 500)

Dice Enterprise

The procedure is similar to the previous. Assume that the given function $f(\boldsymbol{p}): \Delta^m \rightarrow \Delta^v$ is $$ f(p_1,p_2,p_3) \propto (p_1p_2^2+\sqrt{2}p_2^2p_3^2, 4p_1^4p_2^7+1/2+3p_1p_2^2p_3^3,7p_1p_2^3p_3^4+2p_3^2) $$ The user needs to define a list that specify the polymomial $f(\boldsymbol{p})$. Each element corresponds to $f_i(\boldsymbol{p})$ and each of this function is determined by a list itself containing two elements:

A vector of the coefficients (as in the Bernoulli Factory case)
A matrix or a vector of string defining the power of the variables. If a matrix is given, it has to have $m$ columns and each row defines the power of $p_1,p_2,\ldots,p_m$ respectively. For instance $p_1p_2^2+\sqrt{2}p_2^2p_3^2$ is translated into the matrix $$ \begin{bmatrix} 1 & 2 & 0 \ 0 & 2 & 2 \end{bmatrix} $$ Alternatively, the same powers can be defined via a vector of strings. The powers of the previous polynomial $p_1p_2^2+\sqrt{2}p_2^2p_3^2$ can also be defined as c("120","022"). Notice that this method works only if there are no powers greater than 10.

The polynomial $f(\boldsymbol{p})$ is then defined as

f_dice <- list(
  list(c(1,sqrt(2)),c("120","022")),
  list(c(4,1/2,3),c("470","000","123")),
  list(c(7,2),matrix(c(1,3,4,0,0,2),byrow=TRUE,ncol=3))
)

To construct a Dice Enterprise with the defined polynomial the following command is run:

de <- DiceEnterprise$new(f_dice)

and an $f(\boldsymbol{p})$-die is rolled by giving a function that rolls the original $m$-sided die and using the sample method.

roll_die <- function(n) {
  sample(1:3, size = n, replace = TRUE, prob = c(1/5,1/4,1-1/5-1/4))
} #The original die has probability 1/5, 1/4, 11/20
sample_die <- de$sample(n = 10, roll.fun = roll_die)
print(sample_die)

Diagnosis and extra options

Analogously as described in the Bernoulli Factory case, the method sample accepts the same inputs as in the Bernoulli Factory case. The function plot.confidence.interval returns a plot of the estimates together with a 95% confidence interval.

sample_die <- de$sample(n = 1000, roll.fun = roll_die, num_cores = 2, verbose = TRUE, double_time = TRUE) #Produces 1000 rolls of the f(p)-die, using 2 cores
print(table(sample_die[[1]])/1000) #Empirical probabilities. Notice that the theoretical ones are given by print(de$evaluate(c(1/5,1/4,1-1/5-1/4)))
print(de$evaluate(c(1/5,1/4,1-1/5-1/4)))
print(paste0("Average number of rolls required: ", mean(sample_die[[2]])))
plot.confidence.interval(sample_die[[1]],de$evaluate(c(1/5,1/4,1-1/5-1/4)))

Example from the paper

This reproduces example 2 from the paper and simulates from the multivariate ladder $$ \pi(p_1,p_2,p_3) \propto \left(\sqrt{2}p_1^3,p_1^2p_3,\frac{1}{4}p_1p_2^2,2p_1p_2p_3,\frac{1}{2}p_1p_3^2,\frac{3}{4}p_2^2p_3\right) $$

de_ex2 <- DiceEnterprise$new(G=list(
  list(sqrt(2),"300"),
  list(1,"201"),
  list(1/4,"120"),
  list(2,"111"),
  list(1/2,"102"),
  list(3/4,"021")
))
#Define the original die
true_prob_original_die <- c(1/5,1/4,11/20)
roll_die <- function(n) {
  sample(1:3, size = n, replace = TRUE, prob = true_prob_original_die)
} 
#Get a sample of size 1000 from the multivariate ladder
#and plot the estimates with confidence intervals
set.seed(17)
sample_ex2 <- de_ex2$sample(n=1000, roll.fun = roll_die, verbose = TRUE)
print(paste0("Average number of rolls required: ", mean(sample_ex2[[2]])))
plot.confidence.interval(sample_ex2[[1]],de_ex2$evaluate(true_prob_original_die))

Independent coins

Assume that instead of an $m$-sided die, $m$ independent coins are given, each one with its own probability of landing heads and denoted by $p_i$. In general, $(p_1,\ldots,p_m) \not\in \Delta^m$. Consider as an example, that we have access to 3 independent coins of biases $(0.4,0.7,0.55)$ and wish to sample from the distribution $f(\boldsymbol{p}) = \frac{1}{p_1+p_2+p_3}\left(p_1,p_2,p_3\right) = (8/33,14/33,1/3)$. There are several ways to define a die whose probabilites are a function of $(p_1,\ldots,p_m)$. The package provides native support for three different definitions:

Scenario 1: tossing all the coins

Considering constructing an $(m+2)$-sided die with the following probabilities of rolling ${1,2\ldots,m+2}$. $$ \begin{split} &q_1 = p_1\cdot p_2\cdot\ldots \cdot p_m \ &q_2 = (1-p_1)\cdot p_2 \cdot \ldots \cdot p_m \ &q_3 = p_1 \cdot (1-p_2) \cdot p_3 \cdot \ldots \cdot p_m \ &\ldots \ &q_{m+1} = p_1 \cdot p_2 \cdot \ldots \cdot p_{m-1} \cdot (1-p_m) \ &q_{m+2} = 1-\sum_{i=0}^m q_i \end{split} $$ It is straightforward to roll such die by tossing all the $m$ independent coins at the same time. In particular, if all the tosses result in heads, then 1 is returned. If all the tosses are heads, except for the $l$th coin, then $l+1$ is returned. In all other cases, $m+2$ is returned. Finally, assume that $f(\boldsymbol{p}): (0,1)^m \rightarrow \Delta^v$ is a function of the original probabilities. It is possible to construct a function $\tilde{f}(\boldsymbol{q}): \Delta^{m+2} \rightarrow \Delta^v$ by substituing: $$ p_i = \frac{q_1}{q_1+q_{i+1}}, \qquad i \in {1,2,\ldots,m} $$

The user needs to first define a function that tosses all the three coins. Notice again that heads corresponds to 1 and tails to 2 (not to 0).

toss.all.coins <- function(probs) { 
  return(sapply(probs, function(p) {sample(1:2, size = 1, prob = c(p,1-p))})) #1 or 2 (not 0))
}

The function $f(\boldsymbol{p})$ can be expressed as a function of the transformed variables $\boldsymbol{q} = (q_1,\ldots,q_5)$ previously described as: $$ \begin{split} f(\boldsymbol{q}) &= \frac{1}{C(\boldsymbol{q})}\left((q_0+q_2)(q_0+q_3),(q_0+q_1)(q_0+q_3),(q_0+q_1)(q_0+q_2)\right) \ &= \frac{1}{C(\boldsymbol{q})}\left(q_0^2+q_0q_2+q_0q_3+q_2q_3,q_0^2+q_0q_1+q_0q_3+q_1q_3,q_0^2+q_0q_1+q_0q_2+q_1q_2\right) \end{split} $$ where $C(\boldsymbol{q}) = 3(q_0+q_1)(q_0+q_2)(q_0+q_3)\sum_{i=1}^3 \frac{1}{q_0+q_i}$. As before, the user needs to define the three polynomials as a list of coefficients and powers:

f_indep_coins1 <- list(
  list(rep(1,4),c("20000","10100","10010","00110")),
  list(rep(1,4),c("20000","11000","10010","01010")),
  list(rep(1,4),c("20000","11000","10100","01100"))
)

The package provides a class CoinsEnterprise to deal with this class of problems and automatically define the corresponding die. In particular, when an object of such class is initialized the argument die_type specifies which type of construction is considered. For scenario 1 this corresponds to die_type = "toss_all".

ce1 <- CoinsEnterprise$new(f_indep_coins1, toss.coins = toss.all.coins, num_coins = 3, die_type = "toss_all")

indep_coins_probs <- c(0.4,0.7,0.55)
result <- ce1$sample(n = 1000, num_cores = 2, verbose = TRUE,  double_time = FALSE, probs = indep_coins_probs) #the argument probs is passed to toss.coins
print(table(result[[1]])/1000) #Empirical probabilities. Notice that the theoretical ones are given by
print(indep_coins_probs/sum(indep_coins_probs))
print(paste0("Average number of rolls required: ", mean(result[[2]])))
print(paste0("Average number of tosses required: ", length(indep_coins_probs)*mean(result[[2]])))
plot.confidence.interval(result[[1]],indep_coins_probs/sum(indep_coins_probs))

Scenario 2: first heads

Consider constructing an $(m+1)$-sided die with the following probabilities of rolling ${1,2,\ldots,m+1}$. $$ \begin{split} q_1 &= p_1 \ q_2 &= (1-p_1)\cdot p_2 \ q_3 &= (1-p_1)\cdot (1-p_2)\cdot p_3 \ &\ldots \ q_m &= (1-p_1)\cdot (1-p_2)\cdot \ldots\cdot (1-p_{m-1})\cdot p_m \ q_{m+1} &= (1-p_1)\cdot (1-p_2)\cdot \ldots\cdot (1-p_{m-1})\cdot (1-p_m) \end{split} $$ It is straightforward to roll such die by flipping in order the $m$ independent coins until the first heads is returned at the $l$th toss. In this case, $l$ is returned as a result of rolling the die. If all the tosses returns tails, then $l+1$ is returned. If $f(\boldsymbol{p}): (0,1)^m \rightarrow \Delta^v$ is a function of the original probabilities, it is possible to construct a function $\tilde{f}(\boldsymbol{q}): \Delta^{m+2} \rightarrow \Delta^v$ by substituing: $$ p_i = \frac{q_i}{1-\sum_{k=1}^{i-1}q_k}, \qquad i \in {1,2,\ldots,m} $$

The user needs to first define a function that tosses all the three coins until a heads is obtained. Notice again that heads corresponds to 1 and tails to 2 (not to 0).

toss.until.heads <- function(probs) {
  m <- length(probs)
  res <- rep(NA, m)
  for(i in 1:m) {
    res[i] <- sample(1:2, size = 1, prob = c(probs[i],1-probs[i]))
    if(res[i] == 1) { break } #Stop loop if heads is obtained
  }
  return(res)
}

The function $f(\boldsymbol{p})$ can be expressed as a function of the transformed variables $\boldsymbol{q} = (q_1,\ldots,q_4)$ previously described as: $$ \begin{split} f(\boldsymbol{q}) &= \frac{1}{C(\boldsymbol{q})}\left(q_1(1-q_1)(1-q_1-q_2), q_2(1-q_1-q_2), q_3(1-q_1) \right) \ &= \frac{1}{C(\boldsymbol{q})}\left(q_1 - 2q_1^2+q_1^3-q_1q_2+q_1^2q_2, q_2-q_1q_2-q_2^2,q_3-q_1q_3 \right) \end{split} $$ where $C(\boldsymbol{q}) = q_1(1-q_1)(1-q_1-q_2)+q_2(1-q_1-q_2)+q_3(1-q_1)$. As before, the user needs to define the three polynomials as a list of coefficients and powers:

f_indep_coins2 <- list(
  list(c(1,-2,1,-1,1),c("1000","2000","3000","1100","2100")),
  list(c(1,-1,-1),c("0100","1100","0200")),
  list(c(1,-1),c("0010","1010"))
)

In this case, an object of the class CoinsEnterprise must be initialized with argument die_type = "first_heads".

ce2 <- CoinsEnterprise$new(f_indep_coins2, toss.coins = toss.until.heads, num_coins = 3, die_type = "first_heads")

Getting a sample from the newly create die is done analogously as before. Notice that the number of rolls does not correspond to the number of tosses of the coins. In particular, the number of tosses is obtained by multiplying the number of rolls by the expected number of tosses of the function toss.until.heads. In this case, it is given by $p_1+2(1-p_1)p_2+3(1-p_1)(1-p_2) = 1.78$.

result <- ce2$sample(n = 1000, num_cores = 2, verbose = TRUE, double_time = FALSE, probs = indep_coins_probs) #the argument probs is passed to toss.coins
print(table(result[[1]])/1000) #Empirical probabilities. Notice that the theoretical ones are given by
print(indep_coins_probs/sum(indep_coins_probs))
print(paste0("Average number of rolls required: ", mean(result[[2]])))
print(paste0("Average number of tosses required: ", (indep_coins_probs[1]+
                                                       2*(1-indep_coins_probs[1])*indep_coins_probs[2]+
                                                       3*(1-indep_coins_probs[1])*(1-indep_coins_probs[2]))*
                                                       mean(result[[2]])))
plot.confidence.interval(result[[1]],indep_coins_probs/sum(indep_coins_probs))

Scenario 3: uniformly

Consider constructing an $(m+1)$-sided die with the following probabilities of rolling ${1,\ldots,m}$. \begin{align} q_1 &= \frac{1}{m}p_1 \ q_2 &= \frac{1}{m}p_2 \ &\ldots \ q_m &= \frac{1}{m}p_m \ q_{m+1} &= 1-\frac{1}{m}\sum_{i=1}^m p_i \ \end{align} To roll such die it is just sufficient to select uniformly at random a coin $i$ to toss. If it lands heads, then $i$ is returned, otherwise $m+1$ is returned. Given a function $f(\boldsymbol{p}):(0,1)^m \rightarrow \Delta^v$ of the independent coins, it is straightforward to construct a function $f(\boldsymbol{q}): \Delta^{m+1} \rightarrow \Delta^v$ of the new constructe die by substituting: $$ p_i = mq_i, \qquad i \in {1,2,\ldots,m} $$ The user needs to define a function that takes as input which coin has to be tossed and return $1$ if the coin lands heads.

toss.coins.single <- function(which_coin, probs) {
  return(sample(c(1,2), size = 1, prob = c(probs[which_coin], 1-probs[which_coin])))
}

The function $f(\boldsymbol{p})$ previously described can be expresses as a function of the transformed variables simply as $$ f(\boldsymbol{q}) = \frac{1}{q_1+q_2+q_3}(q_1,q_2,q_3) $$ As before, the user needs to define three polynomials as a list of coefficients and powers. Notice that there are $4$ faces of the new die, and not three.

f_indep_coins3 <- list(
  list(1, "1000"),
  list(1, "0100"),
  list(1, "0010")
)

In this case, an object of the class CoinsEnterprise must be initialized with argument die_type = "uniform".

ce3 <- CoinsEnterprise$new(f_indep_coins3, toss.coins = toss.coins.single, num_coins = 3, die_type = "uniform")

Getting a sample from the newly create die is analogous as before. In this case the number of rolls is equal to number of tosses of the coins.

result <- ce3$sample(n = 1000, num_cores = 2, verbose = TRUE, double_time = FALSE, probs = indep_coins_probs) #the argument probs is passed to toss.coins.single
print(table(result[[1]])/1000) #Empirical probabilities. Notice that the theoretical ones are given by
print(indep_coins_probs/sum(indep_coins_probs))
print(paste0("Average number of tosses required: ", mean(result[[2]])))
plot.confidence.interval(result[[1]],indep_coins_probs/sum(indep_coins_probs))

roll.uniform <- function(n,probs_coins,probs_unif = rep(1,length(probs_coins))) {
  m <- length(probs_coins)
  res <- numeric(n)
  for(k in 1:n) {
  i <- sample(1:m, size = 1, prob = probs_unif)
  res[k] <- (sample(c(i,m+1), size = 1, prob = c(probs_coins[i],1-probs_coins[i])))
  }
  return(res)
}

indep_coins_probs <- seq(0.01,0.09,by=0.005)
f_unif <- vector("list", length = length(indep_coins_probs))
for(i in 1:length(indep_coins_probs)) {
  zeros <- rep(0, length(indep_coins_probs)+1)
  zeros[i] <- 1
  f_unif[[i]] <- list(c(1), paste0(zeros,collapse=""))
}

de_uniform <- DiceEnterprise$new(f_unif)

size_sample <- 10000
sample_coins_uniform <- de_uniform$sample(n = size_sample, roll.fun = roll.uniform, probs_coins = indep_coins_probs, verbose = TRUE)
table(sample_coins_uniform[[1]])/size_sample
mean(sample_coins_uniform[[2]])
length(indep_coins_probs)/sum(indep_coins_probs)

#(0.2501 - 1 x + 1 x^2)/((0.3 - 1 x + 1 x^2)+(0.2501 - 1 x + 1 x^2))
# bf_bug <- BernoulliFactory$new(
#   f_1 = list(c(0.2501,-1,1), c(0,1,2)),
#   f_2 = list(c(0.3,-1,1), c(0,1,2)),
# )
#(2-5x+5x^2)/(10x+10)
bf_bug <- DiceEnterprise$new(G = list(
  list(c(2,-5,5),c("00","10","20")),
  list(c(10,10),c("10","00"))
))