# binfer

This package aims to make doing Bayesian things a little easier by giving a clear set of steps that you can easily use to simulate draws from a posterior distribution defined by a specific likelihood and prior.

## The steps

1. `define` a likelihood

2. Input: A data frame, a formula with the response variable and the name of a function that is your likelihood.

3. Output: A data frame with the name of the function given at input as an attribute

4. `assume` a prior distribution

5. Input: A data frame and a function of a single variable

6. Ouput: A data frame with the name of the function added as an attribute

7. `draw` from the posterior distribution

8. Input: A data frame, a control parameter

9. Output: A data frame of draws from the posterior distribution

10. `diagnose` the chain:

11. Input: a data frame of draws from the posterior

12. Output: The same data frame and a bunch of diagnostic plots

13. `clean` the chain

14. Input: Draws from the posterior

15. Output: Burned-in and subsampled draws from the posterior

16. Construct your confidence interval or p-value

17. Use `dplyr` for this

## Examples

Estimate the standard deviation of a normal distribution with a gamma prior:

``````library(binfer)
library(tidyverse)
#> Warning: package 'tibble' was built under R version 3.4.3
#> Conflicts with tidy packages ----------------------------------------------
#> filter(): dplyr, stats
#> lag():    dplyr, stats

my_lik <- function(data, theta) {if (theta > 0) dnorm(data, mean = mean(iris\$Sepal.Width) , sd = theta) else 0}
my_prior <- function(theta) {dgamma(theta, shape = 10, rate = 20)}

posterior <- define(iris, Sepal.Width ~ my_lik) %>%
assume(prior = ~ my_prior) %>%
draw(initial = .43, nbatch = 1e5, blen = 1, scale = .05) %>%
diagnose() %>%
clean(burnin = 0, subsample = 20) %>%
diagnose()
#> Acceptance rate: 0.497534975349753
#> Acceptance rate: 1
``````

``````
posterior %>% summarise(mean = mean(chain),
sd = sd(chain),
lower = quantile(chain, .025),
upper = quantile(chain, .975))
#>        mean         sd     lower    upper
#> 1 0.4382871 0.02558842 0.3925185 0.493049
``````

Estimate the probability of success of a binomnial distribution with a beta prior and compare the analytical solution to the approximate one:

``````binom_test_data <- rbinom(50, prob = .1, size = 1) %>%
tibble(response = .)

binom_lik <- function(data, theta) {if(theta > 0 & theta < 1) dbinom(data, prob = theta, size = 1) else 0}
beta_prior <- function(theta) {dbeta(theta, 1, 2)}

posterior2 <- binom_test_data %>%
define(response ~ binom_lik) %>%
assume(~ beta_prior) %>%
draw(initial = .5, nbatch = 1e5, blen = 1, scale = .15) %>%
diagnose() %>%
clean(burnin = 1000, subsample = 30) %>%
diagnose()
#> Acceptance rate: 0.376063760637606
#> Acceptance rate: 1
``````

``````posterior2 %>%
summarize(mean = mean(chain),
sd = sd(chain),
lower = quantile(chain, .025),
upper = quantile(chain, .975))
#> # A tibble: 1 x 4
#>    mean     sd  lower upper
#>   <dbl>  <dbl>  <dbl> <dbl>
#> 1 0.171 0.0508 0.0853 0.282

ggplot(posterior2, aes(chain)) +
geom_density() +
stat_function(fun = dbeta,
args = list(1 + sum(binom_test_data\$response),
2 + nrow(binom_test_data) - sum(binom_test_data\$response)),
color = "red")
``````

nicksolomon/binfer documentation built on May 21, 2019, 9:21 a.m.