# In-Depth 1: Comparison of Point-Estimates" In bayestestR: Understand and Describe Bayesian Models and Posterior Distributions

```library(knitr)

if (!requireNamespace("see", quietly = TRUE) ||
!requireNamespace("dplyr", quietly = TRUE) ||
!requireNamespace("parameters", quietly = TRUE) ||
!requireNamespace("stringr", quietly = TRUE) ||
!requireNamespace("ggplot2", quietly = TRUE) ||
!requireNamespace("tidyr", quietly = TRUE)) {
knitr::opts_chunk\$set(eval = FALSE)
}

options(knitr.kable.NA = "")
knitr::opts_chunk\$set(
echo = TRUE,
comment = ">",
out.width = "100%",
message = FALSE,
warning = FALSE,
dpi = 150
)
options(digits = 2)
```

This vignette can be referred to by citing the package:

• Makowski, D., Ben-Shachar, M. S., \& Lüdecke, D. (2019). bayestestR: Describing Effects and their Uncertainty, Existence and Significance within the Bayesian Framework. Journal of Open Source Software, 4(40), 1541. https://doi.org/10.21105/joss.01541

# Effect Point-Estimates in the Bayesian Framework

## Introduction

One of the main difference between the Bayesian and the frequentist frameworks is that the former returns a probability distribution for each effect (i.e., a model parameter of interest, such as a regression slope) instead of a single value. However, there is still a need and demand - for reporting or use in further analysis - for a single value (point-estimate) that best characterises the underlying posterior distribution.

There are three main indices used in the literature for effect estimation: - the mean - the median - the MAP (Maximum A Posteriori) estimate (roughly corresponding to the mode - the "peak" - of the distribution)

Unfortunately, there is no consensus about which one to use, as no systematic comparison has ever been done.

In the present work, we will compare these three point-estimates of effect with each other, as well as with the widely known beta, extracted from a comparable frequentist model. These comparisons can help us draw bridges and relationships between these two influential statistical frameworks.

## Experiment 1: Relationship with Error (Noise) and Sample Size

### Methods

We will be carrying out simulation aimed at modulating the following characteristics:

• Model type: linear or logistic.
• "True" effect (known parameters values from which data is drawn): Can be 1 or 0 (no effect).
• Sample size: From 20 to 100 by steps of 10.
• Error: Gaussian noise applied to the predictor with SD uniformly spread between 0.33 and 6.66 (with 1000 different values).

We generated a dataset for each combination of these characteristics, resulting in a total of `2 * 2 * 9 * 1000 = 36000` Bayesian and frequentist models. The code used for generation is available here (please note that it takes usually several days/weeks to complete).

```library(ggplot2)
library(dplyr)
library(tidyr)
library(stringr)
library(see)
library(parameters)

```

### Results

#### Sensitivity to Noise

```df %>%
select(error, true_effect, outcome_type, Coefficient, Median, Mean, MAP) %>%
gather(estimate, value, -error, -true_effect, -outcome_type) %>%
mutate(temp = as.factor(cut(error, 10, labels = FALSE))) %>%
group_by(temp) %>%
mutate(error_group = round(mean(error), 1)) %>%
ungroup() %>%
filter(value < 6) %>%
ggplot(aes(x = error_group, y = value, fill = estimate, group = interaction(estimate, error_group))) +
# geom_hline(yintercept = 0) +
# geom_point(alpha=0.05, size=2, stroke = 0, shape=16) +
# geom_smooth(method="loess") +
geom_boxplot(outlier.shape = NA) +
theme_modern() +
scale_fill_manual(
values = c("Coefficient" = "#607D8B", "MAP" = "#795548", "Mean" = "#FF9800", "Median" = "#FFEB3B"),
name = "Index"
) +
ylab("Point-estimate") +
xlab("Noise") +
facet_wrap(~ outcome_type * true_effect, scales = "free")
```

#### Sensitivity to Sample Size

```df %>%
select(sample_size, true_effect, outcome_type, Coefficient, Median, Mean, MAP) %>%
gather(estimate, value, -sample_size, -true_effect, -outcome_type) %>%
mutate(temp = as.factor(cut(sample_size, 10, labels = FALSE))) %>%
group_by(temp) %>%
mutate(size_group = round(mean(sample_size))) %>%
ungroup() %>%
filter(value < 6) %>%
ggplot(aes(x = size_group, y = value, fill = estimate, group = interaction(estimate, size_group))) +
# geom_hline(yintercept = 0) +
# geom_point(alpha=0.05, size=2, stroke = 0, shape=16) +
# geom_smooth(method="loess") +
geom_boxplot(outlier.shape = NA) +
theme_modern() +
scale_fill_manual(
values = c("Coefficient" = "#607D8B", "MAP" = "#795548", "Mean" = "#FF9800", "Median" = "#FFEB3B"),
name = "Index"
) +
ylab("Point-estimate") +
xlab("Sample size") +
facet_wrap(~ outcome_type * true_effect, scales = "free")
```

#### Statistical Modelling

We fitted a (frequentist) multiple linear regression to statistically test the the predict the presence or absence of effect with the estimates as well as their interaction with noise and sample size.

```df %>%
select(sample_size, error, true_effect, outcome_type, Coefficient, Median, Mean, MAP) %>%
pivot_longer(
c(-sample_size, -error, -true_effect, -outcome_type),
names_to = "estimate"
) %>%
glm(true_effect ~ outcome_type / estimate / value, data = ., family = "binomial") %>%
parameters(df_method = "wald") %>%
select(Parameter, Coefficient, p) %>%
filter(
str_detect(Parameter, "outcome_type"),
str_detect(Parameter, ":value")
) %>%
arrange(desc(Coefficient)) %>%
knitr::kable(digits = 2)
```

This suggests that, in order to delineate between the presence and the absence of an effect, compared to the frequentist's beta coefficient:

• For linear models, the Mean was the better predictor, closely followed by the Median, the MAP and the frequentist Coefficient.
• For logistic models, the MAP was the better predictor, followed by the Median, the Mean and, behind, the frequentist Coefficient.

Overall, the median appears to be a safe choice, maintaining a high performance across different types of models.

## Experiment 2: Relationship with Sampling Characteristics

### Methods

We will be carrying out another simulation aimed at modulating the following characteristics:

• Model type: linear or logistic.
• "True" effect (original regression coefficient from which data is drawn): Can be 1 or 0 (no effect).
• draws: from 10 to 5000 by step of 5 (1000 iterations).
• warmup: Ratio of warmup iterations. from 1/10 to 9/10 by step of 0.1 (9 iterations).

We generated 3 datasets for each combination of these characteristics, resulting in a total of `2 * 2 * 8 * 40 * 9 * 3 = 34560` Bayesian and frequentist models. The code used for generation is avaible here (please note that it takes usually several days/weeks to complete).

```df <- read.csv("https://raw.github.com/easystats/circus/master/data/bayesSim_study2.csv")
```

### Results

#### Sensitivity to number of iterations

```df %>%
select(iterations, true_effect, outcome_type, beta, Median, Mean, MAP) %>%
gather(estimate, value, -iterations, -true_effect, -outcome_type) %>%
mutate(temp = as.factor(cut(iterations, 5, labels = FALSE))) %>%
group_by(temp) %>%
mutate(iterations_group = round(mean(iterations), 1)) %>%
ungroup() %>%
filter(value < 6) %>%
ggplot(aes(x = iterations_group, y = value, fill = estimate, group = interaction(estimate, iterations_group))) +
geom_boxplot(outlier.shape = NA) +
theme_classic() +
scale_fill_manual(
values = c("beta" = "#607D8B", "MAP" = "#795548", "Mean" = "#FF9800", "Median" = "#FFEB3B"),
name = "Index"
) +
ylab("Point-estimate of the true value 0\n") +
xlab("\nNumber of Iterations") +
facet_wrap(~ outcome_type * true_effect, scales = "free")
```

#### Sensitivity to warmup ratio

```df %>%
mutate(warmup = warmup / iterations) %>%
select(warmup, true_effect, outcome_type, beta, Median, Mean, MAP) %>%
gather(estimate, value, -warmup, -true_effect, -outcome_type) %>%
mutate(temp = as.factor(cut(warmup, 3, labels = FALSE))) %>%
group_by(temp) %>%
mutate(warmup_group = round(mean(warmup), 1)) %>%
ungroup() %>%
filter(value < 6) %>%
ggplot(aes(x = warmup_group, y = value, fill = estimate, group = interaction(estimate, warmup_group))) +
geom_boxplot(outlier.shape = NA) +
theme_classic() +
scale_fill_manual(
values = c("beta" = "#607D8B", "MAP" = "#795548", "Mean" = "#FF9800", "Median" = "#FFEB3B"),
name = "Index"
) +
ylab("Point-estimate of the true value 0\n") +
xlab("\nNumber of Iterations") +
facet_wrap(~ outcome_type * true_effect, scales = "free")
```

## Discussion

Conclusions can be found in the guidelines section article.

# Suggestions

If you have any advice, opinion or such, we encourage you to let us know by opening an discussion thread or making a pull request.

## Try the bayestestR package in your browser

Any scripts or data that you put into this service are public.

bayestestR documentation built on July 26, 2021, 5:08 p.m.