In-Depth 1: Comparison of Point-Estimates"

library(knitr)

if (!requireNamespace("see", quietly = TRUE) ||
  !requireNamespace("dplyr", quietly = TRUE) ||
  !requireNamespace("parameters", quietly = TRUE) ||
  !requireNamespace("stringr", quietly = TRUE) ||
  !requireNamespace("ggplot2", quietly = TRUE) ||
  !requireNamespace("tidyr", quietly = TRUE)) {
  knitr::opts_chunk$set(eval = FALSE)
}

options(knitr.kable.NA = "")
knitr::opts_chunk$set(
  echo = TRUE,
  comment = ">",
  out.width = "100%",
  message = FALSE,
  warning = FALSE,
  dpi = 150
)
options(digits = 2)

This vignette can be referred to by citing the package:


Effect Point-Estimates in the Bayesian Framework

Introduction

One of the main difference between the Bayesian and the frequentist frameworks is that the former returns a probability distribution for each effect (i.e., a model parameter of interest, such as a regression slope) instead of a single value. However, there is still a need and demand - for reporting or use in further analysis - for a single value (point-estimate) that best characterises the underlying posterior distribution.

There are three main indices used in the literature for effect estimation: - the mean - the median - the MAP (Maximum A Posteriori) estimate (roughly corresponding to the mode - the "peak" - of the distribution)

Unfortunately, there is no consensus about which one to use, as no systematic comparison has ever been done.

In the present work, we will compare these three point-estimates of effect with each other, as well as with the widely known beta, extracted from a comparable frequentist model. These comparisons can help us draw bridges and relationships between these two influential statistical frameworks.

Experiment 1: Relationship with Error (Noise) and Sample Size

Methods

We will be carrying out simulation aimed at modulating the following characteristics:

We generated a dataset for each combination of these characteristics, resulting in a total of 2 * 2 * 9 * 1000 = 36000 Bayesian and frequentist models. The code used for generation is available here (please note that it takes usually several days/weeks to complete).

library(ggplot2)
library(dplyr)
library(tidyr)
library(stringr)
library(see)
library(parameters)

df <- read.csv("https://raw.github.com/easystats/circus/master/data/bayesSim_study1.csv")

Results

Sensitivity to Noise

df %>%
  select(error, true_effect, outcome_type, Coefficient, Median, Mean, MAP) %>%
  gather(estimate, value, -error, -true_effect, -outcome_type) %>%
  mutate(temp = as.factor(cut(error, 10, labels = FALSE))) %>%
  group_by(temp) %>%
  mutate(error_group = round(mean(error), 1)) %>%
  ungroup() %>%
  filter(value < 6) %>%
  ggplot(aes(x = error_group, y = value, fill = estimate, group = interaction(estimate, error_group))) +
  # geom_hline(yintercept = 0) +
  # geom_point(alpha=0.05, size=2, stroke = 0, shape=16) +
  # geom_smooth(method="loess") +
  geom_boxplot(outlier.shape = NA) +
  theme_modern() +
  scale_fill_manual(
    values = c("Coefficient" = "#607D8B", "MAP" = "#795548", "Mean" = "#FF9800", "Median" = "#FFEB3B"),
    name = "Index"
  ) +
  ylab("Point-estimate") +
  xlab("Noise") +
  facet_wrap(~ outcome_type * true_effect, scales = "free")

Sensitivity to Sample Size

df %>%
  select(sample_size, true_effect, outcome_type, Coefficient, Median, Mean, MAP) %>%
  gather(estimate, value, -sample_size, -true_effect, -outcome_type) %>%
  mutate(temp = as.factor(cut(sample_size, 10, labels = FALSE))) %>%
  group_by(temp) %>%
  mutate(size_group = round(mean(sample_size))) %>%
  ungroup() %>%
  filter(value < 6) %>%
  ggplot(aes(x = size_group, y = value, fill = estimate, group = interaction(estimate, size_group))) +
  # geom_hline(yintercept = 0) +
  # geom_point(alpha=0.05, size=2, stroke = 0, shape=16) +
  # geom_smooth(method="loess") +
  geom_boxplot(outlier.shape = NA) +
  theme_modern() +
  scale_fill_manual(
    values = c("Coefficient" = "#607D8B", "MAP" = "#795548", "Mean" = "#FF9800", "Median" = "#FFEB3B"),
    name = "Index"
  ) +
  ylab("Point-estimate") +
  xlab("Sample size") +
  facet_wrap(~ outcome_type * true_effect, scales = "free")

Statistical Modelling

We fitted a (frequentist) multiple linear regression to statistically test the the predict the presence or absence of effect with the estimates as well as their interaction with noise and sample size.

df %>%
  select(sample_size, error, true_effect, outcome_type, Coefficient, Median, Mean, MAP) %>%
  pivot_longer(
    c(-sample_size, -error, -true_effect, -outcome_type),
    names_to = "estimate"
  ) %>%
  glm(true_effect ~ outcome_type / estimate / value, data = ., family = "binomial") %>%
  parameters(df_method = "wald") %>%
  select(Parameter, Coefficient, p) %>%
  filter(
    str_detect(Parameter, "outcome_type"),
    str_detect(Parameter, ":value")
  ) %>%
  arrange(desc(Coefficient)) %>%
  knitr::kable(digits = 2)

This suggests that, in order to delineate between the presence and the absence of an effect, compared to the frequentist's beta coefficient:

Overall, the median appears to be a safe choice, maintaining a high performance across different types of models.

Experiment 2: Relationship with Sampling Characteristics

Methods

We will be carrying out another simulation aimed at modulating the following characteristics:

We generated 3 datasets for each combination of these characteristics, resulting in a total of 2 * 2 * 8 * 40 * 9 * 3 = 34560 Bayesian and frequentist models. The code used for generation is avaible here (please note that it takes usually several days/weeks to complete).

df <- read.csv("https://raw.github.com/easystats/circus/master/data/bayesSim_study2.csv")

Results

Sensitivity to number of iterations

df %>%
  select(iterations, true_effect, outcome_type, beta, Median, Mean, MAP) %>%
  gather(estimate, value, -iterations, -true_effect, -outcome_type) %>%
  mutate(temp = as.factor(cut(iterations, 5, labels = FALSE))) %>%
  group_by(temp) %>%
  mutate(iterations_group = round(mean(iterations), 1)) %>%
  ungroup() %>%
  filter(value < 6) %>%
  ggplot(aes(x = iterations_group, y = value, fill = estimate, group = interaction(estimate, iterations_group))) +
  geom_boxplot(outlier.shape = NA) +
  theme_classic() +
  scale_fill_manual(
    values = c("beta" = "#607D8B", "MAP" = "#795548", "Mean" = "#FF9800", "Median" = "#FFEB3B"),
    name = "Index"
  ) +
  ylab("Point-estimate of the true value 0\n") +
  xlab("\nNumber of Iterations") +
  facet_wrap(~ outcome_type * true_effect, scales = "free")

Sensitivity to warmup ratio

df %>%
  mutate(warmup = warmup / iterations) %>%
  select(warmup, true_effect, outcome_type, beta, Median, Mean, MAP) %>%
  gather(estimate, value, -warmup, -true_effect, -outcome_type) %>%
  mutate(temp = as.factor(cut(warmup, 3, labels = FALSE))) %>%
  group_by(temp) %>%
  mutate(warmup_group = round(mean(warmup), 1)) %>%
  ungroup() %>%
  filter(value < 6) %>%
  ggplot(aes(x = warmup_group, y = value, fill = estimate, group = interaction(estimate, warmup_group))) +
  geom_boxplot(outlier.shape = NA) +
  theme_classic() +
  scale_fill_manual(
    values = c("beta" = "#607D8B", "MAP" = "#795548", "Mean" = "#FF9800", "Median" = "#FFEB3B"),
    name = "Index"
  ) +
  ylab("Point-estimate of the true value 0\n") +
  xlab("\nNumber of Iterations") +
  facet_wrap(~ outcome_type * true_effect, scales = "free")

Discussion

Conclusions can be found in the guidelines section article.

Suggestions

If you have any advice, opinion or such, we encourage you to let us know by opening an discussion thread or making a pull request.



Try the bayestestR package in your browser

Any scripts or data that you put into this service are public.

bayestestR documentation built on July 26, 2021, 5:08 p.m.