p_direction: Probability of Direction (pd)

View source: R/p_direction.R

Description

Compute the Probability of Direction (pd, also known as the Maximum Probability of Effect, MPE). It varies between 50% and 100% (i.e., 0.5 and 1) and can be interpreted as the probability (expressed as a percentage) that a parameter (described by its posterior distribution) is strictly positive or strictly negative, whichever is more probable. It is mathematically defined as the proportion of the posterior distribution that has the same sign as the median. Although expressed differently, this index is strongly correlated with the frequentist p-value.

Note that in some (rare) cases, especially when used with model averaged posteriors (see weighted_posteriors() or brms::posterior_average), pd can be smaller than 0.5, reflecting high credibility of 0.

Usage

p_direction(x, ...)

pd(x, ...)

## S3 method for class 'numeric'
p_direction(x, method = "direct", null = 0, ...)

## S3 method for class 'data.frame'
p_direction(x, method = "direct", null = 0, ...)

## S3 method for class 'MCMCglmm'
p_direction(x, method = "direct", null = 0, ...)

## S3 method for class 'emmGrid'
p_direction(x, method = "direct", null = 0, ...)

## S3 method for class 'stanreg'
p_direction(
  x,
  effects = c("fixed", "random", "all"),
  component = c("location", "all", "conditional", "smooth_terms", "sigma",
    "distributional", "auxiliary"),
  parameters = NULL,
  method = "direct",
  null = 0,
  ...
)

## S3 method for class 'brmsfit'
p_direction(
  x,
  effects = c("fixed", "random", "all"),
  component = c("conditional", "zi", "zero_inflated", "all"),
  parameters = NULL,
  method = "direct",
  null = 0,
  ...
)

## S3 method for class 'BFBayesFactor'
p_direction(x, method = "direct", null = 0, ...)

Arguments

x

Vector representing a posterior distribution. Can also be a Bayesian model (stanreg, brmsfit or BayesFactor).

...

Currently not used.

method

Can be "direct" or one of the density estimation methods, such as "kernel", "logspline" or "KernSmooth". If "direct" (default), the computation is based on the raw proportions of samples above and below 0. Otherwise, the result is based on the Area under the Curve (AUC) of the estimated density function.

null

The value considered as a "null" effect. Traditionally 0, but could also be 1 in the case of ratios.

effects

Should results for fixed effects, random effects or both be returned? Only applies to mixed models. May be abbreviated.

component

Should results for all parameters, parameters for the conditional model or the zero-inflated part of the model be returned? May be abbreviated. Only applies to brms-models.

parameters

Regular expression pattern that describes the parameters that should be returned. Meta-parameters (like lp__ or prior_) are filtered by default, so only parameters that typically appear in the summary() are returned. Use parameters to select specific parameters for the output.

Details

What is the pd?

The Probability of Direction (pd) is an index of effect existence, ranging from 50% to 100%, representing the certainty with which an effect goes in a particular direction (i.e., is positive or negative). Beyond its simplicity of interpretation, understanding and computation, this index also presents other interesting properties:

  • It is independent of the model: It is based solely on the posterior distributions and does not require any additional information from the data or the model.

  • It is robust to the scale of both the response variable and the predictors.

  • It is strongly correlated with the frequentist p-value, and can thus be used to draw parallels and give some reference to readers unfamiliar with Bayesian statistics.
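The scale-robustness property listed above can be illustrated with a minimal base-R sketch. The helper `pd_sketch()` is hypothetical (it is not part of bayestestR) and assumes that rescaling a variable multiplies the posterior draws of a coefficient by a positive constant, which leaves every draw's sign unchanged.

```r
# Hypothetical helper (not part of bayestestR): direct pd with respect to 0
pd_sketch <- function(x) max(mean(x > 0), mean(x < 0))

set.seed(123)
posterior <- rnorm(1000, mean = 0.3, sd = 1)

pd_original <- pd_sketch(posterior)
pd_rescaled <- pd_sketch(posterior * 1000)  # e.g., a change of measurement unit

# Multiplying by a positive constant does not change the sign of any draw,
# so both values are identical
identical(pd_original, pd_rescaled)
```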

Relationship with the p-value

In most cases, the pd appears to correspond directly to the frequentist one-sided p-value through the formula p_{one-sided} = 1 - pd/100, and to the two-sided p-value (the most commonly reported one) through the formula p_{two-sided} = 2 * (1 - pd/100), where pd is expressed as a percentage. Thus, two-sided p-values of .1, .05, .01 and .001 would correspond approximately to a pd of 95%, 97.5%, 99.5% and 99.95%, respectively. See also pd_to_p().
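As a rough illustration of this correspondence, the conversion can be written as a small base-R function. The name `pd_to_p_sketch()` is hypothetical and this is not the package's own pd_to_p(); it also assumes the pd is given as a proportion (0.5 to 1) rather than a percentage, so the division by 100 is dropped.

```r
# Hypothetical helper; pd is a proportion between 0.5 and 1, not a percentage
pd_to_p_sketch <- function(pd, two_sided = TRUE) {
  p <- 1 - pd              # one-sided p-value
  if (two_sided) p <- 2 * p
  p
}

pd_values <- c(0.95, 0.975, 0.995, 0.9995)
round(pd_to_p_sketch(pd_values), 4)                     # two-sided: .1, .05, .01, .001
round(pd_to_p_sketch(pd_values, two_sided = FALSE), 4)  # one-sided p-values
```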

Methods of computation

The simplest and most direct way to compute the pd is to 1) look at the median's sign, 2) select the portion of the posterior of the same sign and 3) compute the percentage that this portion represents. This "direct" method is the most straightforward, but its precision is directly tied to the number of posterior draws. The second approach relies on density estimation: it starts by estimating the density function (for which many methods are available) and then computes the area under the curve (AUC) of the density on the side of 0 that matches the median's sign.
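Both approaches can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: it uses stats::density() with its default Gaussian kernel and a simple trapezoidal rule for the AUC, whereas p_direction() may use other estimators and integration schemes.

```r
set.seed(42)
posterior <- rnorm(2000, mean = 1, sd = 1)
positive <- median(posterior) > 0  # step 1: the median's sign

# Direct approach: proportion of draws sharing the median's sign
pd_direct <- if (positive) mean(posterior > 0) else mean(posterior < 0)

# Density approach: estimate the density, then integrate it on that side of 0
dens <- density(posterior, n = 2048)
keep <- if (positive) dens$x > 0 else dens$x < 0
xs <- dens$x[keep]
ys <- dens$y[keep]
pd_density <- sum(diff(xs) * (head(ys, -1) + tail(ys, -1)) / 2)  # trapezoidal rule

c(direct = pd_direct, density = pd_density)
```

The two values should be close to each other; the density-based estimate depends on the bandwidth and grid resolution rather than only on the raw count of draws.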

Strengths and Limitations

Strengths: Straightforward computation and interpretation. Objective property of the posterior distribution. 1:1 correspondence with the frequentist p-value.

Limitations: Limited information favoring the null hypothesis.

Value

Values between 0.5 and 1 corresponding to the probability of direction (pd).

Note that in some (rare) cases, especially when used with model averaged posteriors (see weighted_posteriors() or brms::posterior_average), pd can be smaller than 0.5, reflecting high credibility of 0. To detect such cases, method = "direct" must be used.
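A small base-R sketch shows how a pd below 0.5 can arise. The posterior below is made up for illustration, and the calculation assumes the direct method takes the larger of the two strict-sign proportions, so draws that are exactly 0 count toward neither side.

```r
# Hypothetical posterior in which 60% of the draws are exactly 0, as can
# happen when averaging over models that exclude the parameter
posterior <- c(rep(0, 60), rep(0.8, 25), rep(-0.5, 15))

prop_positive <- mean(posterior > 0)  # 0.25
prop_negative <- mean(posterior < 0)  # 0.15
pd_direct <- max(prop_positive, prop_negative)
pd_direct  # 0.25, i.e., below 0.5
```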

Note

There is also a plot()-method implemented in the see-package.

References

Makowski D, Ben-Shachar MS, Chen SHA, Lüdecke D (2019). Indices of Effect Existence and Significance in the Bayesian Framework. Frontiers in Psychology, 10:2767. doi:10.3389/fpsyg.2019.02767

See Also

pd_to_p() to convert between Probability of Direction (pd) and p-value.

Examples

library(bayestestR)

# Simulate a posterior distribution of mean 1 and SD 1
# ----------------------------------------------------
posterior <- rnorm(1000, mean = 1, sd = 1)
p_direction(posterior)
p_direction(posterior, method = "kernel")

# Simulate a dataframe of posterior distributions
# -----------------------------------------------
df <- data.frame(replicate(4, rnorm(100)))
p_direction(df)
p_direction(df, method = "kernel")
## Not run: 
# rstanarm models
# -----------------------------------------------
if (require("rstanarm")) {
  model <- rstanarm::stan_glm(mpg ~ wt + cyl,
    data = mtcars,
    chains = 2, refresh = 0
  )
  p_direction(model)
  p_direction(model, method = "kernel")
}

# emmeans
# -----------------------------------------------
if (require("emmeans")) {
  p_direction(emtrends(model, ~1, "wt"))
}

# brms models
# -----------------------------------------------
if (require("brms")) {
  model <- brms::brm(mpg ~ wt + cyl, data = mtcars)
  p_direction(model)
  p_direction(model, method = "kernel")
}

# BayesFactor objects
# -----------------------------------------------
if (require("BayesFactor")) {
  bf <- ttestBF(x = rnorm(100, 1, 1))
  p_direction(bf)
  p_direction(bf, method = "kernel")
}

## End(Not run)

bayestestR documentation built on April 7, 2023, 5:09 p.m.