pp_mixture.brmsfit: Posterior Probabilities of Mixture Component Memberships
In brms: Bayesian Regression Models using 'Stan'

pp_mixture.brmsfit

R Documentation

Posterior Probabilities of Mixture Component Memberships

Description

Compute the posterior probabilities of mixture component memberships for each observation, including uncertainty estimates.

Usage

## S3 method for class 'brmsfit'
pp_mixture(
  x,
  newdata = NULL,
  re_formula = NULL,
  resp = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  log = FALSE,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  ...
)

pp_mixture(x, ...)

Arguments

`x`	An R object usually of class `brmsfit`.
`newdata`	An optional data.frame for which to evaluate predictions. If `NULL` (default), the original data of the model is used. `NA` values within factors (excluding grouping variables) are interpreted as if all dummy variables of this factor are zero. This makes it possible, for instance, to predict the grand mean when using sum coding. `NA` values within grouping variables are treated as a new level.
`re_formula`	Formula containing group-level effects to be considered in the prediction. If `NULL` (default), include all group-level effects; if `NA` or `~0`, include no group-level effects.
`resp`	Optional names of response variables. If specified, predictions are performed only for the specified response variables.
`ndraws`	Positive integer indicating how many posterior draws should be used. If `NULL` (the default), all draws are used. Ignored if `draw_ids` is not `NULL`.
`draw_ids`	An integer vector specifying the posterior draws to be used. If `NULL` (the default), all draws are used.
`log`	Logical; indicates whether to return probabilities on the log scale.
`summary`	Should summary statistics be returned instead of the raw values? Default is `TRUE`.
`robust`	If `FALSE` (the default), the mean is used as the measure of central tendency and the standard deviation as the measure of variability. If `TRUE`, the median and the median absolute deviation (MAD) are used instead. Only used if `summary` is `TRUE`.
`probs`	The percentiles to be computed by the `quantile` function. Only used if `summary` is `TRUE`.
`...`	Further arguments passed to `prepare_predictions` that control several aspects of data validation and prediction.
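
To illustrate how `robust` and `probs` shape the summary statistics, the following base R sketch summarizes a vector of draws for a single observation and component. The helper `summarize_draws_sketch()` and the simulated draws are hypothetical and only mimic the behavior described above; they are not part of brms.

```r
## hypothetical helper, not part of brms: summarize one vector of draws
summarize_draws_sketch <- function(draws, robust = FALSE,
                                   probs = c(0.025, 0.975)) {
  if (robust) {
    est <- median(draws)  # robust central tendency
    err <- mad(draws)     # robust variability
  } else {
    est <- mean(draws)
    err <- sd(draws)
  }
  q <- quantile(draws, probs = probs, names = FALSE)
  c(Estimate = est, Est.Error = err, setNames(q, paste0("Q", probs * 100)))
}

set.seed(1)
## made-up membership probabilities for one observation and one component
draws <- rbeta(1000, 8, 2)

summarize_draws_sketch(draws)
summarize_draws_sketch(draws, robust = TRUE, probs = c(0.1, 0.5, 0.9))
```

With `robust = TRUE` the estimate coincides with the 50% quantile, which is why the second call reports the same value twice.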

Details

The returned probabilities can be written as P(K_n = k | Y_n), that is, the posterior probability that observation n originates from component k. They are computed using Bayes' theorem:

P(K_n = k | Y_n) = P(Y_n | K_n = k) P(K_n = k) / P(Y_n),

where P(Y_n | K_n = k) is the (posterior) likelihood of observation n for component k, P(K_n = k) is the (posterior) mixing probability of component k (i.e. parameter theta<k>), and

P(Y_n) = \sum_{k=1,...,K} P(Y_n | K_n = k) P(K_n = k)

is a normalizing constant.
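
For a single posterior draw, the computation above can be sketched in base R as follows. The parameter values (`theta`, `mu`, `sigma`) and the observations `y` are made up for illustration, and a two-component Gaussian mixture is assumed. Conceptually, the same calculation is repeated for every posterior draw of a fitted model. The sketch works on the log scale and normalizes with log-sum-exp, which is an implementation choice made here for numerical stability.

```r
## made-up values standing in for one posterior draw (assumptions)
theta <- c(0.65, 0.35)   # mixing probabilities P(K_n = k)
mu    <- c(0, 2)         # component means
sigma <- c(1, 1)         # component standard deviations
y     <- c(-0.5, 1.0, 3.2)

## log of P(Y_n | K_n = k) P(K_n = k): an N x K matrix
log_num <- sapply(seq_along(theta), function(k) {
  log(theta[k]) + dnorm(y, mu[k], sigma[k], log = TRUE)
})

## log P(Y_n) via a numerically stable log-sum-exp over components
row_max <- apply(log_num, 1, max)
log_py  <- row_max + log(rowSums(exp(log_num - row_max)))

log_pp <- log_num - log_py   # log-scale membership probabilities
pp     <- exp(log_pp)        # natural-scale membership probabilities

round(pp, 3)
rowSums(pp)                  # each row sums to 1
```

Here `log_pp` corresponds to what would be returned with `log = TRUE`, and `pp` to the default natural scale.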

Value

If `summary = TRUE`, an N x E x K array, where N is the number of observations, K is the number of mixture components, and E is equal to `length(probs) + 2`. If `summary = FALSE`, an S x N x K array, where S is the number of posterior draws.
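
The relationship between the two return shapes can be sketched with a simulated array. The draws below are random numbers rather than output of a fitted model, and the reduction with `apply()` only illustrates the layout; it is not the brms implementation.

```r
set.seed(42)
S <- 200; N <- 5; K <- 2
probs <- c(0.025, 0.975)

## fake S x N x K array of membership probabilities
## (for each draw and observation, the K values sum to 1)
p1 <- matrix(runif(S * N), nrow = S, ncol = N)
draws <- array(c(p1, 1 - p1), dim = c(S, N, K))

## summarize over draws for each observation and component
summ <- apply(draws, c(2, 3), function(d) {
  c(mean(d), sd(d), quantile(d, probs, names = FALSE))
})
## apply() puts the statistics first (E x N x K); reorder to N x E x K
summ <- aperm(summ, c(2, 1, 3))
dimnames(summ) <- list(
  NULL,
  c("Estimate", "Est.Error", paste0("Q", probs * 100)),
  NULL
)

dim(draws)            # S x N x K
dim(summ)             # N x E x K with E = length(probs) + 2
summ[, "Estimate", ]  # N x K matrix of point estimates
```

The first slice along the second dimension holds the point estimates, which is why indexing such as `ppm[, 1, ]` yields one row per observation and one column per component.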

Examples

## Not run: 
## simulate some data
set.seed(1234)
dat <- data.frame(
  y = c(rnorm(100), rnorm(50, 2)),
  x = rnorm(150)
)
## fit a simple normal mixture model
mix <- mixture(gaussian, nmix = 2)
prior <- c(
  prior(normal(0, 5), Intercept, dpar = mu1),
  prior(normal(0, 5), Intercept, dpar = mu2),
  prior(dirichlet(2, 2), theta)
)
fit1 <- brm(bf(y ~ x), dat, family = mix,
            prior = prior, chains = 2, init = 0)
summary(fit1)

## compute the membership probabilities
ppm <- pp_mixture(fit1)
str(ppm)

## extract point estimates for each observation
head(ppm[, 1, ])

## classify every observation according to
## the most likely component
apply(ppm[, 1, ], 1, which.max)

## End(Not run)

brms documentation built on Sept. 23, 2024, 5:08 p.m.