nClusters: Prior pmf of the number of data clusters for three model...


View source: R/fipp.R

Description

nClusters is a closure: it returns a function that computes a table of prior probability masses for the specified values of K+ (the number of data clusters). The returned function takes as arguments the prior distribution of the number of mixture components K and that prior's parameters (see the examples for details).

Usage

nClusters(
  Kplus,
  N,
  type = c("DPM", "static", "dynamic"),
  alpha = NULL,
  gamma = NULL,
  maxK = NULL,
  log = FALSE
)

Arguments

Kplus

a numeric value or vector of positive integers (that is, 1, 2, ...). It specifies the numbers of data clusters at which the prior probabilities are evaluated.

N

the number of observations in the data.

type

the type of model considered. Three models (static/dynamic MFMs and DPM) are supported.

alpha, gamma

hyperparameters for the symmetric Dirichlet prior. For static MFM, gamma should be specified, while alpha should be specified for all other models (that is, dynamic MFM and DPM).

maxK

the maximum value of K (the number of mixture components) considered. Only needed for the static and dynamic MFMs.

log

logical; if TRUE, the probabilities are returned on the log scale.

Value

nClusters returns a function which takes two arguments:

priorK

a function with support on the positive integers, serving as the prior on K (default: NULL, which is used for the DPM).

priorKparams

a named list of parameters for the function supplied in priorK (default: NULL, which is used for the DPM).
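The two-step calling pattern described above can be sketched for the two model types not covered in the Examples section. This is a minimal sketch, assuming the fipp package is installed; the argument values (N = 100, alpha = 1, gamma = 1, maxK = 30, a Poisson prior with lambda = 1) are illustrative choices and not recommendations.

```r
# Assumes the fipp package is installed; the guard skips the calls otherwise.
have_fipp <- requireNamespace("fipp", quietly = TRUE)

if (have_fipp) {
  # DPM: only alpha is needed, and the returned function is called with its
  # defaults (priorK = NULL, priorKparams = NULL).
  pmf_dpm <- fipp::nClusters(Kplus = 1:15, N = 100, type = "DPM", alpha = 1)
  p_dpm <- pmf_dpm()

  # Static MFM: gamma and maxK are specified instead of alpha, and a prior
  # on K is supplied when the returned function is called.
  pmf_static <- fipp::nClusters(Kplus = 1:15, N = 100, type = "static",
                                gamma = 1, maxK = 30)
  p_static <- pmf_static(priorK = dpois, priorKparams = list(lambda = 1))
}
```

Because the model settings are fixed when nClusters is called, the same returned function can be re-evaluated under several priors on K without repeating them.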

References

Greve, J., Grün, B., Malsiner-Walli, G., and Frühwirth-Schnatter, S. (2020) Spying on the Prior of the Number of Data Clusters and the Partition Distribution in Bayesian Cluster Analysis. https://arxiv.org/abs/2012.12337

Escobar, M. D., and West, M. (1995) Bayesian Density Estimation and Inference Using Mixtures. Journal of the American Statistical Association 90 (430), Taylor & Francis: 577–88. https://www.tandfonline.com/doi/abs/10.1080/01621459.1995.10476550

Miller, J. W., and Harrison, M. T. (2018) Mixture Models with a Prior on the Number of Components. Journal of the American Statistical Association 113 (521), Taylor & Francis: 340–56. https://www.tandfonline.com/doi/full/10.1080/01621459.2016.1255636

Frühwirth-Schnatter, S., Malsiner-Walli, G., and Grün, B. (2020) Generalized Mixtures of Finite Mixtures and Telescoping Sampling. https://arxiv.org/abs/2005.09918

Examples

## First, create the function pmf() for the dynamic MFM with N = 100
## and alpha = 1, evaluating K+ from 1 to 15.
## We assume that K is at most 30 by setting maxK = 30;
## increase this value for a more realistic analysis.
pmf <- nClusters(Kplus = 1:15, N = 100, type = "dynamic",
                 alpha = 1, maxK = 30)

## Then, specify the prior for K so that the pmf can be evaluated
## between K+ = 1 and K+ = 15.
pmf(dgeom, list(prob = 0.1))

## we can also compare this result with a different prior setting
pmf(dpois, list(lambda = 1))
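For the DPM case, the prior on K+ has a standard closed form for the Dirichlet process: P(K+ = k | N, alpha) = |s(N, k)| alpha^k Gamma(alpha) / Gamma(alpha + N), where |s(N, k)| is an unsigned Stirling number of the first kind. The base-R sketch below evaluates this formula in log space as an independent sanity check. The helpers log_stirling1() and dpm_kplus_pmf() are hypothetical names introduced here; they are not part of fipp and are not claimed to match its internal implementation.

```r
# Log of the unsigned Stirling numbers of the first kind, log|s(n, k)| for
# k = 1..n, built with the recurrence |s(m+1, k)| = m |s(m, k)| + |s(m, k-1)|.
log_stirling1 <- function(n) {
  ls <- 0  # log|s(1, 1)| = log(1)
  if (n >= 2) {
    for (m in 1:(n - 1)) {
      a <- c(log(m) + ls, -Inf)  # log(m * |s(m, k)|), k = 1..m+1
      b <- c(-Inf, ls)           # log|s(m, k-1)|,     k = 1..m+1
      mx <- pmax(a, b)           # log-sum-exp for numerical stability
      ls <- mx + log(exp(a - mx) + exp(b - mx))
    }
  }
  ls
}

# Prior pmf of K+ under a Dirichlet process with concentration alpha.
dpm_kplus_pmf <- function(Kplus, N, alpha, log = FALSE) {
  stopifnot(all(Kplus >= 1), all(Kplus <= N), alpha > 0)
  lp <- log_stirling1(N)[Kplus] + Kplus * log(alpha) +
    lgamma(alpha) - lgamma(alpha + N)
  if (log) lp else exp(lp)
}

# Same settings as a DPM call with N = 100 and alpha = 1.
p <- dpm_kplus_pmf(Kplus = 1:15, N = 100, alpha = 1)
```

Evaluated over the full range Kplus = 1:N, the masses sum to one, which gives a quick check that a chosen Kplus range covers most of the prior mass.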

fipp documentation built on Feb. 11, 2021, 5:07 p.m.