sampleLCAMixture: Telescoping sampling of the mixture of LCA models where a...
In telescope: Bayesian Mixtures with an Unknown Number of Components

sampleLCAMixture

R Documentation

Telescoping sampling of the mixture of LCA models where a prior on the number of components K is specified.

Description

The MCMC scheme is implemented as suggested in Malsiner-Walli et al (2024).
Also the priors on the model parameters are specified as in Malsiner-Walli et al (2024), see the vignette for details and notation.

Usage

sampleLCAMixture(
  y,
  S,
  L,
  pi,
  eta,
  mu,
  phi,
  a_00,
  a_mu,
  a_phi,
  b_phi,
  c_phi,
  d_phi,
  M,
  burnin,
  thin,
  Kmax,
  s_mu,
  s_phi,
  eps,
  G,
  priorOnWeights,
  d0,
  priorOnK,
  verbose = FALSE
)

Arguments

`y`	A numeric matrix; containing the data where categories are coded with numbers.
`S`	A numeric matrix; containing the initial cluster assignments.
`L`	A numeric scalar; specifiying the number of classes within each component.
`pi`	A numeric matrix; containing the initial class-specific occurrence probabilities.
`eta`	A numeric vector; containing the initial cluster sizes.
`mu`	A numeric matrix; containing the initial central component occurrence probabilities.
`phi`	A numeric matrix; containing the initial component- and variable-specific precisions.
`a_00`	A numeric scalar; specifying the prior parameter a_00.
`a_mu`	A numeric vector; containing the prior parameter a_mu.
`a_phi`	A numeric vector; containing the prior parameter a_phi for each variable.
`b_phi`	A numeric vector; containing the initial value of b_phi for each variable.
`c_phi`	A numeric vector; containing the prior parameter c_phi for each variable.
`d_phi`	A numeric vector; containing the prior parameter d_phi for each variable.
`M`	A numeric scalar; specifying the number of recorded iterations.
`burnin`	A numeric scalar; specifying the number of burn-in iterations.
`thin`	A numeric scalar; specifying the thinning used for the iterations.
`Kmax`	A numeric scalar; the maximum number of components.
`s_mu`	A numeric scalar; specifying the standard deviation of the proposal in the Metropolis-Hastings step when sampling mu.
`s_phi`	A numeric scalar; specifying the standard deviation of the proposal in the Metropolis-Hastings step when sampling phi.
`eps`	A numeric scalar; a regularizing constant to bound the Dirichlet proposal away from the boundary in the Metropolis-Hastings step when sampling mu.
`G`	A character string; either `"MixDynamic"` or `"MixStatic"`.
`priorOnWeights`	A named list; providing the prior on the mixture weights.
`d0`	A numeric scalar; containing the Dirichlet prior parameter on the class weights.
`priorOnK`	A named list; providing the prior on the number of components K, see `priorOnK_spec()`.
`verbose`	A logical; indicating if some intermediate clustering results should be printed.

Value

A named list containing:

"Eta": sampled weights.
"S": sampled assignments.
"K": sampled number of components.
"Kplus": number of filled, i.e., non-empty components, for each iteration.
"Nk": number of observations assigned to the different components, for each iteration.
"Nl": number of observations assigned to the different classes within the components, for each iteration.
"e0": sampled Dirichlet parameter of the prior on the weights (if e_0 is random).
"alpha": sampled Dirichlet parameter of the prior on the weights (if \alpha is random).
"acc": logical vector indicating acceptance in the Metropolis-Hastings step when sampling either e_0 or \alpha.
"Mu": sampled central component occurrence probabilities.
"Phi": sampled precisions.
"acc_mu": the acceptance rate in the Metropolis-Hastings step when sampling \mu_{k,j}.
"acc_phi": the acceptance rate in the Metropolis-Hastings step when sampling \phi_{k,j}.
"nonnormpost_mode": parameter values corresponding to the mode of the nonnormalized posterior.
"Pi_k": sampled weighted component occurrence probabilities.

Examples

data("SimData", package = "telescope")
y <- as.matrix(SimData[, 1:30])
z <- SimData[, 31]
N <- nrow(y)
r <- ncol(y)
    
M <- 5
thin <- 1
burnin <- 0
Kmax <- 50  
Kinit <- 10
    
G <- "MixDynamic"
priorOnAlpha <- priorOnAlpha_spec("gam_1_2")
priorOnK <- priorOnK_spec("Pois_1")
d0 <- 1  

cat <- apply(y, 2, max)
a_mu <- rep(20, sum(cat))
mu_0 <- matrix(rep(rep(1/cat, cat), Kinit),
  byrow = TRUE, nrow = Kinit)

c_phi <- 30; d_phi <- 1; b_phi <- rep(10, r)
a_phi <- rep(1, r)
phi_0 <- matrix(cat, Kinit, r, byrow = TRUE)

a_00 <- 0.05

s_mu <- 2; s_phi <- 2; eps <- 0.01 

set.seed(1234)
cl_y <- kmeans(y, centers = Kinit, nstart = 100, iter.max = 50)
S_0 <- cl_y$cluster
eta_0 <- cl_y$size/N

I_0 <- rep(1L, N)
L <- 2
for (k in 1:Kinit) {
  cl_size <- sum(S_0 == k)
  I_0[S_0 == k] <- rep(1:L, length.out = cl_size)
}

index <- c(0, cumsum(cat))
low <- (index + 1)[-length(index)]
up <- index[-1]

pi_km <- array(NA_real_, dim = c(Kinit, L, sum(cat)))
rownames(pi_km) <- paste0("k_", 1:Kinit)
for (k in 1:Kinit) {
  for (l in 1:L) {
    index <- (S_0 == k) & (I_0 == l)
    for (j in 1:r) {
      pi_km[k, l, low[j]:up[j]] <- tabulate(y[index, j], cat[j]) / sum(index)
    }
  }
}
pi_0 <- pi_km 

result <- sampleLCAMixture(
    y, S_0, L,
    pi_0, eta_0, mu_0, phi_0,
    a_00, a_mu, a_phi, b_phi, c_phi, d_phi,
    M, burnin, thin, Kmax,
    s_mu, s_phi, eps,
    G, priorOnAlpha, d0, priorOnK)

telescope documentation built on April 4, 2025, 2:40 a.m.