piv_sim: Generate Data from a Gaussian Nested Mixture
In pivmet: Pivotal Methods for Bayesian Relabelling and k-Means Clustering

Description

Simulate N observations from a nested Gaussian mixture model with k pre-specified components and uniform group probabilities 1/k, where each group is itself a mixture of two subgroups.

Usage

piv_sim(
  N,
  k,
  Mu,
  stdev,
  Sigma.p1 = diag(2),
  Sigma.p2 = 100 * diag(2),
  W = c(0.5, 0.5)
)

Arguments

`N`	The desired sample size.
`k`	The desired number of mixture components.
`Mu`	The input mean vector of length `k` for univariate Gaussian mixtures; the input `k \times D` matrix with the means' coordinates for multivariate Gaussian mixtures.
`stdev`	For univariate mixtures, the `k \times 2` matrix of input standard deviations, where the first column contains the parameters for subgroup 1, and the second column contains the parameters for subgroup 2.
`Sigma.p1`	The `D \times D` covariance matrix for the first subgroup. For multivariate mixtures only.
`Sigma.p2`	The `D \times D` covariance matrix for the second subgroup. For multivariate mixtures only.
`W`	The vector for the mixture weights of the two subgroups.

Details

The function simulates values from a double (nested) univariate Gaussian mixture:

(Y_i|Z_i=j) \sim \sum_{s=1}^{2} p_{js}\, \mathcal{N}(\mu_{j}, \sigma^{2}_{js}),

or from a multivariate nested Gaussian mixture:

(Y_i|Z_i=j) \sim \sum_{s=1}^{2} p_{js}\, \mathcal{N}_{D}(\bm{\mu}_{j}, \Sigma_{s}),

where \sigma^{2}_{js} is the variance for group j and subgroup s (the argument stdev specifies the k x 2 matrix of standard deviations for univariate mixtures); \Sigma_s is the covariance matrix for subgroup s, s=1,2, with the two matrices specified by Sigma.p1 and Sigma.p2 respectively; \mu_j and \bm{\mu}_j, \ j=1,\ldots,k, are the group means in the univariate and multivariate case respectively, both specified by the argument Mu (a vector in the univariate case, a matrix in the multivariate case); W is a vector of length 2 containing the subgroup weights.
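
The two formulas above can be read as a two-stage sampling scheme: draw the group, draw the subgroup, then draw the observation. The following base-R sketch illustrates that reading. It is not the pivmet implementation: the helper names `sim_nested_sketch` and `sim_nested_mv_sketch` are hypothetical, and the sketch assumes that the subgroup weights are the same for every group and are given by `W`.

```r
# Hypothetical helpers (not part of pivmet) illustrating the nested mixture.

# Univariate case: Y_i | Z_i = j, s ~ N(mu_j, sigma_js^2)
sim_nested_sketch <- function(N, k, Mu, stdev, W = c(0.5, 0.5)) {
  z <- sample.int(k, N, replace = TRUE)            # group labels, uniform on 1:k
  s <- sample.int(2, N, replace = TRUE, prob = W)  # subgroup labels (assumed shared W)
  # stdev[cbind(z, s)] picks the entry in row z_i, column s_i for each i
  y <- rnorm(N, mean = Mu[z], sd = stdev[cbind(z, s)])
  list(y = y, true.group = z, subgroup = s)
}

# Multivariate case: Y_i | Z_i = j, s ~ N_D(mu_j, Sigma_s)
sim_nested_mv_sketch <- function(N, k, Mu, Sigma.p1, Sigma.p2, W = c(0.5, 0.5)) {
  D <- ncol(Mu)
  z <- sample.int(k, N, replace = TRUE)
  s <- sample.int(2, N, replace = TRUE, prob = W)
  # chol() returns the upper-triangular R with t(R) %*% R = Sigma, so rows of
  # (standard normal noise) %*% R have covariance Sigma
  R <- list(chol(Sigma.p1), chol(Sigma.p2))
  noise <- matrix(rnorm(N * D), N, D)
  y <- Mu[z, , drop = FALSE]
  for (m in 1:2) {
    idx <- which(s == m)
    if (length(idx) > 0) {
      y[idx, ] <- y[idx, , drop = FALSE] + noise[idx, , drop = FALSE] %*% R[[m]]
    }
  }
  rownames(y) <- NULL
  list(y = y, true.group = z, subgroup = s)
}

set.seed(1)
out <- sim_nested_sketch(
  N = 500, k = 3,
  Mu = c(-10, 0, 10),
  stdev = cbind(rep(1, 3), rep(5, 3)),  # column 1: subgroup 1, column 2: subgroup 2
  W = c(0.2, 0.8)
)
out_mv <- sim_nested_mv_sketch(
  N = 500, k = 3,
  Mu = rbind(c(-45, 8), c(45, 0.1), c(100, 8)),
  Sigma.p1 = diag(2), Sigma.p2 = 20 * diag(2),
  W = c(0.2, 0.8)
)
table(out$true.group, out$subgroup)
```

Because all components of a group share the same mean, the second subgroup only widens the spread around each group centre; it does not shift it.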

Value

`y`	The `N` simulated observations.
`true.group`	A vector of integers from `1:k` indicating the values of the latent variables `Z_i`.
`subgroups`	A `k \times N` matrix where the `j`-th row contains the subgroup indices for the observations in the `j`-th group.

Examples


# Bivariate mixture simulation with three components

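N  <- 2000                 # sample size
k  <- 3                    # number of mixture components
D  <- 2                    # dimension of each observation
M1 <- c(-45, 8)
M2 <- c(45, 0.1)
M3 <- c(100, 8)
Mu <- rbind(M1, M2, M3)    # k x D matrix of group means
Sigma.p1 <- diag(D)        # covariance matrix of subgroup 1
Sigma.p2 <- 20 * diag(D)   # covariance matrix of subgroup 2
W  <- c(0.2, 0.8)          # subgroup weights
sim <- piv_sim(N = N, k = k, Mu = Mu, Sigma.p1 = Sigma.p1,
               Sigma.p2 = Sigma.p2, W = W)
graphics::plot(sim$y, xlab = "y[,1]", ylab = "y[,2]")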
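
The univariate case follows the same pattern, with `Mu` given as a vector of length `k` and `stdev` as a `k` x 2 matrix, as described under Arguments. The sketch below uses illustrative values that are assumptions chosen for this example; the call is guarded so that it does nothing when pivmet is not installed.

```r
# Univariate mixture simulation with three components (illustrative values)

if (requireNamespace("pivmet", quietly = TRUE)) {
  set.seed(123)
  N     <- 1000
  k     <- 3
  Mu    <- c(-20, 0, 20)                 # one mean per group
  stdev <- cbind(rep(1, k), rep(5, k))   # column s holds the sd for subgroup s
  W     <- c(0.2, 0.8)                   # subgroup weights
  sim_uni <- pivmet::piv_sim(N = N, k = k, Mu = Mu, stdev = stdev, W = W)
  graphics::hist(as.numeric(sim_uni$y), breaks = 50, main = "", xlab = "y")
  table(sim_uni$true.group)              # group sizes, roughly N/k each
}
```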

pivmet documentation built on June 22, 2024, 9:29 a.m.