simData2: Synthetic data generator 2
In fabMix: Overfitting Bayesian Mixtures of Factor Analyzers with Parsimonious Covariance and Unknown Number of Components

Description

Simulate data from a multivariate normal mixture using a mixture of factor analyzers mechanism.
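The mechanism can be illustrated with a minimal base R sketch. It is not the code used by simData2: every name below is hypothetical, and the ways the weights, means and loadings are drawn are assumptions made only for illustration. Each observation is generated as x_i = μ_k + Λ_k y_i + ε_i, with latent factors y_i ~ N(0, I_q), errors ε_i ~ N(0, Σ) and Σ diagonal.

```r
# Minimal sketch of a mixture-of-factor-analyzers generator (base R only).
# All names are hypothetical; this is NOT the fabMix implementation.
set.seed(1)
n <- 200; p <- 5; q <- 2; K <- 2

weights <- rep(1 / K, K)                         # assumed: equal mixing weights
mu      <- matrix(rnorm(K * p, sd = 5), K, p)    # assumed: K x p component means
Lambda  <- array(rnorm(K * p * q), c(K, p, q))   # assumed: K x p x q factor loadings
sigma2  <- 1 / rgamma(p, shape = 1, rate = 1)    # error variances (diagonal of Sigma)

z <- sample.int(K, n, replace = TRUE, prob = weights)  # cluster label of each observation
y <- matrix(rnorm(n * q), n, q)                        # latent factors, y_i ~ N(0, I_q)
x <- matrix(NA_real_, n, p)
for (i in seq_len(n)) {
  L_k <- matrix(Lambda[z[i], , ], p, q)          # p x q loadings of the sampled cluster
  eps <- rnorm(p, sd = sqrt(sigma2))             # errors, eps_i ~ N(0, Sigma)
  x[i, ] <- mu[z[i], ] + as.numeric(L_k %*% y[i, ]) + eps
}
dim(x)
```

Here x plays the role of the simulated data, z of the class labels and y of the factor values.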

Usage

 simData2(sameSigma, p, q, K.true, n, loading_means, loading_sd, sINV_values)

Arguments

 sameSigma       Logical. Whether all components share the same error covariance matrix (see sINV_values).
 p               The dimension of the multivariate normal distribution (p > 1).
 q               Number of factors. It should be strictly smaller than p.
 K.true          The number of mixture components (clusters).
 n               Sample size.
 loading_means   A vector which contains the means of blocks of factor loadings. Default: loading_means = c(-30, -20, -10, 10, 20, 30).
 loading_sd      A vector which contains the standard deviations of blocks of factor loadings. Default: loading_sd = rep(2, length(loading_means)).
 sINV_values     If sameSigma = TRUE, a vector which contains the values of the diagonal of the (common) inverse covariance matrix of the errors. If sameSigma = FALSE, a K.true × p matrix which contains the values of the diagonal of the inverse covariance matrix per component. Default: sINV_values = rgamma(p, shape = 1, rate = 1).

Value

A list with the following entries:

 data             n × p array containing the simulated data.
 class            n-dimensional vector containing the class of each observation.
 factorLoadings   K.true × p × q array containing the factor loadings Λ_{krj} per cluster k, feature r and factor j, where k = 1,…,K.true; r = 1,…,p; j = 1,…,q.
 means            K.true × p matrix containing the marginal means μ_{kr}, k = 1,…,K.true; r = 1,…,p.
 variance         p × p diagonal matrix containing the variance of errors σ_{rr}, r = 1,…,p. Note that the same variance of errors is assumed for each cluster.
 factors          n × q matrix containing the simulated factor values.
 weights          K.true-dimensional vector containing the weight of each cluster.

Note

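The marginal covariance matrix of cluster k is equal to Λ_kΛ_k^{T} + Σ, where Λ_k is the p × q matrix of factor loadings of cluster k and Σ is the diagonal matrix of error variances.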
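As a small worked illustration of this formula, the sketch below builds Λ_kΛ_k^{T} + Σ from two objects shaped like the factorLoadings and variance entries documented above. The objects are hypothetical stand-ins with made-up values, not output of simData2.

```r
# Hypothetical stand-ins with the documented shapes (not produced by simData2).
p <- 3; q <- 2; K <- 2
factorLoadings <- array(seq_len(K * p * q) / 10, c(K, p, q))   # K x p x q array
variance <- diag(c(1, 2, 3))                                   # p x p diagonal Sigma

k <- 1
Lambda_k <- matrix(factorLoadings[k, , ], nrow = p, ncol = q)  # p x q loadings of cluster k
marginal_cov <- Lambda_k %*% t(Lambda_k) + variance            # Lambda_k Lambda_k^T + Sigma
marginal_cov
```

With a list returned by simData2, the same two lines would presumably apply to its factorLoadings and variance entries, assuming they have the shapes listed under Value.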

Author(s)

Panagiotis Papastamoulis

Examples

 library('fabMix')

 n = 8                 # sample size
 p = 5                 # number of variables
 q = 2                 # number of factors
 K = 2                 # true number of clusters
 sINV_diag = 1/(1:p)   # diagonal of inverse variance of errors
 set.seed(100)
 syntheticDataset <- simData2(K.true = K, n = n, q = q, p = p,
                              sINV_values = sINV_diag)
 summary(syntheticDataset)

fabMix documentation built on Feb. 20, 2020, 1:09 a.m.