simData2: Synthetic data generator 2

View source: R/fabMix.R

Description

Simulate data from a multivariate normal mixture using a mixture of factor analyzers mechanism.
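For orientation, the standard mixture of factor analyzers model can be written as below. This is the textbook formulation, offered as an assumption about what "mechanism" means here rather than a statement of the package internals. The symbols z_i (cluster label), w_k (cluster weight), y_i (factor vector) and ε_i (error term) are introduced for illustration; μ_k, Λ_k and Σ correspond to the quantities listed under Value.

```latex
\begin{aligned}
z_i &\sim \mathrm{Categorical}(w_1,\ldots,w_K), \\
y_i &\sim \mathcal{N}_q(0, I_q), \qquad
\varepsilon_i \sim \mathcal{N}_p(0, \Sigma), \quad
\Sigma = \mathrm{diag}(\sigma_{11},\ldots,\sigma_{pp}), \\
x_i \mid (z_i = k) &= \mu_k + \Lambda_k y_i + \varepsilon_i, \qquad i = 1,\ldots,n .
\end{aligned}
```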

Usage

simData2(sameSigma, p, q, K.true, n, loading_means, loading_sd, sINV_values)

Arguments

sameSigma

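Logical. If TRUE, all components share a common error covariance; if FALSE, each component has its own (see sINV_values).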

p

The dimension of the multivariate normal distribution (p > 1).

q

Number of factors. It should be strictly smaller than p.

K.true

The number of mixture components (clusters).

n

Sample size.

loading_means

A vector which contains the means of blocks of factor loadings.

Default: loading_means = c(-30,-20,-10,10, 20, 30).

loading_sd

A vector which contains the standard deviations of blocks of factor loadings.

Default: loading_sd = rep(2, length(loading_means)).

sINV_values

A vector containing the diagonal of the (common) inverse covariance matrix, if sameSigma = TRUE. A K.true × p matrix containing the diagonal of the inverse covariance matrix per component, if sameSigma = FALSE.

Default: sINV_values = rgamma(p, shape = 1, rate = 1).
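As a rough illustration of the two shapes sINV_values can take, the arguments could be built in base R as follows. This is a sketch, not code from the package: the object names sINV_common and sINV_per_component are hypothetical, and the commented-out calls assume that the logical flag is the sameSigma argument from the Usage signature.

```r
p <- 5   # number of variables
K <- 2   # number of mixture components

# Block means and standard deviations of the loadings (the documented defaults)
loading_means <- c(-30, -20, -10, 10, 20, 30)
loading_sd <- rep(2, length(loading_means))

set.seed(1)
# Common error structure: one vector of length p
sINV_common <- rgamma(p, shape = 1, rate = 1)
# Component-specific error structure: a K x p matrix, one row per component
sINV_per_component <- matrix(rgamma(K * p, shape = 1, rate = 1),
                             nrow = K, ncol = p)

# Hypothetical calls (they need the fabMix package, so they are not run here):
# simData2(sameSigma = TRUE,  p = p, q = 2, K.true = K, n = 8,
#          sINV_values = sINV_common)
# simData2(sameSigma = FALSE, p = p, q = 2, K.true = K, n = 8,
#          sINV_values = sINV_per_component)
```

All entries must be strictly positive, since they are inverse variances; draws from a gamma distribution satisfy this.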

Value

A list with the following entries:

data

n × p array containing the simulated data.

class

n-dimensional vector containing the class of each observation.

factorLoadings

K.true × p × q array containing the factor loadings Λ_{krj} per cluster k, feature r and factor j, where k=1,…,K; r=1,…,p; j=1,…,q.

means

K.true × p matrix containing the marginal means μ_{kr}, k=1,…,K; r=1,…,p.

variance

p × p diagonal matrix containing the variance of errors σ_{rr}, r=1,…,p. Note that the same variance of errors is assumed for each cluster.

factors

n × q matrix containing the simulated factor values.

weights

K.true-dimensional vector containing the weight of each cluster.

Note

The marginal variance for cluster k is equal to Λ_kΛ_k^{T} + Σ.
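The formula in this Note can be made concrete with a small base-R sketch. It does not call simData2 and is not the package's implementation: the weights, means and loadings are arbitrary assumed values, and all object names are hypothetical. It draws data from a factor-analytic mixture with a common diagonal error covariance and then forms Λ_kΛ_k^{T} + Σ for each cluster.

```r
# Illustrative base-R sketch; this is NOT the fabMix implementation of simData2.
set.seed(1)
n <- 8; p <- 5; q <- 2; K <- 2

w      <- rep(1 / K, K)                               # assumed equal cluster weights
z      <- sample.int(K, n, replace = TRUE, prob = w)  # cluster label per observation
mu     <- matrix(rnorm(K * p, sd = 5), K, p)          # assumed cluster means (K x p)
Lambda <- array(rnorm(K * p * q), c(K, p, q))         # assumed loadings (K x p x q)
sINV   <- 1 / (1:p)                                   # diagonal of inverse error covariance
Sigma  <- diag(1 / sINV)                              # common error covariance (p x p)

y <- matrix(rnorm(n * q), n, q)                       # latent factors, N(0, I_q)
x <- matrix(NA_real_, n, p)
for (i in seq_len(n)) {
  Lk  <- matrix(Lambda[z[i], , ], p, q)               # loadings of the i-th cluster
  eps <- rnorm(p, sd = sqrt(diag(Sigma)))             # independent errors
  x[i, ] <- mu[z[i], ] + as.vector(Lk %*% y[i, ]) + eps
}

# Marginal covariance of each cluster: Lambda_k Lambda_k^T + Sigma
marginal_cov <- lapply(seq_len(K), function(k) {
  Lk <- matrix(Lambda[k, , ], p, q)
  Lk %*% t(Lk) + Sigma
})
```

Each element of marginal_cov is a p × p matrix whose diagonal is at least as large as that of Σ, because Λ_kΛ_k^{T} contributes non-negative terms on the diagonal.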

Author(s)

Panagiotis Papastamoulis

Examples

library('fabMix')

n = 8                # sample size
p = 5                # number of variables
q = 2                # number of factors
K = 2                # true number of clusters

sINV_diag = 1/((1:p))    # diagonal of inverse variance of errors
set.seed(100)
syntheticDataset <- simData2(K.true = K, n = n, q = q, p = p, 
                        sINV_values = sINV_diag)
summary(syntheticDataset)

fabMix documentation built on Feb. 20, 2020, 1:09 a.m.