| sim_IMIFA | R Documentation | 
Functions to simulate data of any size and dimension from an (infinite) mixture of (infinite) factor analysers parameterisation or fitted object.
sim_IMIFA_data(N = 300L,
               G = 3L,
               P = 50L,
               Q = rep(floor(log(P)), G),
               pis = rep(1/G, G),
               mu = NULL,
               psi = NULL,
               loadings = NULL,
               scores = NULL,
               nn = NULL,
               loc.diff = 2,
               non.zero = P,
               forceQg = TRUE,
               method = c("conditional", "marginal"))
sim_IMIFA_model(res,
                method = c("conditional", "marginal"))
| N, G, P | Desired overall number of observations, number of clusters, and number of variables in the simulated data set. Each must be a single integer. | 
| Q | Desired number of cluster-specific latent factors in the simulated data set. Can be specified either as a single integer if all clusters are to have the same number of factors, or a vector of length  | 
| pis | Mixing proportions of the clusters in the data set if  | 
| mu | True values of the mean parameters, either as a single value, a vector of length  | 
| psi | True values of uniqueness parameters, either as a single value, a vector of length  | 
| loadings | True values of the loadings matrix/matrices. Must be supplied in the form of a list of numeric matrices when  | 
| scores | True values of the latent factor scores, as a  | 
| nn | An alternative way to specify the size of each cluster, by giving the exact number of observations in each cluster explicitly. Must sum to  | 
| loc.diff | A parameter to control the closeness of the clusters in terms of the difference in their location vectors. Only relevant if  More specifically,  | 
| non.zero | Controls the number of non-zero entries in each loadings column (per cluster) only when  Must be given as a list of length  | 
| forceQg | A logical indicating whether the upper limit on the number of cluster-specific factors  | 
| method | A switch indicating whether the mixture to be simulated from is the conditional distribution of the data given the latent variables (default), or simply the marginal distribution of the data. | 
| res | An object of class  | 
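The distinction drawn by the method argument can be illustrated with a minimal base R sketch for a single cluster. This is not the IMIFA implementation: the names n, p, q, Lambda, eta, and eps are illustrative assumptions, and the sketch assumes the usual factor-analysis model in which the data given the scores are Gaussian with diagonal uniquenesses.

```r
# Sketch only: one cluster, base R, not the code used inside sim_IMIFA_data.
set.seed(1)
n <- 500L; p <- 6L; q <- 2L                      # illustrative sizes (assumed)
mu     <- rep(0, p)                              # cluster mean
Lambda <- matrix(rnorm(p * q), nrow = p, ncol = q)  # loadings (p x q)
psi    <- rep(0.5, p)                            # uniquenesses (diagonal variances)

# Conditional route: draw the latent scores first, then the data given the scores.
eta    <- matrix(rnorm(n * q), nrow = n, ncol = q)                      # scores ~ N(0, I_q)
eps    <- matrix(rnorm(n * p), nrow = n, ncol = p) %*% diag(sqrt(psi))  # noise ~ N(0, Psi)
x_cond <- sweep(eta %*% t(Lambda) + eps, 2, mu, "+")

# Marginal route: integrate the scores out and draw from N(mu, Lambda Lambda' + Psi).
Sigma  <- tcrossprod(Lambda) + diag(psi)
R      <- chol(Sigma)                            # upper triangular, t(R) %*% R equals Sigma
x_marg <- sweep(matrix(rnorm(n * p), nrow = n, ncol = p) %*% R, 2, mu, "+")

# Both samples target the same covariance matrix; compare each to Sigma.
max(abs(cov(x_cond) - Sigma))
max(abs(cov(x_marg) - Sigma))
```

Under these assumptions the two routes give draws from the same distribution; the conditional route additionally produces the scores eta, which is presumably why true scores can only be meaningful in that setting.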
sim_IMIFA_model is a simple wrapper to sim_IMIFA_data which uses the estimated parameters of a fitted IMIFA related model, as generated by get_IMIFA_results. The necessary parameters must have been originally stored via storeControl in the creation of res.
Invisibly returns a data.frame with N observations (rows) of P variables (columns). The true values of the parameters which generated these data are also stored as attributes.
N, G, P & Q will NOT be inferred from the supplied parameters pis, mu, psi, loadings, scores & nn - rather, the parameters' length/dimensions must adhere to the supplied values of N, G, P & Q.
Missing values are not allowed in any of pis, mu, psi, loadings, scores & nn.
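Because dimensions are not inferred, it can help to check supplied parameters against N, G, P & Q before simulating. The helper below, check_sim_args, is hypothetical and not part of IMIFA. It assumes that mu is given as a P x G matrix (as in the example on this page), that loadings is a list of G matrices with P rows and Q[g] columns, and that nn has length G and sums to N; these shapes are assumptions wherever the argument descriptions above are incomplete.

```r
# Hypothetical helper (not part of IMIFA): returns a character vector of problems found.
check_sim_args <- function(N, G, P, Q, pis = NULL, mu = NULL,
                           loadings = NULL, nn = NULL) {
  Q        <- rep_len(Q, G)            # allow a single Q shared by all clusters
  problems <- character(0)
  supplied <- list(pis = pis, mu = mu, loadings = loadings, nn = nn)

  # Missing values are not allowed in any supplied parameter.
  for (nm in names(supplied)) {
    if (anyNA(unlist(supplied[[nm]]))) {
      problems <- c(problems, paste(nm, "contains missing values"))
    }
  }
  if (!is.null(pis) && (length(pis) != G || !isTRUE(all.equal(sum(pis), 1)))) {
    problems <- c(problems, "pis must have length G and sum to 1")
  }
  if (!is.null(nn) && (length(nn) != G || !isTRUE(sum(nn) == N))) {
    problems <- c(problems, "nn must have length G and sum to N")
  }
  # Assumed shape: one column of P means per cluster.
  if (is.matrix(mu) && !identical(dim(mu), c(as.integer(P), as.integer(G)))) {
    problems <- c(problems, "mu must be a P x G matrix")
  }
  # Assumed shape: a list of G loadings matrices, each P x Q[g].
  if (!is.null(loadings)) {
    ok <- is.list(loadings) && length(loadings) == G &&
      all(vapply(seq_len(G), function(g) {
        is.matrix(loadings[[g]]) && all(dim(loadings[[g]]) == c(P, Q[g]))
      }, logical(1)))
    if (!ok) problems <- c(problems, "loadings must be a list of G matrices of size P x Q[g]")
  }
  problems
}

# Cluster sizes that do not add up to N are flagged; consistent inputs return character(0).
bad  <- check_sim_args(N = 100, G = 3, P = 20, Q = c(2, 2, 5),
                       mu = matrix(0, nrow = 20, ncol = 3), nn = c(30, 30, 30))
good <- check_sim_args(N = 100, G = 3, P = 20, Q = 2,
                       pis = rep(1/3, 3), nn = c(40, 30, 30))
```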
Keefe Murphy - <keefe.murphy@mu.ie>
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
mcmc_IMIFA for fitting an IMIFA related model to the simulated data set.
get_IMIFA_results for generating input for sim_IMIFA_model.
Ledermann for details on the upper bound for Q. Note that this function accounts for isotropic uniquenesses, if psi is supplied in that manner, in computing this bound.
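As a rough guide to the upper bound on Q, the standard Ledermann bound from the factor-analysis literature is floor((2P + 1 - sqrt(8P + 1))/2). The helper below, ledermann_bound, is a hypothetical base R sketch of that formula only; it is not the package's Ledermann function and does not reproduce the adjustment for isotropic uniquenesses mentioned above.

```r
# Hypothetical helper: the standard (non-isotropic) Ledermann bound for P variables.
ledermann_bound <- function(P) {
  floor((2 * P + 1 - sqrt(8 * P + 1)) / 2)
}

bounds <- ledermann_bound(c(20, 50))   # P = 20 and the default P = 50
bounds                                 # 14 40

# The Q used in the example on this page stays below the bound for P = 20.
all(c(2, 2, 5) <= ledermann_bound(20))
```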
# Simulate 100 observations from 3 balanced clusters with cluster-specific numbers of latent factors
# Specify isotropic uniquenesses within each cluster
# Supply cluster means directly
sim_data  <- sim_IMIFA_data(N=100, G=3, P=20, Q=c(2, 2, 5), psi=1:3,
                            mu=matrix(rnorm(60, -2 + 1:3, 1), nrow=20, ncol=3, byrow=TRUE))
names(attributes(sim_data))
labels    <- attr(sim_data, "Labels")
# Visualise the data in two-dimensions
plot(cmdscale(dist(sim_data), k=2), col=labels)
# Examine the overlap with a pairs plot of 5 randomly chosen variables
pairs(sim_data[,sample(1:20, 5)], col=labels)
# Fit a MIFA model to these data
# tmp     <- mcmc_IMIFA(sim_data, method="MIFA", range.G=3, n.iters=5000)
# Simulate from this model
# res     <- get_IMIFA_results(tmp, zlabels=labels)
# sim_mod <- sim_IMIFA_model(res)