simulateData  R Documentation 
Simulate Data From a Lavaan Model Syntax
Description
Simulate data starting from a lavaan model syntax.
Usage
simulateData(model = NULL, model.type = "sem", meanstructure = FALSE,
int.ov.free = TRUE, int.lv.free = FALSE, conditional.x = FALSE,
fixed.x = FALSE,
orthogonal = FALSE, std.lv = TRUE, auto.fix.first = FALSE,
auto.fix.single = FALSE, auto.var = TRUE, auto.cov.lv.x = TRUE,
auto.cov.y = TRUE, ..., sample.nobs = 500L, ov.var = NULL,
group.label = paste("G", 1:ngroups, sep = ""), skewness = NULL,
kurtosis = NULL, seed = NULL, empirical = FALSE,
return.type = "data.frame", return.fit = FALSE,
debug = FALSE, standardized = FALSE)
Arguments
model 
A description of the user-specified model. Typically, the model
is described using the lavaan model syntax. See
model.syntax for more information. Alternatively, a
parameter table (e.g., the output of the lavaanify() function) is also
accepted.

model.type 
Set the model type: possible values
are "cfa", "sem" or "growth". This may affect
how starting values are computed, and may be used to alter the terminology
used in the summary output, or the layout of path diagrams that are
based on a fitted lavaan object.

meanstructure 
If TRUE, the means of the observed
variables enter the model. If "default", the value is set based
on the user-specified model, and/or the values of other arguments.

int.ov.free 
If FALSE, the intercepts of the observed variables
are fixed to zero.

int.lv.free 
If FALSE, the intercepts of the latent variables
are fixed to zero.

conditional.x 
If TRUE, we set up the model conditional on
the exogenous ‘x’ covariates; the model-implied sample statistics
only include the non-x variables. If FALSE, the exogenous ‘x’
variables are modeled jointly with the other variables, and the
model-implied statistics reflect both sets of variables. If
"default", the value is set depending on the estimator, and
whether or not the model involves categorical endogenous variables.

fixed.x 
If TRUE, the exogenous ‘x’ covariates are considered
fixed variables and the means, variances and covariances of these variables
are fixed to their sample values. If FALSE, they are considered
random, and the means, variances and covariances are free parameters. If
"default", the value is set depending on the mimic option.

orthogonal 
If TRUE, the exogenous latent variables
are assumed to be uncorrelated.

std.lv 
If TRUE, the metric of each latent variable is
determined by fixing its variance to 1.0. If FALSE, the metric
of each latent variable is determined by fixing the factor loading of the
first indicator to 1.0.

auto.fix.first 
If TRUE, the factor loading of the first indicator
is set to 1.0 for every latent variable.

auto.fix.single 
If TRUE, the residual variance (if included)
of an observed indicator is set to zero if it is the only indicator of a
latent variable.

auto.var 
If TRUE, the residual variances and the variances
of exogenous latent variables are included in the model and set free.

auto.cov.lv.x 
If TRUE, the covariances of exogenous latent
variables are included in the model and set free.

auto.cov.y 
If TRUE, the covariances of dependent variables
(both observed and latent) are included in the model and set free.

... 
Additional arguments passed to the lavaan
function.

sample.nobs 
Number of observations. If a vector, multiple datasets
are created. If return.type = "matrix" or
return.type = "cov", a list of length(sample.nobs)
is returned, with either the data or covariance matrices, each one
based on the number of observations as specified in sample.nobs.
If return.type = "data.frame", all datasets are merged and
a group variable is added to mimic a multiple group dataset.

ov.var 
The user-specified variances of the observed variables.

group.label 
The group labels that should be used if multiple
groups are created.

skewness 
Numeric vector. The skewness values for the observed variables. Defaults to zero.

kurtosis 
Numeric vector. The kurtosis values for the observed variables. Defaults to zero.

seed 
Set random seed.

empirical 
Logical. If TRUE, the implied moments (Mu and Sigma)
specify the empirical, not the population, mean and covariance matrix.

return.type 
If "data.frame", a data.frame is returned. If
"matrix", a numeric matrix is returned (without any variable names).
If "cov", a covariance matrix is returned (without any variable
names).

return.fit 
If TRUE, return the fitted model that has been used
to generate the data as an attribute (called "fit"); this
may be useful for inspection.

debug 
If TRUE, debugging information is displayed.

standardized 
If TRUE, the residual variances of the observed
variables are set such that the model-implied variances
are unity. This allows regression coefficients and factor loadings
(involving observed variables) to be specified in a standardized metric.
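
As a rough illustration of the standardized argument, the sketch below specifies factor loadings in a standardized metric and checks the resulting sample variances. The one-factor model, its loading values, the sample size and the seed are hypothetical choices made for this example. empirical = TRUE is used only so that the sample moments reproduce the model-implied moments and the check is exact rather than approximate.

```r
library(lavaan)

# Hypothetical one-factor model; loadings are given in a standardized metric
std.model <- ' f =~ 0.7*x1 + 0.8*x2 + 0.6*x3 '

# empirical = TRUE makes the sample moments reproduce the implied moments
stdData <- simulateData(std.model, sample.nobs = 200L,
                        standardized = TRUE, empirical = TRUE, seed = 42)

# variances of the observed variables should all be (numerically) 1
round(diag(cov(stdData)), 3)

# the covariance of x1 and x2 should then equal the product of their
# standardized loadings, 0.7 * 0.8
round(cov(stdData)["x1", "x2"], 3)
```

Without empirical = TRUE the same quantities would only be close to these values, up to sampling error.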

Details
Model parameters can be specified by fixed values in the lavaan
model syntax. If no fixed values are specified, the value zero will be
assumed, except for factor loadings and variances, which are set to unity
by default. By default, multivariate normal data are generated. However,
by providing skewness and/or kurtosis values, non-normal multivariate data
can be generated, using the Vale & Maurelli (1983) method.
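
The following is a minimal sketch of generating non-normal data through the skewness and kurtosis arguments. The one-factor model and the requested values (2 and 7) are assumptions chosen only for illustration. The sample skewness is computed by hand so that no additional package is needed.

```r
library(lavaan)

# Hypothetical one-factor population model
pop.model <- ' f =~ x1 + 0.8*x2 + 1.2*x3 '

# Request the same skewness and kurtosis for each of the three observed
# variables (values chosen only for illustration)
nnData <- simulateData(pop.model, sample.nobs = 1000L,
                       skewness = rep(2, 3), kurtosis = rep(7, 3),
                       seed = 1234)

# rough sample skewness per variable, computed from standardized scores
sample.skew <- sapply(nnData, function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
})
round(sample.skew, 2)
```

The sample values will differ from the requested ones because of sampling error, especially for higher-order moments.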
Value
The generated data, returned as a data.frame
(if return.type="data.frame"),
a numeric matrix (if return.type="matrix"),
or a covariance matrix (if return.type="cov").
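
The return types can be compared side by side with a short sketch. The one-factor model, sample sizes and seed below are hypothetical. The last call passes a vector to sample.nobs to show the merged multiple-group data.frame described under that argument.

```r
library(lavaan)

# Hypothetical one-factor population model
pop.model <- ' f =~ x1 + 0.8*x2 + 1.2*x3 '

# default: a data.frame with one column per observed variable
df <- simulateData(pop.model, sample.nobs = 100L, seed = 1)
class(df)

# "matrix": a numeric matrix with one row per observation
mat <- simulateData(pop.model, sample.nobs = 100L, seed = 1,
                    return.type = "matrix")
dim(mat)

# "cov": the covariance matrix of the generated data
S <- simulateData(pop.model, sample.nobs = 100L, seed = 1,
                  return.type = "cov")
dim(S)

# a vector of sample sizes with the default return type yields one merged
# data.frame with an additional grouping column
mg <- simulateData(pop.model, sample.nobs = c(100L, 200L), seed = 1)
nrow(mg)
names(mg)
```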
Examples
library(lavaan)

# specify population model
population.model <- ' f1 =~ x1 + 0.8*x2 + 1.2*x3
                      f2 =~ x4 + 0.5*x5 + 1.5*x6
                      f3 =~ x7 + 0.1*x8 + 0.9*x9
                      f3 ~ 0.5*f1 + 0.6*f2
                    '
# generate data
set.seed(1234)
myData <- simulateData(population.model, sample.nobs=100L)
# population moments
fitted(sem(population.model))
# sample moments
round(cov(myData), 3)
round(colMeans(myData), 3)
# fit model
myModel <- ' f1 =~ x1 + x2 + x3
             f2 =~ x4 + x5 + x6
             f3 =~ x7 + x8 + x9
             f3 ~ f1 + f2 '
fit <- sem(myModel, data=myData)
summary(fit)
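
As a further sketch, the default parameter values described in the Details section can be checked against a hand-computed covariance matrix. The one-factor model here is hypothetical. The loading of x1, the factor variance and the residual variances are left unspecified, so they are assumed to take the default value of one, which gives an implied covariance matrix of Lambda Lambda' + I. empirical = TRUE is used so that the sample covariance matrix reproduces the implied one.

```r
library(lavaan)

# Hypothetical one-factor model; the loading of x1 and all variances are
# left unspecified and are assumed to default to 1 when simulating
pop.model <- ' f =~ x1 + 0.8*x2 + 1.2*x3 '

# implied covariance matrix computed by hand: Lambda Lambda' + I
lambda <- c(x1 = 1, x2 = 0.8, x3 = 1.2)
Sigma <- tcrossprod(lambda) + diag(3)
dimnames(Sigma) <- list(names(lambda), names(lambda))

# with empirical = TRUE the sample covariance matrix should reproduce Sigma
empData <- simulateData(pop.model, sample.nobs = 50L,
                        empirical = TRUE, seed = 99)
all.equal(cov(empData), Sigma, tolerance = 1e-6)
```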