Generate data based on the parameters of a structural equation model in lavaan model syntax.
.model |
A model in lavaan model syntax. |
.empirical |
Logical. If |
.handle_negative_definite |
Character string. How should negative definite
indicator correlation matrices be handled? One of |
.return_type |
Character string. One of |
.N |
Integer. The number of observations to generate. Ignored if
|
.skewness |
List. List of predefined values for the skewness of the indicators. |
.kurtosis |
List. List of predefined values for the kurtosis of the indicators. |
... |
|
Generate data for structural equation models including up to 8 constructs if a structural model is given or an unlimited number if only the correlation between constructs is needed. To be precise, if users specify a structural model we support a maximum of 5 exogenous constructs. Depending on the number of exogenous constructs the following number of endogenous constructs is allowed:
If there is 1 exogenous construct : a maximum of 7 endogenous constructs is allowed
If there are 2 exogenous constructs: a maximum of 6 endogenous constructs is allowed
If there are 3 exogenous constructs: a maximum of 5 endogenous constructs is allowed
If there are 4 exogenous constructs: a maximum of 4 endogenous constructs is allowed
If there are 5 exogenous constructs: a maximum of 4 endogenous constructs is allowed
The reason for the limitation is that data is generated such that the model-implied variances of the constructs are always unity. Since the model-implied construct covariance matrix is a complex function of the structural residual variances, which are in turn a complex function of the path coefficients, the equation for each construct variance grows massively with each additional construct. Since, for a given number of constructs, the number of possible model specifications grows rapidly, we solved the variance equations symbolically as a function of the path coefficients in Mathematica. With more than 8 constructs these symbolic representations become too large to be computationally feasible.
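To make the unit-variance constraint concrete, the following base R sketch works through the three-construct structural model used in the examples (eta2 ~ 0.4*eta1 and eta3 ~ 0.4*eta1 + 0.35*eta2). It only illustrates the algebra and is not the package's internal code; all variable names are made up for this sketch.

```r
# Path coefficients taken from the example model
b21 <- 0.4   # eta1 -> eta2
b31 <- 0.4   # eta1 -> eta3
b32 <- 0.35  # eta2 -> eta3

# With Var(eta1) = Var(eta2) = 1, the correlation between eta1 and eta2
# equals the path coefficient b21
r12 <- b21

# Structural residual variances required so that each endogenous
# construct has a model-implied variance of exactly 1
zeta2 <- 1 - b21^2
zeta3 <- 1 - (b31^2 + b32^2 + 2 * b31 * b32 * r12)

c(zeta2 = zeta2, zeta3 = zeta3)
```

Even with only two predictors, the expression for the residual variance of eta3 already contains a cross term involving the construct correlation. Each additional construct adds more such terms, which is why the closed-form expressions grow so quickly.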
Generation is based on parameter values given in lavaan model syntax. Currently, linear models and models containing second order constructs are supported. Supplying a model containing nonlinear terms causes an error.
For the structural model equations (~), values are interpreted as path coefficients.
For measurement model equations, values are taken to be loadings if the concept is modeled as a common factor (=~). If the concept is modeled as a composite (<~), values are interpreted as (unscaled) weights! In the latter case, indicators are allowed to be arbitrarily correlated. Hence, the correlation between indicators needs to be set as well.
Indicator correlations, measurement error correlations, and correlations between exogenous constructs are set using the ~~ operator. Note that when writing, for instance, x1 ~~ 0.2*x2 (where x1 and x2 are indicators of some construct eta1), the interpretation depends on whether eta1 is modeled as a composite or a common factor. In the former case x1 ~~ 0.2*x2 is a correlation between indicators; in the latter case it is interpreted as a measurement error correlation.
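The difference between loadings and unscaled weights can be illustrated with a few lines of base R. This is a sketch under stated assumptions, not the package's implementation: it assumes standardized indicators, uncorrelated measurement errors for the common factor, and that weights are rescaled so that the composite has unit variance (consistent with the unit-variance requirement described above). All object names are hypothetical.

```r
# Common factor: eta =~ 0.8*y11 + 0.9*y12 + 0.8*y13
lambda <- c(0.8, 0.9, 0.8)
# Implied indicator correlations: products of loadings off the diagonal,
# ones on the diagonal (standardized indicators, uncorrelated errors)
sigma_cf <- tcrossprod(lambda)
diag(sigma_cf) <- 1

# Composite: eta <~ 0.7*y11 + 0.9*y12 + 0.8*y13
# The indicator correlations are not implied by the weights; they are
# supplied separately via ~~ (here 0.2, 0.3 and 0.5)
w <- c(0.7, 0.9, 0.8)
sigma_co <- matrix(c(1.0, 0.2, 0.3,
                     0.2, 1.0, 0.5,
                     0.3, 0.5, 1.0), nrow = 3, byrow = TRUE)

# Rescale the weights so that the composite has unit variance
w_scaled <- w / sqrt(drop(t(w) %*% sigma_co %*% w))
var_eta  <- drop(t(w_scaled) %*% sigma_co %*% w_scaled)

sigma_cf
w_scaled
var_eta
```

For the common factor the loadings alone determine the indicator correlations, whereas for the composite the same correlations are free inputs and the weights only become interpretable after scaling.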
In addition to supplying numeric values, variable values for parameters are allowed. To achieve this, the package makes use of lavaan's labeling capabilities. Users may replace a given parameter in, e.g., the structural model by a symbolic name and assign a vector of values to that name by passing a "name" = vector_of_values argument to generateData(). Data is then generated for every combination of these values with the remaining fixed parameters.
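The number of data sets implied by a set of variable parameters can be previewed with base R. The sketch below uses expand.grid() on the values for the labels gamma and epsilon from the examples; how generateData() enumerates the combinations internally is an assumption here, and the object name param_grid is made up for illustration.

```r
# Values assigned to the labels used in the model syntax
param_grid <- expand.grid(
  gamma   = c(-0.4, -0.2, 0, 0.2, 0.4),
  epsilon = c(0.1, 0.2, 0.3)
)

# One row per combination of variable parameter values
nrow(param_grid)
head(param_grid)
```

With five values for gamma and three for epsilon this yields 15 combinations, so the number of generated data sets grows multiplicatively with each additional labeled parameter.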
If .return_type is "data.frame" or "matrix", normally distributed data with zero mean and variance-covariance matrix equal to the population indicator correlation matrix (i.e., the matrix that would be returned if .return_type = "cor") is generated.
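As a rough illustration of this step, the following base R sketch draws zero-mean normal data whose population covariance matrix equals a given indicator correlation matrix, using a Cholesky factor. This is one common way to do it and is an assumption of this sketch; the package may use a different sampler. The matrix sigma and all other names are hypothetical.

```r
set.seed(1)

# A small population indicator correlation matrix
sigma <- matrix(c(1.0, 0.2, 0.3,
                  0.2, 1.0, 0.5,
                  0.3, 0.5, 1.0), nrow = 3, byrow = TRUE)

N <- 10000
# Independent standard normal draws, one column per indicator
Z <- matrix(rnorm(N * ncol(sigma)), nrow = N)
# chol() returns the upper triangular factor R with t(R) %*% R = sigma,
# so the columns of Z %*% R have covariance matrix sigma
X <- Z %*% chol(sigma)
colnames(X) <- c("y11", "y12", "y13")

# The sample correlations approximate sigma up to sampling error
round(cor(X), 2)
```

The sample correlation matrix only matches the population matrix approximately; the deviation shrinks as N grows.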
The generated data. Either a data.frame (.return_type = "data.frame"), a numeric matrix (.return_type = "matrix"), or a correlation matrix (.return_type = "cor"). If variable parameters have been set, a nested tibble is returned.
# ==============================================================================
# Without variable parameters
# ==============================================================================
## DGP with constructs modeled as common factors
dgp <- "
# Structural model
eta2 ~ 0.4*eta1
eta3 ~ 0.4*eta1 + 0.35*eta2
# Measurement model
eta1 =~ 0.8*y11 + 0.9*y12 + 0.8*y13
eta2 =~ 0.7*y21 + 0.7*y22 + 0.9*y23
eta3 =~ 0.9*y31 + 0.8*y32 + 0.7*y33
"
dat <- generateData(dgp, .return_type = "cor")
dat
## DGP with a construct modeled as a composite
# If the model contains composites, within-block indicator correlation
# needs to be set as well.
dgp <- "
# Structural model
eta2 ~ 0.2*eta1
eta3 ~ 0.4*eta1 + 0.35*eta2
# Measurement model
eta1 <~ 0.7*y11 + 0.9*y12 + 0.8*y13
eta2 =~ 0.7*y21 + 0.7*y22 + 0.9*y23
eta3 =~ 0.9*y31 + 0.8*y32 + 0.7*y33
# Within block indicator correlation of eta1
y11 ~~ 0.2*y12
y11 ~~ 0.3*y13
y12 ~~ 0.5*y13
"
dat <- generateData(dgp, .return_type = "matrix")
dat[1:4, ]
# ==============================================================================
# With variable parameters
# ==============================================================================
### Linear DGP -----------------------------------------------------------------
# Add a label and assign values to each name
dgp <- "
# Structural model
eta2 ~ 0.2*eta1
eta3 ~ gamma*eta1 + 0.35*eta2
# Measurement model
eta1 <~ 0.7*y11 + 0.9*y12 + 0.8*y13
eta2 =~ 0.7*y21 + 0.7*y22 + 0.9*y23
eta3 =~ 0.9*y31 + 0.8*y32 + 0.7*y33
# Within block indicator correlation
y11 ~~ 0.2*y12
y11 ~~ 0.3*y13
y12 ~~ epsilon*y13
"
dat <- generateData(dgp,
"gamma" = c(-0.4, -0.2, 0, 0.2, 0.4),
"epsilon" = c(0.1, 0.2, 0.3), .return_type = "data.frame")
dat
### DGP containing a second order construct ------------------------------------
# Second order constructs are supported as well.
dgp_2ndorder <- "
## Path model / Regressions
eta2 ~ 0.5*eta1
eta3 ~ 0.35*eta1 + 0.4*eta2
## Composite model
eta1 <~ 0.8*y41 + 0.6*y42 + 0.6*y43
eta2 <~ 2*y51 + 3*y52 + 5*y53
c1 <~ 0.8*y11 + 0.4*y12
c2 <~ 0.5*y21 + 0.3*y22 + 0.2*y23 + 0.4*y24
## Higher order composite
eta3 <~ 0.4*c1 + 0.4*c2
## Composite indicator correlations
# eta1
y41 ~~ 0.5*y42
y41 ~~ 0.5*y43
y42 ~~ 0.5*y43
# eta2
y51 ~~ 0.2*y52
y51 ~~ 0.3*y53
y52 ~~ 0.4*y53
# eta3 (the 2nd order construct)
c1 ~~ 0.49*c2
# c1-c2
y11 ~~ 0.3125*y12
y21 ~~ 0.4*y22
y21 ~~ 0.3*y23
y21 ~~ 0.31*y24
y22 ~~ 0.28*y23
y22 ~~ 0.31*y24
y23 ~~ 0.3*y24
"
dat <- generateData(dgp_2ndorder, .return_type = "data.frame", .empirical = TRUE)
dat[1:5, ]
## Estimate using cSEM
require(cSEM)
aa <- cSEM::csem(dat, dgp_2ndorder)
cSEM::summarize(aa) ## parameter estimates are identical to the DGP