generate: Generate data using SimSem template



Description

This function generates random data from (1) SimSem objects created with the model function, (2) lavaan scripts or parameter tables, or (3) an MxModel object from the OpenMx package. Notable features include fine control of misspecification and misspecification optimization (for SimSem objects only), as well as the ability to generate non-normal data. In simsem simulations, the sim function calls this function internally to generate data. It can also be called directly for debugging or to create data for use with other analysis programs.

Usage

generate(model, n, maxDraw=50, misfitBounds=NULL, misfitType="f0",
	averageNumMisspec=FALSE, optMisfit=NULL, optDraws=50, 
	createOrder = c(1, 2, 3), indDist=NULL, sequential=FALSE,	
	facDist=NULL, errorDist=NULL, saveLatentVar = FALSE, indLab=NULL, 
	modelBoot=FALSE, realData=NULL, covData=NULL, params=FALSE, group = NULL, 
	empirical = FALSE, ...)

Arguments

model

A SimSem object, a lavaan script or parameter tables, or an MxModel object from the OpenMx package

n

Integer specifying the sample size.

maxDraw

Integer specifying the maximum number of attempts to draw a valid set of parameters (that is, a set with no negative error variances and no standardized coefficients over 1).

misfitBounds

Vector that contains upper and lower bounds of the misfit measure. Sets of parameters drawn that are not within these bounds are rejected.

misfitType

Character vector indicating the fit measure used to assess the misfit of a set of parameters. Can be "f0", "rmsea", "srmr", or "all".

averageNumMisspec

If TRUE, the provided fit will be divided by the number of misspecified parameters.

optMisfit

Character vector of either "min" or "max", requesting the minimum or maximum optimized misfit, respectively. If not NULL, the function draws the number of parameter sets given in optDraws and returns the set that has the minimum or maximum misfit on the given misfit type.

optDraws

Number of parameter sets to draw if optMisfit is not null. The set of parameters with the maximum or minimum misfit will be returned.

createOrder

The order of 1) applying equality/inequality constraints, 2) applying misspecification, and 3) filling unspecified parameters (e.g., residual variances when total variances are specified). This argument is a vector containing 1 (constraint), 2 (misspecification), and 3 (filling parameters) in the desired order. For example, c(1, 2, 3) applies constraints first, then adds the misspecification, and finally fills all parameters. See the draw function for an example of how to use it.

indDist

A SimDataDist object or list of objects for a distribution of indicators. If one object is passed, each indicator will have the same distribution. Use when sequential is FALSE.

sequential

If TRUE, use a sequential method to create data: the factor data are generated first and then passed through a set of equations to obtain the indicator data. If FALSE, create data directly from the model-implied mean and covariance of the indicators.

facDist

A SimDataDist object or list of objects for the distribution of factors. If one object is passed, all factors will have the same distribution. Use when sequential is TRUE.

errorDist

A SimDataDist object or list of objects indicating the distribution of errors. If a single SimDataDist object is specified, each error will be generated with that distribution.

saveLatentVar

If TRUE, the total latent variable scores, residual latent variable scores, and measurement error scores are also provided as the "latentVar" attribute of the generated data by the following line: attr(generatedData, "latentVar"). The sequential argument must be TRUE in order to use this option.

indLab

A vector of indicator labels. When not specified, the variable names are x1, x2, ... xN.

modelBoot

When specified, a model-based bootstrap is used for data generation. See details for further information. This argument requires real data to be passed to realData.

realData

A data.frame containing real data. The data generated will follow the distribution of this data set.

covData

A data.frame containing covariate data, which can have any distribution. This argument is required when users specify GA or KA matrices in the model template (SimSem).

params

If TRUE, return the parameters drawn along with the generated data set. Default is FALSE.

group

The label of the grouping variable.

empirical

Logical. If TRUE, the specified parameters are treated as sample statistics, and data are created so that they reproduce those sample statistics. This argument applies only when a multivariate normal distribution is specified.

...

Additional arguments for the simulateData function.
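The difference made by the empirical argument can be illustrated outside simsem. The sketch below uses mvrnorm from the MASS package, which has an argument of the same name. The population values are hypothetical, and whether generate relies on this exact call internally is an assumption, not something stated on this page.

```r
library(MASS)  # mvrnorm(); MASS is a recommended package distributed with R

set.seed(1)

# Hypothetical population moments for two variables (illustrative only)
sigma <- matrix(c(1.0, 0.5,
                  0.5, 1.0), nrow = 2)
mu <- c(0, 0)

# empirical = FALSE: mu and sigma are population values,
# so the sample moments vary from draw to draw
x_pop <- mvrnorm(n = 100, mu = mu, Sigma = sigma, empirical = FALSE)

# empirical = TRUE: mu and sigma are treated as sample statistics,
# so the sample moments match them up to numerical error
x_emp <- mvrnorm(n = 100, mu = mu, Sigma = sigma, empirical = TRUE)

cov(x_pop)
cov(x_emp)
```

With empirical = TRUE the sample covariance matrix matches sigma up to rounding, which is useful when a simulated data set must carry exact, prespecified sample statistics.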

Details

If a lavaan script or an MxModel object is provided, the model-implied covariance matrix is computed and the createData function is used internally to generate data. The data-generation method depends on whether the indDist argument is specified. For a lavaan script, the data-generation code is modified from the simulateData function.
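As a rough illustration of generating data directly from model-implied moments, the following base R sketch builds the implied covariance matrix of a two-factor CFA and draws normal data from it. It does not reproduce the createData implementation. The loadings, the factor correlation, and the object names (lambda, psi, theta) are hypothetical.

```r
library(MASS)  # mvrnorm(); MASS is a recommended package distributed with R

set.seed(123)

# Hypothetical population values for a two-factor CFA with six indicators
lambda <- matrix(0, nrow = 6, ncol = 2)   # factor loadings
lambda[1:3, 1] <- 0.7
lambda[4:6, 2] <- 0.7
psi <- matrix(c(1.0, 0.5,
                0.5, 1.0), nrow = 2)      # factor covariance matrix
theta <- diag(1 - 0.7^2, 6)               # error variances giving unit indicator variances

# Model-implied covariance matrix and mean vector of the indicators
sigma <- lambda %*% psi %*% t(lambda) + theta
mu <- rep(0, 6)

# Draw multivariate normal data with these moments
dat <- as.data.frame(mvrnorm(n = 200, mu = mu, Sigma = sigma))
names(dat) <- paste0("y", 1:6)
head(dat)
```

Non-normal indicator distributions (the indDist case) would need a different sampler than mvrnorm, which this sketch does not attempt.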

If a SimSem object is specified, the function checks whether the model contains any random parameters or trivial misspecification. If so, real or misspecified parameters are drawn via the draw function. Data can then be generated by one of two methods. In the first method, the function calculates the model-implied covariance matrix (including model misspecification) and generates data in the same way as for a lavaan script or an MxModel object. The second method, referred to as the sequential method, is used when the sequential argument is TRUE. It creates data by following the chain of equations in structural equation modeling: independent variables and errors are generated and combined to form the dependent variables, which are then treated as independent variables in the next equation. For example, in a model where factors A and B are independent variables and factor C is a dependent variable, factors A and B are generated first. Then the residual of factor C is created and added to the contributions of factors A and B, which yields all factor scores. Finally, measurement errors are created and added to the factor scores to create the indicator scores. At each step, independent variables and errors can be nonnormal by setting the facDist or errorDist arguments. The data generation in each step is based on the createData function.
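The sequential method described above can be sketched in base R for the A, B, C example. This is only a minimal illustration with hypothetical path coefficients, loadings, and variable names, not the simsem implementation.

```r
set.seed(42)
n <- 500

# Step 1: generate the independent factors A and B (correlation 0.3, hypothetical)
r_ab <- 0.3
A <- rnorm(n)
B <- r_ab * A + sqrt(1 - r_ab^2) * rnorm(n)

# Step 2: create the residual of factor C and add it to the weighted A and B
b_a <- 0.4
b_b <- 0.3
res_var_c <- 1 - (b_a^2 + b_b^2 + 2 * b_a * b_b * r_ab)  # keeps Var(C) at 1
res_c <- rnorm(n, sd = sqrt(res_var_c))
C <- b_a * A + b_b * B + res_c

# Step 3: create measurement errors and add them to the weighted factor scores
factors <- cbind(A, B, C)
lambda <- matrix(0, nrow = 9, ncol = 3)
lambda[1:3, 1] <- 0.7
lambda[4:6, 2] <- 0.7
lambda[7:9, 3] <- 0.7
errors <- matrix(rnorm(n * 9, sd = sqrt(1 - 0.7^2)), nrow = n, ncol = 9)
indicators <- factors %*% t(lambda) + errors

dat <- as.data.frame(indicators)
names(dat) <- paste0("y", 1:9)
head(dat)
```

Replacing any of the rnorm calls with a draw from another distribution, rescaled to the same mean and variance, gives nonnormal factors or errors, which is the role the facDist and errorDist arguments play in generate.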

For the model-based bootstrap (providing the realData argument), the transformation proposed by Yung & Bentler (1996) is used. This procedure extends the Bollen and Stine (1992) bootstrap to include a mean structure. The model-implied mean vector and covariance matrix with trivial misspecification will be used in the model-based bootstrap if misspec is specified. See page 133 of Bollen and Stine (1992) for a reference.
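The idea behind this transformation can be sketched in base R: the real data are centered, rescaled so that their sample covariance matrix equals the model-implied one, and shifted to the model-implied means before rows are resampled. The data, the model-implied moments, and the helper mat_power below are hypothetical, and simsem's own code may differ in detail.

```r
set.seed(7)
n <- 100

# Hypothetical "real" data with three correlated variables
true_cov <- matrix(c(1.0, 0.6, 0.3,
                     0.6, 1.0, 0.2,
                     0.3, 0.2, 1.0), nrow = 3)
Y <- matrix(rnorm(n * 3), nrow = n) %*% chol(true_cov)

# Helper (hypothetical): symmetric matrix power via the eigendecomposition
mat_power <- function(m, power) {
  e <- eigen(m, symmetric = TRUE)
  e$vectors %*% diag(e$values^power, nrow = length(e$values)) %*% t(e$vectors)
}

# Hypothetical model-implied moments
sigma_hat <- matrix(0.4, nrow = 3, ncol = 3)
diag(sigma_hat) <- 1
mu_hat <- c(0, 0, 0)

# Transform: center, rescale by S^(-1/2) Sigma^(1/2), then add the implied means
S <- cov(Y)
Y_centered <- sweep(Y, 2, colMeans(Y))
Z <- Y_centered %*% mat_power(S, -0.5) %*% mat_power(sigma_hat, 0.5)
Z <- sweep(Z, 2, mu_hat, "+")

# One bootstrap sample: rows of Z drawn with replacement
boot_sample <- Z[sample(n, n, replace = TRUE), ]
```

After the transformation, the model holds exactly in Z (its sample mean vector and covariance matrix equal mu_hat and sigma_hat), while the distributional shape of the real data carries over to the resampled rows.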

Value

A data.frame containing simulated data from the data generation template. A variable "group" is appended indicating group membership.

Author(s)

Sunthud Pornprasertmanit (psunthud@gmail.com), Patrick Miller (University of Notre Dame; pmille13@nd.edu). The data generation code for lavaan scripts is modified from the simulateData function in lavaan, written by Yves Rosseel.

References

Bollen, K. A., & Stine, R. A. (1992). Bootstrapping goodness-of-fit measures in structural equation models. Sociological Methods and Research, 21, 205-229.

Yung, Y.-F., & Bentler, P. M. (1996). Bootstrapping techniques in analysis of mean and covariance structures. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced structural equation modeling: Issues and techniques (pp. 195-226). Mahwah, NJ: Erlbaum.

See Also

Examples

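library(simsem)

# Factor loadings: NA marks a free parameter, given the population value 0.7
loading <- matrix(0, 6, 2)
loading[1:3, 1] <- NA
loading[4:6, 2] <- NA
LY <- bind(loading, 0.7)

# Factor correlation matrix: free off-diagonal element with population value 0.5
latent.cor <- matrix(NA, 2, 2)
diag(latent.cor) <- 1
RPS <- binds(latent.cor, 0.5)

# Error correlation matrix: identity, so measurement errors are uncorrelated
RTE <- binds(diag(6))

# Indicator variances; note that VY is not passed to model() below
VY <- bind(rep(NA,6),2)

CFA.Model <- model(LY = LY, RPS = RPS, RTE = RTE, modelType = "CFA")

# Generate one data set with 200 observations
dat <- generate(CFA.Model, 200)

# Get the latent variable scores (requires sequential = TRUE)
dat2 <- generate(CFA.Model, 20, sequential = TRUE, saveLatentVar = TRUE)
dat2
attr(dat2, "latentVar")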

Example output

Loading required package: lavaan
This is lavaan 0.5-23.1097
lavaan is BETA software! Please report any bugs.
 
###############################################################################################
This is simsem 0.5-13
simsem is BETA software! Please report any bugs.
simsem was developed at the University of Kansas Center for Research Methods and Data Analysis.
###############################################################################################
           y1          y2          y3           y4          y5          y6
1   0.8443860 -0.73575888 -0.73049155 -1.200449858  0.53226963 -0.29834750
2   0.5448885  0.11691534  1.31259779 -0.267795822 -0.05293283 -0.17357599
3  -0.8357018 -0.20941557 -0.63255949 -1.168607081 -0.80676273 -1.28979924
4  -0.1516895 -0.59599886 -1.02605144 -0.059276832  0.29099914  0.02404334
5  -0.6028919 -0.30593167 -0.47519600 -0.660720675  0.83312074 -0.06021502
6  -0.3058733 -0.74323399  1.07746834  1.443246841 -0.13732781  0.82590154
7   0.8864821 -0.39862202  1.26554421  0.679333323  0.10121993 -0.19202708
8   1.5096398  1.85807774  0.93868659  1.184593625  1.85304798 -1.35917558
9   1.4004187  0.82586548  0.03823611 -0.003962473 -0.09591234  0.33825241
10  0.8300961  0.01254225  2.77341025  1.224423735  0.57779402  0.87718097
11  0.9037436  2.29480798  1.89360820  2.171149946  1.34516626  1.03522749
12  0.2598985  0.51871655  1.16702138  0.096657083  2.23043410  1.59915258
13 -1.4359903 -1.19405821 -0.55134840 -0.730804967 -1.80206840 -0.88879261
14 -0.6084429 -0.24389084 -0.67142648 -0.597268054 -0.54016665  0.30720467
15 -0.3778234 -0.65945636 -1.16571211 -0.230542997 -0.53240041 -1.13940806
16 -0.8144979  0.74068535  0.18266126 -0.353206235 -0.71114091  1.33636206
17  0.7678043  0.56047148 -1.01164619  0.711098086  0.79303040 -0.74968518
18 -1.3141082 -1.71305999 -1.73330847 -1.248592166 -0.31847096 -0.56585634
19 -0.2123511  0.71444503  0.95419641 -0.906148498 -1.25470731 -0.03813275
20 -0.4949373  0.72654696  0.11521864  0.091020012  1.16755220  1.13654152
            f1          f2      res_y1      res_y2      res_y3      res_y4
1  -0.06574325  0.16097429  0.89040627 -0.68973861 -0.68447127 -1.31313186
2   0.35507556 -0.89181744  0.29633558 -0.13163755  1.06404490  0.35647639
3  -0.69684699 -1.41063985 -0.34790890  0.27837732 -0.14476659 -0.18115918
4  -1.38599918 -0.37631154  0.81850988  0.37420056 -0.05585201  0.20414124
5  -0.51160857  0.08312808 -0.24476594  0.05219432 -0.11707000 -0.71891033
6  -0.28845164  1.90385510 -0.10395714 -0.54131784  1.27938449  0.11054827
7   0.76531405  0.45768127  0.35076222 -0.93434186  0.72982438  0.35895643
8   0.94318918  0.75151050  0.84940739  1.19784531  0.27845416  0.65853628
9   0.52496572  0.08846552  1.03294268  0.45838948 -0.32923989 -0.06588833
10  1.21124498  0.86985977 -0.01777542 -0.83532924  1.92553876  0.61552190
11  2.41594299  2.56932085 -0.78741647  0.60364788  0.20244811  0.37262535
12  1.59551467  1.95976276 -0.85696174 -0.59814372  0.05016110 -1.27517685
13 -1.56798690 -1.94794593 -0.33839948 -0.09646738  0.54624243  0.63275718
14 -1.46751131 -0.71342058  0.41881500  0.78336708  0.35583143 -0.09787365
15 -1.44301090 -1.04221725  0.63228421  0.35065127 -0.15560448  0.49900908
16  0.44458692  0.16172445 -1.12570872  0.42947451 -0.12854958 -0.46641335
17 -0.38815244  0.69637008  1.03951102  0.83217819 -0.73993948  0.22363903
18 -1.36798049 -0.88484576 -0.35652181 -0.75547364 -0.77572213 -0.62920013
19 -0.17209478 -0.25057454 -0.09188480  0.83491138  1.07466275 -0.73074632
20  1.06626634 -0.31464383 -1.24132372 -0.01983948 -0.63116780  0.31127069
        res_y5      res_y6
1   0.41958762 -0.41102950
2   0.57133938  0.45069622
3   0.18068516 -0.30235134
4   0.55441721  0.28746142
5   0.77493109 -0.11840467
6  -1.47002638 -0.50679703
7  -0.21915696 -0.51240397
8   1.32699063 -1.88523292
9  -0.15783820  0.27632655
10 -0.03110782  0.26827913
11 -0.45335833 -0.76329711
12  0.85860016  0.22731864
13 -0.43850625  0.47476954
14 -0.04077225  0.80659908
15  0.19715167 -0.40985599
16 -0.82434803  1.22315494
17  0.30557135 -1.23714424
18  0.30092107  0.05353569
19 -1.07930513  0.13726942
20  1.38780288  1.35679220

simsem documentation built on March 29, 2021, 1:07 a.m.
