regsim: Fit and appraise regression models using simulated data
In mcbeem/regsim: Fit and Appraise Regression Models Using Simulated Data


This function provides a quick and convenient means for simulating data from a path diagram or SEM, fitting a regression model to the simulated data, and appraising the model's performance with respect to a particular target parameter and its standard error. The function calculates the bias, coverage rate, RMSE, and analytical and empirical standard errors.

regsim(reps, n, true.model, fit.model, targetparm, targetval, interval = 0.95, standardized = FALSE, ...)

`reps`	the number of repetitions to perform
`n`	the sample size of the generated data
`true.model`	character; the population model used to generate the data, specified in `lavaan` syntax. See the example.
`fit.model`	character; the model to be fitted to the data, in `lm()` model formula syntax
`targetparm`	character; the focal predictor. Must be included in the model formula passed to `true.model`
`targetval`	the true value of the path coefficient relating the focal predictor to the response variable
`interval`	the confidence level of the interval; defaults to 0.95
`standardized`	logical; should the data be generated on a standardized scale? defaults to FALSE
`...`	additional arguments passed to lavaan's `simulateData` function. For example, one could invoke the `kurtosis` or `skewness` arguments to generate non-normal data
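
To illustrate the kind of argument that `...` forwards, the following sketch calls lavaan's `simulateData` directly with a nonzero `kurtosis`. It assumes the `lavaan` package is installed; the kurtosis value of 3, the sample size, the seed, and the object name `sim` are arbitrary choices made for illustration, not values used by `regsim`.

```r
library(lavaan)

# Population model in lavaan syntax: Z confounds the X -> Y relationship
true.model <- 'X ~ .5*Z
               Y ~ .5*X + .5*Z'

# One draw of non-normal data. kurtosis is supplied per observed variable
# (three here: X, Y, Z); the value 3 is an arbitrary illustrative choice.
sim <- simulateData(true.model, sample.nobs = 200,
                    kurtosis = c(3, 3, 3), seed = 123)

head(sim)
```

According to the description of `...` above, the same `kurtosis` argument could be supplied in a `regsim()` call and would be passed through to `simulateData`.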

A list containing the following:

`true.model`	the true model from which data are simulated
`fit.model`	the regression model that is fit to the data
`b`	a vector of target parameter estimates across the repetitions
`data`	a data frame containing the simulated data from the first repetition
`true.DAG`	the true.model expressed as an object of class dagitty
`targetval`	the true value of the target parameter, used to calculate bias, RMSE, and coverage
`expected.b`	the mean value of the target parameter across the repetitions
`bias`	the mean difference between the estimated and target values of the parameter across repetitions
`analytic.SE`	the mean of the analytic standard errors across repetitions
`empirical.SE`	the standard deviation of the parameter estimates
`analytic.CI`	the analytic confidence interval boundaries, calculated as the mean of the lower and upper boundaries across the repetitions
`empirical.CI`	the empirical confidence interval boundaries across the repetitions
`coverage`	the proportion of repetitions in which the confidence interval captured the specified target value of the parameter. This should approximately equal the specified confidence level
`RMSE`	the root mean squared error
`adjustment.sets`	the sets of covariates that must be included in the fitted model to yield an unbiased and consistent estimate of the effect of the target parameter on the response variable in fit.model
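
The performance measures listed above can be sketched in base R. This is not the package's implementation: it generates data with `rnorm()` rather than lavaan, assumes unit residual variances, and uses names (`est`, `se`, `perf`) introduced only for illustration. It follows the definitions given in the list: bias is the mean of estimate minus target, the empirical SE is the standard deviation of the estimates, and coverage is the share of intervals containing the target.

```r
set.seed(1)  # reproducibility of this sketch only

reps <- 500
n <- 100
targetval <- 0.5
level <- 0.95

est <- se <- lower <- upper <- numeric(reps)

for (i in seq_len(reps)) {
  # Population model: Z confounds X -> Y.
  # Unit residual variances are an assumption of this sketch.
  Z <- rnorm(n)
  X <- 0.5 * Z + rnorm(n)
  Y <- 0.5 * X + 0.5 * Z + rnorm(n)

  fit <- lm(Y ~ X)  # misspecified model: omits Z

  est[i] <- coef(fit)[["X"]]
  se[i] <- summary(fit)$coefficients["X", "Std. Error"]
  ci <- confint(fit, "X", level = level)
  lower[i] <- ci[1]
  upper[i] <- ci[2]
}

perf <- list(
  expected.b   = mean(est),                          # mean estimate
  bias         = mean(est - targetval),              # mean error
  analytic.SE  = mean(se),                           # mean model-based SE
  empirical.SE = sd(est),                            # SD of the estimates
  RMSE         = sqrt(mean((est - targetval)^2)),    # root mean squared error
  coverage     = mean(lower <= targetval & targetval <= upper)
)

str(perf)
```

Under these assumptions the population slope of Y on X alone is 0.5 + 0.5 * cov(X, Z) / var(X) = 0.5 + 0.5 * 0.5 / 1.25 = 0.7, so the bias should be close to 0.2 and the coverage well below the nominal level.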

library(regsim)

# this is a DAG where Z is a confounder of the X -> Y relationship
true.model <- 'X ~ .5*Z
               Y ~ .5*X + .5*Z'

# misspecified model omitting Z
fit.model <- 'Y ~ X'

result <- regsim(reps=1000, n=100, true.model, fit.model, targetparm="X",
                 targetval=.5)
result

# visualize the model
plot(result, type="model")
# visualize the simulated data
plot(result, type="data")
# visualize the results
plot(result, type="perf")
plot(result, type="compareSE")
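
As a point of comparison with the misspecified fit, the sketch below reruns the simulation with Z included in the fitted model. It uses only the arguments and return elements documented on this page and assumes the `regsim` package is installed; the names `fit.adjusted` and `result.adjusted` are introduced here for illustration.

```r
library(regsim)

# Same population model as in the example: Z confounds X -> Y
true.model <- 'X ~ .5*Z
               Y ~ .5*X + .5*Z'

# Fitted model that adjusts for the confounder Z
fit.adjusted <- 'Y ~ X + Z'

result.adjusted <- regsim(reps=1000, n=100, true.model, fit.adjusted,
                          targetparm="X", targetval=.5)

# Inspect the documented performance measures
result.adjusted$bias
result.adjusted$coverage
result.adjusted$adjustment.sets
```

Because the fitted model now conditions on the confounder, the bias would be expected to sit near zero and the coverage near the nominal 0.95, in contrast to the model that omits Z.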