miss: Specifying the missing template to impose on a dataset
In simsem: SIMulated Structural Equation Modeling

Description

Specifies the missing-data template (a SimMissing object) to impose on a dataset. The template is used in Monte Carlo simulation: in the sim function, each dataset is generated and then has missing values imposed according to this template. See imposeMissing for further details of each argument.
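The generate-then-impose cycle described above can be sketched in base R. This is only an illustration of the idea, not simsem's implementation: impose_mcar is a hypothetical helper, and the rule that each cell is deleted independently with probability pm is an assumption made for the sketch.

```r
# Hypothetical helper, not part of simsem: each cell outside `ignore`
# is set to NA independently with probability `pm` (an assumed MCAR rule).
impose_mcar <- function(dat, pm, ignore = character(0)) {
  target <- setdiff(names(dat), ignore)
  for (v in target) {
    drop <- runif(nrow(dat)) < pm
    dat[[v]][drop] <- NA
  }
  dat
}

set.seed(123)
n_rep <- 200
miss_rate <- replicate(n_rep, {
  # Step 1: create a complete dataset (three normal items plus a group label)
  dat <- data.frame(y1 = rnorm(100), y2 = rnorm(100), y3 = rnorm(100),
                    group = rep(1:2, each = 50))
  # Step 2: impose the same missing-data rule on every replication
  datmiss <- impose_mcar(dat, pm = 0.1, ignore = "group")
  colMeans(is.na(datmiss))
})

# Average proportion missing per column across replications
avg_miss_rate <- rowMeans(miss_rate)
print(round(avg_miss_rate, 3))
```

In simsem itself the template returned by miss plays the role of the fixed rule, so the same missingness specification is reused for every simulated dataset.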

Usage

miss(cov = 0, pmMCAR = 0, pmMAR = 0, logit = "", nforms = 0,
     itemGroups = list(), timePoints = 1, twoMethod = 0, prAttr = 0,
     m = 0, package = "default", convergentCutoff = 0.8, ignoreCols = 0,
     threshold = 0, covAsAux = TRUE, logical = NULL, ...)

Arguments

cov: Column indices of any normally distributed covariates used in the data set.

pmMCAR: Decimal percent of missingness to introduce completely at random on all variables.

pmMAR: Decimal percent of missingness to introduce using the listed covariates as predictors.

logit: The script used for imposing missing values by logistic regression. The script is similar to a regression specification in lavaan: each line begins with a dependent variable, followed by '~' as the regression sign and then a linear combination of independent variables plus a constant, such as y1 ~ 0.5 + 0.2*y2. '#' and '!' can be used to start a comment (as in lavaan). For the intercept, users may use 'p()' to specify the average proportion of missing values, such as y1 ~ p(0.2) + 0.3*y2, in which the average missing proportion of y1 is 0.2 and the missingness of y1 depends on y2. Users may visualize the missing proportion implied by the logistic specification with the plotLogitMiss function.

nforms: The number of forms for planned missing data designs, not including the shared form.

itemGroups: List of lists of item groupings for planned missing data forms. Without this, items will be divided into groups sequentially (e.g. 1-3, 4-6, 7-9, 10-12).

timePoints: Number of time points over which items were measured. For longitudinal data, planned missing designs will be implemented within each time point.

twoMethod: With missing values on one variable: a vector of (column index, percent missing). This puts the given percent of missing values on that column to simulate a two-method planned missing data research design. With missing values on two or more variables: a list of (column indices, percent missing).

prAttr: Probability (or vector of probabilities) of an entire case being removed due to attrition at a given time point. See imposeMissing for further details.

m: The number of imputations. The default is 0, in which case full information maximum likelihood is used.

package: The package to be used in multiple imputation. The default value of this argument is "default". With the default option, if m is 0, full information maximum likelihood is used; if m is greater than 0, the "mice" package is used. The possible inputs are "default", "Amelia", or "mice".

convergentCutoff: If the proportion of convergent results across imputations is greater than the specified value (the default is 0.8, i.e. 80%), the analysis of the dataset is considered convergent. Otherwise, the analysis is considered nonconvergent. This argument applies to multiple imputation only.

ignoreCols: The columns on which no missing values are imposed under any missing data pattern.

threshold: The threshold on the covariates that separates the region where missing values are imposed from the region where they are not. The default threshold is the mean of the covariate.

covAsAux: If TRUE, the covariates listed in the object will be used as auxiliary variables when put into the model object. If FALSE, the covariates will be included in the analysis.

logical: A matrix of logical values (TRUE/FALSE). If a value in the dataset corresponds to TRUE in the logical matrix, the value will be set to missing.

...: Additional arguments passed to the multiple imputation function.
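To illustrate what a logit script line expresses, the following base-R sketch treats the right-hand side as the log-odds of a value being missing and, for p(), searches for the intercept that gives the requested average missing proportion. Both readings are assumptions made for illustration; the variable names and the use of uniroot are not taken from simsem, whose internal method may differ.

```r
set.seed(1)
y2 <- rnorm(10000)

# Assumed reading of 'y1 ~ 0.5 + 0.2*y2':
# the right-hand side is the log-odds that y1 is missing.
prob_fixed <- plogis(0.5 + 0.2 * y2)

# Assumed reading of 'y1 ~ p(0.2) + 0.3*y2':
# choose the intercept so that the mean missing probability is 0.2.
target <- 0.2
slope <- 0.3
gap <- function(b0) mean(plogis(b0 + slope * y2)) - target
intercept <- uniroot(gap, interval = c(-10, 10))$root
prob_p <- plogis(intercept + slope * y2)

# Draw the missingness indicator for y1 from those probabilities
y1 <- rnorm(10000)
y1[runif(length(y1)) < prob_p] <- NA

print(mean(prob_fixed))   # average missing probability under the fixed intercept
print(intercept)          # intercept implied by the target proportion
print(mean(is.na(y1)))    # realized proportion of missing y1
```

Under this reading, a plain numeric intercept fixes the log-odds at the predictor's zero point, while p() fixes the overall proportion and lets the intercept follow from it.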

Value

A missing-data object (SimMissing) that contains the missing-data template.

Author(s)

Alexander M. Schoemann (East Carolina University; schoemanna@ecu.edu), Patrick Miller (University of Notre Dame; pmille13@nd.edu), Sunthud Pornprasertmanit (psunthud@gmail.com)

See Also

• SimMissing: The resulting missing object

Examples

library(simsem)

# Example of imposing 10% MCAR missing in all variables with no imputations (FIML method)
Missing <- miss(pmMCAR=0.1, ignoreCols="group")
summary(Missing)

# One-factor CFA with six indicators; all loadings are free with population value 0.7
loading <- matrix(0, 6, 1)
loading[1:6, 1] <- NA
LY <- bind(loading, 0.7)
RPS <- binds(diag(1))
RTE <- binds(diag(6))
CFA.Model <- model(LY = LY, RPS = RPS, RTE = RTE, modelType="CFA")

# Create data
dat <- generate(CFA.Model, n = 20)

# Impose missing
datmiss <- impose(Missing, dat)

# Analyze data
out <- analyze(CFA.Model, datmiss)
summary(out)

# Missing using logistic regression
script <- 'y1 ~ 0.05 + 0.1*y2 + 0.3*y3
	y4 ~ -2 + 0.1*y4
	y5 ~ -0.5'
Missing2 <- miss(logit=script, pmMCAR=0.1, ignoreCols="group")
summary(Missing2)
datmiss2 <- impose(Missing2, dat)

# Missing using logistic regression (2)
script <- 'y1 ~ 0.05 + 0.5*y3
	y2 ~ p(0.2)
	y3 ~ p(0.1) + -1*y1
	y4 ~ p(0.3) + 0.2*y1 + -0.3*y2
	y5 ~ -0.5'
Missing2 <- miss(logit=script)
summary(Missing2)
datmiss2 <- impose(Missing2, dat)

# Example to create a SimMissing object for a 3-forms design at 3 time points with 10 imputations
# (m is the argument that sets the number of imputations)
Missing <- miss(nforms=3, timePoints=3, m=10)

# Missing template for data analysis with multiple imputation
Missing <- miss(package="mice", m=10, convergentCutoff=0.6)
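The examples above do not exercise the logical argument. A minimal base-R sketch of the kind of matrix it describes follows: TRUE marks a cell to delete. The direct assignment is a stand-in for what such a template requests, not simsem's code, and the commented-out miss call is an assumed usage based only on the signature in Usage.

```r
set.seed(42)
dat <- data.frame(y1 = rnorm(6), y2 = rnorm(6), y3 = rnorm(6))

# Logical matrix with the same shape as the data: TRUE marks a value to delete
mask <- matrix(FALSE, nrow = nrow(dat), ncol = ncol(dat),
               dimnames = list(NULL, names(dat)))
mask[1:2, "y1"] <- TRUE
mask[5, c("y2", "y3")] <- TRUE

# Base-R stand-in for imposing the template on the data
datmiss <- dat
datmiss[mask] <- NA
print(is.na(datmiss))

# Assumed simsem usage (not run here):
# MissingLogical <- miss(logical = mask)
```

A fixed mask like this deletes exactly the same cells every time, which can be useful when a specific, known missing pattern has to be reproduced rather than sampled.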

Example output

This is lavaan 0.6-3
lavaan is BETA software! Please report any bugs.

#################################################################
This is simsem 0.5-14
simsem is BETA software! Please report any bugs.
simsem was first developed at the University of Kansas Center for
Research Methods and Data Analysis, under NSF Grant 1053160.
#################################################################

Attaching package: 'simsem'

The following object is masked from 'package:lavaan':

inspect

MISSING OBJECT
The method of missing data handling: Maximum Likelihood
Covariates: none
Ignored Variables: group
Proportion of MCAR: 0.1
Warning message:
In (function (model = NULL, data = NULL, ordered = NULL, sampling.weights = NULL,  :
lavaan WARNING: the optimizer warns that a solution has NOT been found!
lavaan 0.6-3 did NOT end normally after 10000 iterations
** WARNING ** Estimates below are most likely unreliable

Optimization method                           NLMINB
Number of free parameters                         18

Number of observations                            20
Number of missing patterns                         7

Estimator                                         ML
Model Fit Test Statistic                          NA
Degrees of freedom                                NA
P-value                                           NA

Parameter Estimates:

Information                                 Observed
Observed information based on                Hessian
Standard Errors                             Standard

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  f1 =~
    y1                0.006       NA
    y2                0.002       NA
    y3                0.006       NA
    y4               29.492       NA
    y5               -0.003       NA
    y6                0.013       NA

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
 .y1 ~~
   .y2                0.000
   .y3                0.000
   .y4                0.000
   .y5                0.000
   .y6                0.000
 .y2 ~~
   .y3                0.000
   .y4                0.000
   .y5                0.000
   .y6                0.000
 .y3 ~~
   .y4                0.000
   .y5                0.000
   .y6                0.000
 .y4 ~~
   .y5                0.000
   .y6                0.000
 .y5 ~~
   .y6                0.000

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)
    f1                0.000
   .y1                0.175       NA
   .y2                0.305       NA
   .y3               -0.150       NA
   .y4                0.063       NA
   .y5               -0.158       NA
   .y6               -0.021       NA

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
    f1                1.000
   .y1                0.500       NA
   .y2                0.747       NA
   .y3                0.828       NA
   .y4             -869.256       NA
   .y5                0.488       NA
   .y6                0.677       NA

MISSING OBJECT
The method of missing data handling: Maximum Likelihood
Covariates: none
Ignored Variables: group
Proportion of MCAR: 0.1
Logistic-regression MAR:
y1 ~ 0.05 + 0.1*y2 + 0.3*y3
y4 ~ -2 + 0.1*y4
y5 ~ -0.5
MISSING OBJECT
The method of missing data handling: Maximum Likelihood
Covariates: none
Ignored Variables: none
Logistic-regression MAR:
y1 ~ 0.05 + 0.5*y3
y2 ~ p(0.2)
y3 ~ p(0.1) + -1*y1
y4 ~ p(0.3) + 0.2*y1 + -0.3*y2
y5 ~ -0.5
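
The lavaan summary in the example output reports a "Number of missing patterns". Assuming this counts the distinct row-wise combinations of observed and missing variables (including the fully observed one), a quick base-R check on a dataset with imposed missing values might look like the sketch below. The simulated data and the 10% deletion rule are stand-ins for the output of impose, not simsem calls.

```r
set.seed(7)
dat <- as.data.frame(matrix(rnorm(20 * 6), nrow = 20,
                            dimnames = list(NULL, paste0("y", 1:6))))
# Stand-in for impose(): delete each cell with probability 0.1
dat[matrix(runif(20 * 6) < 0.1, nrow = 20)] <- NA

na_map <- is.na(dat)
# One string per row, e.g. "010000" means only y2 is missing
pattern <- apply(na_map, 1, function(r) paste(as.integer(r), collapse = ""))

pattern_counts <- table(pattern)
print(pattern_counts)          # rows per distinct pattern
print(length(pattern_counts))  # number of distinct patterns
print(colMeans(na_map))        # observed proportion missing per variable
```

With only 20 cases, as in the example, many patterns relative to the sample size leave little information per pattern, which is consistent with the nonconvergence warning shown in the output.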

simsem documentation built on March 29, 2021, 1:07 a.m.