miss: Specifying the missing template to impose on a dataset
In simsem: SIMulated Structural Equation Modeling


Specifies the missing-data template (a SimMissing object) to impose on a dataset. In a Monte Carlo simulation run with the sim function, each generated dataset has missing values imposed on it according to this template. See imposeMissing for further details of each argument.

miss(cov = 0, pmMCAR = 0, pmMAR = 0, logit = "", nforms = 0, itemGroups = list(),
     timePoints = 1, twoMethod = 0, prAttr = 0, m = 0,
     package = "default", convergentCutoff = 0.8, ignoreCols = 0,
     threshold = 0, covAsAux = TRUE, logical = NULL, ...)

`cov`	Column indices of any normally distributed covariates used in the data set.
`pmMCAR`	Decimal percent of missingness to introduce completely at random on all variables.
`pmMAR`	Decimal percent of missingness to introduce using the listed covariates as predictors.
`logit`	The script used to impose missing values by logistic regression. The syntax resembles a regression specification in `lavaan`: each line begins with a dependent variable, followed by '~' as the regression sign and then a linear combination of independent variables plus a constant, such as y1 ~ 0.5 + 0.2*y2. '#' and '!' start a comment (as in `lavaan`). For the intercept, users may write 'p()' to specify the average proportion of missing values, such as y1 ~ p(0.2) + 0.3*y2, in which the average missing proportion of y1 is 0.2 and the missingness of y1 depends on y2. Users can visualize the missing proportion implied by the logistic specification with the `plotLogitMiss` function.
`nforms`	The number of forms for planned missing data designs, not including the shared form.
`itemGroups`	List of lists of item groupings for planned missing data forms. If this is not supplied, items are divided into groups sequentially (e.g. 1-3, 4-6, 7-9, 10-12).
`timePoints`	Number of timepoints items were measured over. For longitudinal data, planned missing designs will be implemented within each timepoint.
`twoMethod`	With missing values on one variable: a vector of (column index, percent missing), which puts the given percent missing on that column to simulate a two-method planned missing data research design. With missing values on two or more variables: a list of (column indices, percent missing).
`prAttr`	Probability (or vector of probabilities) of an entire case being removed due to attrition at a given time point. See `imposeMissing` for further details.
`m`	The number of imputations. The default is 0, in which case full information maximum likelihood (FIML) is used instead of imputation.
`package`	The package to be used for multiple imputation. The possible inputs are `"default"`, `"Amelia"`, or `"mice"`. The default is `"default"`: if `m` is 0, full information maximum likelihood is used; if `m` is greater than 0, the `"mice"` package is used.
`convergentCutoff`	If the proportion of convergent results across imputations is greater than the specified value (the default is 0.8, i.e. 80%), the analysis of the dataset is considered convergent. Otherwise, the analysis is considered nonconvergent. This argument applies to multiple imputation only.
`ignoreCols`	The columns on which no missing values are imposed under any missing data pattern.
`threshold`	The threshold on the covariates that separates the region where missing values are imposed from the region where they are not. The default threshold is the mean of the covariate.
`covAsAux`	If `TRUE`, the covariates listed in the object will be used as auxiliary variables when the data are analyzed with the model object. If `FALSE`, the covariates will be included in the analysis.
`logical`	A matrix of logical values (`TRUE/FALSE`). If a value in the dataset corresponds to a `TRUE` in the logical matrix, that value will be set to missing.
`...`	Additional arguments used in multiple imputation function.
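
To make the `logit` specification more concrete, the following base-R sketch shows how a line such as y1 ~ 0.05 + 0.1*y2 + 0.3*y3 can be read as a logistic regression for the probability that y1 is missing. It illustrates the idea only and is not taken from simsem's internal code; the data frame `toy` and the names `eta`, `p_miss`, and `is_missing` are hypothetical.

```r
set.seed(123)

# Hypothetical toy data with three standard-normal variables
toy <- data.frame(y1 = rnorm(200), y2 = rnorm(200), y3 = rnorm(200))

# Linear predictor corresponding to: y1 ~ 0.05 + 0.1*y2 + 0.3*y3
eta <- 0.05 + 0.1 * toy$y2 + 0.3 * toy$y3

# The inverse logit turns the linear predictor into a missingness probability
p_miss <- plogis(eta)

# Draw a missingness indicator for each case and blank out y1 accordingly
is_missing <- runif(nrow(toy)) < p_miss
toy$y1[is_missing] <- NA

mean(is.na(toy$y1))  # observed proportion of missing y1
```

Because the intercept 0.05 is close to zero on the logit scale, roughly half of the y1 values are expected to be missing in this sketch, which is one reason the 'p()' shorthand for the intercept can be more convenient than a raw constant.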

A missing object that contains the missing-data template (SimMissing).

Alexander M. Schoemann (East Carolina University; schoemanna@ecu.edu), Patrick Miller (University of Notre Dame; pmille13@nd.edu), Sunthud Pornprasertmanit (psunthud@gmail.com)

`SimMissing` for the resulting missing object.

library(simsem)

#Example of imposing 10% MCAR missingness on all variables with no imputation (FIML method)
Missing <- miss(pmMCAR=0.1, ignoreCols="group")
summary(Missing)

loading <- matrix(0, 6, 1)
loading[1:6, 1] <- NA
LY <- bind(loading, 0.7)
RPS <- binds(diag(1))
RTE <- binds(diag(6))
CFA.Model <- model(LY = LY, RPS = RPS, RTE = RTE, modelType="CFA")

#Create data
dat <- generate(CFA.Model, n = 20)

#Impose missing
datmiss <- impose(Missing, dat)

#Analyze data
out <- analyze(CFA.Model, datmiss)
summary(out)
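
For intuition about what `pmMCAR = 0.1` requests, here is a base-R sketch of MCAR missingness expressed as a TRUE/FALSE mask, in the same spirit as the `logical` argument (a `TRUE` marks a value to delete). It is an illustration under simple assumptions, not simsem's implementation; `toy`, `mask`, and `toy_miss` are hypothetical names.

```r
set.seed(1)

# Hypothetical complete data: 20 cases on 6 indicators
toy <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6,
              dimnames = list(NULL, paste0("y", 1:6)))

# Each cell is flagged independently with probability 0.1 (MCAR)
mask <- matrix(runif(length(toy)) < 0.1, nrow = nrow(toy), ncol = ncol(toy))

toy_miss <- toy
toy_miss[mask] <- NA  # TRUE in the mask means the value becomes missing

colMeans(is.na(toy_miss))  # per-variable proportion of missing values
```

With only 20 cases, the realized proportions can differ noticeably from 0.1 in any single dataset.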

#Missing using logistic regression
script <- 'y1 ~ 0.05 + 0.1*y2 + 0.3*y3
	y4 ~ -2 + 0.1*y4
	y5 ~ -0.5'
Missing2 <- miss(logit=script, pmMCAR=0.1, ignoreCols="group")
summary(Missing2)
datmiss2 <- impose(Missing2, dat)

#Missing using logistic regression (2)
script <- 'y1 ~ 0.05 + 0.5*y3
	y2 ~ p(0.2)
	y3 ~ p(0.1) + -1*y1
	y4 ~ p(0.3) + 0.2*y1 + -0.3*y2
	y5 ~ -0.5'
Missing2 <- miss(logit=script)
summary(Missing2)
datmiss2 <- impose(Missing2, dat)
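
The 'p()' shorthand used above asks for an intercept that yields a target average missing proportion. One way to picture this, offered as an assumption about the idea rather than a description of simsem's internals, is to solve numerically for the intercept in base R. The helper `find_intercept` and the simulated predictor `y1` are hypothetical.

```r
set.seed(42)
y1 <- rnorm(5000)  # hypothetical predictor

# Find the intercept b0 such that mean(plogis(b0 + slope * x)) equals target
find_intercept <- function(target, slope, x) {
  f <- function(b0) mean(plogis(b0 + slope * x)) - target
  uniroot(f, interval = c(-20, 20), tol = 1e-10)$root
}

# Analogue of the script line: y3 ~ p(0.1) + -1*y1
b0 <- find_intercept(target = 0.1, slope = -1, x = y1)

mean(plogis(b0 + -1 * y1))  # close to 0.1 by construction
```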

#Example to create a SimMissing object for a 3-form design at 3 timepoints with 10 imputations
Missing <- miss(nforms=3, timePoints=3, m=10)
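
As a rough picture of a three-form planned missing design with sequential item groups, the base-R sketch below builds the pattern of planned-missing cells for 12 items. It assumes that the first block is the shared one and that each form omits exactly one of the remaining blocks; simsem's actual assignment of cases and blocks may differ. All object names are hypothetical.

```r
n_items <- 12
nforms  <- 3

# Sequential grouping into nforms + 1 blocks: 1-3, 4-6, 7-9, 10-12
groups <- split(seq_len(n_items),
                rep(seq_len(nforms + 1), each = n_items / (nforms + 1)))

# Assumption: block 1 is shared by all forms; form k omits block k + 1
n    <- 9
form <- rep(seq_len(nforms), length.out = n)  # assign cases to forms in rotation
mask <- matrix(FALSE, nrow = n, ncol = n_items)
for (k in seq_len(nforms)) {
  mask[form == k, groups[[k + 1]]] <- TRUE    # TRUE marks planned-missing cells
}

rowSums(mask)  # each case is missing exactly one block of 3 items
```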

#Missing template for data analysis with multiple imputation
Missing <- miss(package="mice", m=10, convergentCutoff=0.6)
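
The `convergentCutoff` rule can be illustrated with a few lines of base R. The convergence flags below are made up for illustration and do not come from any real analysis; `converged`, `prop_converged`, and `is_convergent` are hypothetical names.

```r
# Hypothetical convergence flags for m = 10 imputed datasets
converged <- c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)
convergentCutoff <- 0.6

prop_converged <- mean(converged)  # 7 of 10 imputations converged
is_convergent  <- prop_converged > convergentCutoff

is_convergent
```

In this sketch the analysis counts as convergent under a cutoff of 0.6, whereas the same flags would count as nonconvergent under the default cutoff of 0.8.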

Loading required package: lavaan
This is lavaan 0.6-3
lavaan is BETA software! Please report any bugs.
 
#################################################################
This is simsem 0.5-14
simsem is BETA software! Please report any bugs.
simsem was first developed at the University of Kansas Center for
Research Methods and Data Analysis, under NSF Grant 1053160.
#################################################################

Attaching package: 'simsem'

The following object is masked from 'package:lavaan':

    inspect

MISSING OBJECT
The method of missing data handling: Maximum Likelihood 
Covariates: none 
Ignored Variables: group 
Proportion of MCAR: 0.1 
Warning message:
In (function (model = NULL, data = NULL, ordered = NULL, sampling.weights = NULL,  :
  lavaan WARNING: the optimizer warns that a solution has NOT been found!
lavaan 0.6-3 did NOT end normally after 10000 iterations
** WARNING ** Estimates below are most likely unreliable

  Optimization method                           NLMINB
  Number of free parameters                         18

  Number of observations                            20
  Number of missing patterns                         7

  Estimator                                         ML
  Model Fit Test Statistic                          NA
  Degrees of freedom                                NA
  P-value                                           NA

Parameter Estimates:

  Information                                 Observed
  Observed information based on                Hessian
  Standard Errors                             Standard

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  f1 =~                                               
    y1                0.006       NA                  
    y2                0.002       NA                  
    y3                0.006       NA                  
    y4               29.492       NA                  
    y5               -0.003       NA                  
    y6                0.013       NA                  

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
 .y1 ~~                                               
   .y2                0.000                           
   .y3                0.000                           
   .y4                0.000                           
   .y5                0.000                           
   .y6                0.000                           
 .y2 ~~                                               
   .y3                0.000                           
   .y4                0.000                           
   .y5                0.000                           
   .y6                0.000                           
 .y3 ~~                                               
   .y4                0.000                           
   .y5                0.000                           
   .y6                0.000                           
 .y4 ~~                                               
   .y5                0.000                           
   .y6                0.000                           
 .y5 ~~                                               
   .y6                0.000                           

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)
    f1                0.000                           
   .y1                0.175       NA                  
   .y2                0.305       NA                  
   .y3               -0.150       NA                  
   .y4                0.063       NA                  
   .y5               -0.158       NA                  
   .y6               -0.021       NA                  

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
    f1                1.000                           
   .y1                0.500       NA                  
   .y2                0.747       NA                  
   .y3                0.828       NA                  
   .y4             -869.256       NA                  
   .y5                0.488       NA                  
   .y6                0.677       NA                  

MISSING OBJECT
The method of missing data handling: Maximum Likelihood 
Covariates: none 
Ignored Variables: group 
Proportion of MCAR: 0.1 
Logistic-regression MAR:
y1 ~ 0.05 + 0.1*y2 + 0.3*y3
	y4 ~ -2 + 0.1*y4
	y5 ~ -0.5 

MISSING OBJECT
The method of missing data handling: Maximum Likelihood 
Covariates: none 
Ignored Variables: none 
Logistic-regression MAR:
y1 ~ 0.05 + 0.5*y3
	y2 ~ p(0.2)
	y3 ~ p(0.1) + -1*y1
	y4 ~ p(0.3) + 0.2*y1 + -0.3*y2
	y5 ~ -0.5