lavOptions: lavaan Options

Description

Show the default options used by the lavaan() function. The options can be changed by passing 'name = value' arguments to the lavaan() function call, where they will be added to the '...' argument.
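As a minimal sketch of the 'name = value' mechanism, the call below passes two options through '...' and then reads back the options stored in the fitted object. The model syntax and the HolzingerSwineford1939 data set (shipped with lavaan) are chosen purely for illustration.

```r
library(lavaan)

# Three-factor CFA on the example data shipped with lavaan (illustrative model).
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# cfa() is a wrapper around lavaan(); extra 'name = value' arguments
# are collected in '...' and treated as options.
fit <- cfa(HS.model, data = HolzingerSwineford1939,
           std.lv = TRUE, meanstructure = TRUE)

# Inspect the options that were actually used for this fit.
opts <- lavInspect(fit, "options")
opts$std.lv
opts$meanstructure
```

Reading the options back from the fitted object is a convenient way to see how values left at "default" were resolved.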

Usage

lavOptions(x = NULL, default = NULL, mimic = "lavaan")

Arguments

x

Character. A character string holding an option name, or a character string vector holding multiple option names. All option names are converted to lower case.

default

If a single option is specified but not available, this value is returned.

mimic

Not used for now.
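The role of the default argument can be sketched as follows. The option name "no.such.option" is hypothetical and only serves to trigger the fallback; suppressWarnings() is used on the assumption that some lavaan versions also emit a warning for unknown names.

```r
library(lavaan)

# A known option: the result is returned under its option name.
known <- lavOptions("std.lv")
names(known)

# An unknown (hypothetical) option name: the value of 'default' is returned.
fallback <- suppressWarnings(lavOptions("no.such.option", default = NA))
fallback
```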

Details

This is the full list of options that are accepted by the lavaan() function, organized in several sections:

Model features (always available):

meanstructure:

If TRUE, the means of the observed variables enter the model. If "default", the value is set based on the user-specified model, and/or the values of other arguments.

int.ov.free:

If FALSE, the intercepts of the observed variables are fixed to zero.

int.lv.free:

If FALSE, the intercepts of the latent variables are fixed to zero.

conditional.x:

If TRUE, we set up the model conditional on the exogenous ‘x’ covariates; the model-implied sample statistics only include the non-x variables. If FALSE, the exogenous ‘x’ variables are modeled jointly with the other variables, and the model-implied statistics reflect both sets of variables. If "default", the value is set depending on the estimator, and whether or not the model involves categorical endogenous variables.

fixed.x:

If TRUE, the exogenous ‘x’ covariates are considered fixed variables and the means, variances and covariances of these variables are fixed to their sample values. If FALSE, they are considered random, and the means, variances and covariances are free parameters. If "default", the value is set depending on the mimic option.

orthogonal:

If TRUE, all covariances among latent variables are set to zero.

orthogonal.y:

If TRUE, all covariances among endogenous latent variables only are set to zero.

orthogonal.x:

If TRUE, all covariances among exogenous latent variables only are set to zero.

std.lv:

If TRUE, the metric of each latent variable is determined by fixing their (residual) variances to 1.0. If FALSE, the metric of each latent variable is determined by fixing the factor loading of the first indicator to 1.0. If there are multiple groups, std.lv = TRUE and "loadings" is included in the group.equal argument, then only the latent variances of the first group will be fixed to 1.0, while the latent variances of other groups are set free.

effect.coding:

Can be logical or character string. If logical and TRUE, this implies effect.coding = c("loadings", "intercepts"). If logical and FALSE, it is set equal to the empty string. If "loadings" is included, equality constraints are used so that the average of the factor loadings (per latent variable) equals 1. Note that this should not be used together with std.lv = TRUE. If "intercepts" is included, equality constraints are used so that the sum of the intercepts (belonging to the indicators of a single latent variable) equals zero. As a result, the latent mean will be freely estimated and usually equal the average of the means of the involved indicators.
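The effect-coding constraint on the loadings can be checked numerically. The sketch below fits an illustrative three-factor model with effect.coding = "loadings" and averages the estimated loadings per latent variable; the model and data set are assumptions made for the example.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Effect coding for the loadings only; std.lv is left at its default.
fit <- cfa(HS.model, data = HolzingerSwineford1939,
           effect.coding = "loadings")

# Average the estimated loadings within each latent variable.
pe <- parameterEstimates(fit)
loadings <- pe[pe$op == "=~", ]
avg.loading <- tapply(loadings$est, loadings$lhs, mean)
avg.loading  # each entry should be numerically close to 1
```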

ceq.simple:

Logical. If TRUE, and no other general (equality or inequality) constraints are used in the model, simple equality constraints are represented in the parameter table as duplicated free parameters (instead of extra rows with op = "==").

parameterization:

Currently only used if data is categorical. If "delta", the delta parameterization is used. If "theta", the theta parameterization is used.

correlation:

Only used for (single-level) continuous data. If TRUE, analyze a correlation matrix (instead of a (co)variance matrix). This implies that the residual observed variances are no longer free parameters. Instead, they are set to values to ensure the model-implied variances are unity. This also affects the standard errors. The only available estimators are GLS and WLS, which produce correct standard errors and a correct test statistic under normal and non-normal conditions respectively. Always assuming fixed.x = FALSE and conditional.x = FALSE (for now).

Model features (only available for the lavaan() function):

auto.fix.first:

If TRUE, the factor loading of the first indicator is set to 1.0 for every latent variable.

auto.fix.single:

If TRUE, the residual variance (if included) of an observed indicator is set to zero if it is the only indicator of a latent variable.

auto.var:

If TRUE, the (residual) variances of both observed and latent variables are set free.

auto.cov.lv.x:

If TRUE, the covariances of exogenous latent variables are included in the model and set free.

auto.cov.y:

If TRUE, the covariances of dependent variables (both observed and latent) are included in the model and set free.

auto.th:

If TRUE, thresholds for limited dependent variables are included in the model and set free.

auto.delta:

If TRUE, response scaling parameters for limited dependent variables are included in the model and set free.

auto.efa:

If TRUE, the necessary constraints are imposed to make the (unrotated) exploratory factor analysis blocks identifiable: for each block, factor variances are set to 1, factor covariances are constrained to be zero, and factor loadings are constrained to follow an echelon pattern.
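As an illustration of the auto.* flags, the sketch below calls lavaan() directly and switches on the flags needed for a standard CFA, then compares the result with the cfa() wrapper, which sets such flags itself. The model and data set are assumptions made for the example.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# With lavaan(), the auto.* flags are off by default, so the ones
# needed here are switched on explicitly.
fit.lavaan <- lavaan(HS.model, data = HolzingerSwineford1939,
                     auto.var = TRUE, auto.fix.first = TRUE,
                     auto.cov.lv.x = TRUE)

# The cfa() wrapper chooses these (and other) flags itself.
fit.cfa <- cfa(HS.model, data = HolzingerSwineford1939)

# Both fits should end up with the same number of free parameters.
c(lavaan = lavInspect(fit.lavaan, "npar"),
  cfa    = lavInspect(fit.cfa, "npar"))
```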

Data options:

std.ov:

If TRUE, observed variables are standardized before entering the analysis. By default, these are only the non-exogenous observed variables, unless fixed.x = FALSE. Use this option with caution; it can be used to test if (for example) nonconvergence was due to scaling issues. But this is still a covariance based analysis, in the sense that no constraints are involved (to ensure the model-implied (co)variance matrix has unit variances), and the standard errors still assume that the input was unstandardized. See also the correlation option.

missing:

The default setting is "listwise": all cases with missing values are removed listwise from the data before the analysis starts. This is only valid if the data are missing completely at random (MCAR). Therefore, it may not be the optimal choice, but it can be useful for a first run.

If the estimator belongs to the ML family, another option is "ml" (alias: "fiml" or "direct"). This corresponds to the so-called full information maximum likelihood (FIML) approach, where we compute the likelihood case by case, using all available data from that case. Note that if the model contains exogenous observed covariates, and fixed.x = TRUE (the default), all cases with any missing values on these covariates will be deleted first. The option "ml.x" (alias: "fiml.x" or "direct.x") is similar to "ml", but does not delete any cases with missing values for the exogenous covariates, even if fixed.x = TRUE. (Note: all lavaan versions < 0.6 used "ml.x" instead of "ml".)

If you wish to use multiple imputation, you need to use an external package (e.g., mice) to generate imputed datasets, which can then be analyzed using the semList function. The semTools package contains several functions to do this automatically.

Another option (with continuous data) is to use "two.stage" or "robust.two.stage". In this approach, we first estimate the sample statistics (mean vector, variance-covariance matrix) using an EM algorithm. Then, we use these estimated sample statistics as input for a regular analysis (as if the data were complete). The standard errors and test statistics are adjusted correctly to reflect the two-step procedure. The "robust.two.stage" option produces standard errors and a test statistic that are robust against non-normality.

If (part of) the data is categorical, and the estimator is from the (W)LS family, the only option (besides listwise deletion) is "pairwise". In this three-step approach, missingness is only an issue in the first two steps. In the first step, we compute thresholds (for categorical variables) and means or intercepts (for continuous variables) using univariate information only. In this step, we simply ignore the missing values just like in mean(x, na.rm = TRUE). In the second step, we compute polychoric/polyserial/pearson correlations using (only) two variables at a time. Here we use pairwise deletion: we only keep those observations for which both values are observed (not-missing). And this may change from pair to pair. By default, in the categorical case we use conditional.x = TRUE. Therefore, any cases with missing values on the exogenous covariates will be deleted listwise from the data first.

Finally, if the estimator is "PML", the available options are "pairwise", "available.cases" and "doubly.robust". See the PML tutorial on the lavaan website for more information about these approaches.
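The difference between listwise and pairwise deletion can be sketched in base R. This is not lavaan's internal code, only an illustration of which rows each approach keeps; the toy data are made up.

```r
# Toy data with scattered missing values (made up for illustration).
dat <- data.frame(
  y1 = c(1.2, NA, 3.1, 4.0, 5.3, 6.1),
  y2 = c(2.0, 2.5, NA, 4.1, 5.0, 6.2),
  y3 = c(0.9, 1.8, 3.2, NA, 5.1, 5.9)
)

# Listwise deletion: keep only the rows without any missing value.
n.listwise <- sum(complete.cases(dat))

# Pairwise deletion: for each pair of variables, keep the rows where
# both are observed; the count can differ from pair to pair.
observed <- (!is.na(as.matrix(dat))) * 1
n.pairwise <- crossprod(observed)  # entry [i, j] = rows with i and j observed

n.listwise
n.pairwise

# Univariate step: missing values are simply ignored.
colMeans(dat, na.rm = TRUE)

# Bivariate step: Pearson correlations computed two variables at a time.
cor(dat, use = "pairwise.complete.obs")
```

In this toy example every pair of variables retains more rows than listwise deletion does, which is the practical appeal of the pairwise approach.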

sampling.weights.normalization:

If "none", the sampling weights (if provided) will not be transformed. If "total", the sampling weights are normalized by dividing by the total sum of the weights, and multiplying again by the total sample size. If "group", the sampling weights are normalized per group: by dividing by the sum of the weights (in each group), and multiplying again by the group size. The default is "total".
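The two normalizations amount to simple arithmetic, sketched here in base R with made-up weights and group labels. This mirrors the description above and is not lavaan's internal code.

```r
# Made-up sampling weights and group membership for six cases.
w     <- c(0.5, 1.5, 2.0, 1.0, 3.0, 4.0)
group <- c("a", "a", "a", "b", "b", "b")

# "total": divide by the overall sum, multiply by the total sample size.
w.total <- w / sum(w) * length(w)

# "group": the same operation, carried out within each group.
w.group <- ave(w, group, FUN = function(x) x / sum(x) * length(x))

sum(w.total)                 # sums to the total sample size
tapply(w.group, group, sum)  # sums to the group sizes
```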

samplestats:

Logical. If FALSE, no sample statistics will be computed (and no estimation can take place). This can be useful when only a dummy lavaan object is requested, without any computations. The default is TRUE.

Data summary options:

sample.cov.rescale:

If TRUE, the sample covariance matrix provided by the user is internally rescaled by multiplying it with a factor (N-1)/N. If "default", the value is set depending on the estimator and the likelihood option: it is set to TRUE if maximum likelihood estimation is used and likelihood="normal", and FALSE otherwise.
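The rescaling factor converts a covariance matrix computed with divisor N-1 into one with divisor N. A base R sketch with simulated data (not lavaan's internal code):

```r
set.seed(1)
X <- matrix(rnorm(50 * 3), nrow = 50, ncol = 3)
N <- nrow(X)

S.unbiased <- cov(X)                    # divisor N - 1
S.rescaled <- S.unbiased * (N - 1) / N  # divisor N after rescaling

# Direct computation with divisor N, for comparison.
Xc   <- scale(X, center = TRUE, scale = FALSE)
S.ml <- crossprod(Xc) / N

all.equal(S.rescaled, S.ml, check.attributes = FALSE)
```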

ridge:

Logical. If TRUE, a small constant value will be added to the diagonal elements of the covariance (or correlation) matrix before analysis. The value can be set using the ridge.constant option.

ridge.constant:

Numeric. Small constant used for ridging. The default value is 1e-05.
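The effect of ridging can be sketched in base R: adding a small constant to the diagonal shifts every eigenvalue upward by that constant, which can turn a singular matrix into a positive definite one. The data below are made up so that the covariance matrix is exactly singular.

```r
# Made-up data: the third variable is the sum of the first two,
# so the covariance matrix is singular.
set.seed(1)
x1 <- rnorm(20)
x2 <- rnorm(20)
S  <- cov(cbind(x1, x2, x3 = x1 + x2))

ridge.constant <- 1e-05
S.ridged <- S
diag(S.ridged) <- diag(S.ridged) + ridge.constant  # off-diagonals untouched

min.before <- min(eigen(S, symmetric = TRUE, only.values = TRUE)$values)
min.after  <- min(eigen(S.ridged, symmetric = TRUE, only.values = TRUE)$values)
c(before = min.before, after = min.after)
```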

Multiple group options:

group.label:

A character vector. The user can specify which group (or factor) levels need to be selected from the grouping variable, and in which order. If missing, all grouping levels are selected, in the order as they appear in the data.

group.equal:

A vector of character strings. Only used in a multiple group analysis. Can be one or more of the following: "loadings", "composite.loadings", "intercepts", "means", "thresholds", "regressions", "residuals", "residual.covariances", "lv.variances" or "lv.covariances", specifying the pattern of equality constraints across multiple groups.

group.partial:

A vector of character strings containing the labels of the parameters which should be free in all groups (thereby overriding the group.equal argument for some specific parameters).

group.w.free:

Logical. If TRUE, the group frequencies are considered to be free parameters in the model. In this case, a Poisson model is fitted to estimate the group frequencies. If FALSE (the default), the group frequencies are fixed to their observed values.
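A sketch of group.equal combined with group.partial, using the 'school' grouping variable of the HolzingerSwineford1939 example data. The model and the choice of freed parameters are illustrative assumptions.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Equal loadings and intercepts across groups, except for the loading of
# x2 on visual and the intercept of x7, which stay free in each group.
fit <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
           group.equal = c("loadings", "intercepts"),
           group.partial = c("visual=~x2", "x7~1"))

pe <- parameterEstimates(fit)
x5.loading <- pe$est[pe$lhs == "textual" & pe$op == "=~" & pe$rhs == "x5"]
x2.loading <- pe$est[pe$lhs == "visual"  & pe$op == "=~" & pe$rhs == "x2"]

x5.loading  # constrained: one value per group, expected to coincide
x2.loading  # freed by group.partial: may differ across groups
```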

Estimation options:

estimator:

The estimator to be used. Can be one of the following: "ML" for maximum likelihood, "GLS" for (normal theory) generalized least squares, "WLS" for weighted least squares (sometimes called ADF estimation), "ULS" for unweighted least squares, "DWLS" for diagonally weighted least squares, and "DLS" for distributionally-weighted least squares. These are the main options that affect the estimation. For convenience, the "ML" option can be extended as "MLM", "MLMV", "MLMVS", "MLF", and "MLR". The estimation will still be plain "ML", but now with robust standard errors and a robust (scaled) test statistic. For "MLM", "MLMV", "MLMVS", classic robust standard errors are used (se="robust.sem"); for "MLF", standard errors are based on first-order derivatives (information = "first.order"); for "MLR", ‘Huber-White’ robust standard errors are used (se="robust.huber.white"). In addition, "MLM" will compute a Satorra-Bentler scaled (mean adjusted) test statistic (test="satorra.bentler"), "MLMVS" will compute a mean and variance adjusted test statistic (Satterthwaite style) (test="mean.var.adjusted"), "MLMV" will compute a mean and variance adjusted test statistic (scaled and shifted) (test="scaled.shifted"), and "MLR" will compute a test statistic which is asymptotically equivalent to the Yuan-Bentler T2-star test statistic (test="yuan.bentler.mplus"). Analogously, the estimators "WLSM" and "WLSMV" imply the "DWLS" estimator (not the "WLS" estimator) with robust standard errors and a mean or mean and variance adjusted test statistic. Estimators "ULSM" and "ULSMV" imply the "ULS" estimator with robust standard errors and a mean or mean and variance adjusted test statistic.
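How a shorthand such as "MLR" is expanded can be inspected on a fitted object. The sketch below uses an illustrative model and data set; the stored option values are read back with lavInspect() rather than assumed.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

fit <- cfa(HS.model, data = HolzingerSwineford1939, estimator = "MLR")

# Look at how the shorthand was translated into individual options.
opts <- lavInspect(fit, "options")
opts$estimator  # the underlying estimator
opts$se         # the type of standard errors
opts$test       # the requested test statistic(s)
```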

likelihood:

Only relevant for ML estimation. If "wishart", the Wishart likelihood approach is used. In this approach, the covariance matrix has been divided by N-1, and both standard errors and test statistics are based on N-1. If "normal", the normal likelihood approach is used. Here, the covariance matrix has been divided by N, and both standard errors and test statistics are based on N. If "default", it depends on the mimic option: if mimic="lavaan" or mimic="Mplus", normal likelihood is used; otherwise, Wishart likelihood is used.

link:

Not used yet. This is just a placeholder until the MML estimator is back.

information:

If "expected", the expected information matrix is used (to compute the standard errors). If "observed", the observed information matrix is used. If "first.order", the information matrix is based on the outer product of the casewise scores. See also the options "h1.information" and "observed.information" for further control. If "default", the value is set depending on the estimator, the missing argument, and the mimic option. If the argument is a vector with two elements, the first element is used for the computation of the standard errors, while the second element is used for the (robust) test statistic.

h1.information:

If "structured" (the default), the unrestricted (h1) information part of the (expected, first.order or observed if h1 is used) information matrix is based on the structured, or model-implied statistics (model-implied covariance matrix, model-implied mean vector, etc.). If "unstructured", the unrestricted (h1) information part is based on sample-based statistics (observed covariance matrix, observed mean vector, etc.). If the argument is a vector with two elements, the first element is used for the computation of the standard errors, while the second element is used for the (robust) test statistic.

observed.information:

If "hessian", the observed information matrix is based on the hessian of the objective function. If "h1", an approximation is used that is based on the observed information matrix of the unrestricted (h1) model. If the argument is a vector with two elements, the first element is used for the computation of the standard errors, while the second element is used for the (robust) test statistic.

se:

If "standard", conventional standard errors are computed based on inverting the (expected, observed or first.order) information matrix. If "robust.sem", conventional robust standard errors are computed. If "robust.huber.white", standard errors are computed based on the 'mlr' (aka pseudo ML, Huber-White) approach. If "robust", either "robust.sem" or "robust.huber.white" is used depending on the estimator, the mimic option, and whether the data are complete or not. If "boot" or "bootstrap", bootstrap standard errors are computed using standard bootstrapping (unless Bollen-Stine bootstrapping is requested for the test statistic; in this case bootstrap standard errors are computed using model-based bootstrapping). If "none", no standard errors are computed.

test:

Character vector. See the documentation of the lavTest function for a full list. Multiple names of test statistics can be provided. If "default", the value depends on the values of other arguments. See also the lavTest function to extract (alternative) test statistics from a fitted lavaan object.

scaled.test:

Character. Choose the test statistic that will be scaled (if a scaled test statistic is requested). The default is "standard", but it could also be (for example) "Browne.residual.nt".

gamma.n.minus.one:

Logical. If TRUE, we divide the Gamma matrix by N-1 (instead of the default N).

gamma.unbiased:

Logical. If TRUE, we compute an unbiased version for the Gamma matrix. Only available for single-level complete data and when conditional.x = FALSE and fixed.x = FALSE (for now).

bootstrap:

Number of bootstrap draws, if bootstrapping is used.

do.fit:

If FALSE, the model is not fit, and the current starting values of the model parameters are preserved.

Optimization options:

control:

A list containing control parameters passed to the external optimizer. By default, lavaan uses "nlminb". See the manpage of nlminb for an overview of the control parameters. If another (external) optimizer is selected, see the manpage for that optimizer to see the possible control parameters.

optim.method:

Character. The optimizer that should be used. For unconstrained optimization or models with only linear equality constraints (i.e., the model syntax does not include any "==", ">" or "<" operators), the available options are "nlminb" (the default), "BFGS", "L-BFGS-B". These are all quasi-Newton methods. A basic implementation of Gauss-Newton is also available (optim.method = "GN"). The latter is the default when estimator = "DLS". For constrained optimization, the only available option is "nlminb.constr", which uses an augmented Lagrangian minimization algorithm.

optim.force.converged:

Logical. If TRUE, pretend the model has converged, no matter what.

optim.dx.tol:

Numeric. Tolerance used for checking if the elements of the (unscaled) gradient are all zero (in absolute value). The default value is 0.001.

optim.gn.tol.x:

Numeric. Only used when optim.method = "GN". Optimization stops when the root mean square of the difference between the old and new parameter values is smaller than this tolerance value. Default is 1e-05 for DLS estimation and 1e-07 otherwise.

optim.gn.iter.max:

Integer. Only used when optim.method = "GN". The maximum number of GN iterations. The default is 200.

bounds:

Only used if optim.method = "nlminb". If logical: FALSE implies no bounds are imposed on the parameters. If TRUE, this implies bounds = "wide". If character, possible options are "none" (the default), "standard", "wide", "pos.var", "pos.ov.var", and "pos.lv.var". If bounds = "pos.ov.var", the observed variances are forced to be nonnegative. If bounds = "pos.lv.var", the latent variances are forced to be nonnegative. If bounds = "pos.var", both observed and latent variances are forced to be nonnegative. If bounds = "standard", lower and upper bounds are computed for observed and latent variances, and factor loadings. If bounds = "wide", lower and upper bounds are computed for observed and latent variances, and factor loadings; but the range of the bounds is enlarged (allowing again for slightly negative variances).
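A sketch of bounded estimation with bounds = "pos.var", using an illustrative model and data set. It assumes a lavaan version in which the "pos.var" value is available; the check simply confirms that no estimated variance is negative.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Force both observed and latent variances to be nonnegative.
fit <- cfa(HS.model, data = HolzingerSwineford1939, bounds = "pos.var")

# Variances are the '~~' rows where both sides name the same variable.
pe <- parameterEstimates(fit)
variances <- pe[pe$op == "~~" & pe$lhs == pe$rhs, c("lhs", "est")]
variances
```

For this well-behaved example the bounds are unlikely to be active; they matter mostly for small samples or models prone to negative variance estimates.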

optim.bounds:

List. This can be used instead of the bounds argument to allow more control. Possible elements of the list are lower, upper, lower.factor and upper.factor. All of these accept a vector. The lower and upper elements indicate for which type of parameters bounds should be computed. Possible choices are "ov.var", "lv.var", "loadings" and "covariances". The lower.factor and upper.factor elements should have the same length as the lower and upper elements respectively. They indicate the factor by which the range of the bounds should be enlarged (for example, 1.1 or 1.2; the default is 1.0). Another element is min.reliability.marker, which sets the lower bound for the reliability of the marker indicator (if any) of each factor (default is 0.1). Finally, the min.var.lv.endo element indicates the lower bound of the variance of any endogenous latent variable (default is 0.0).

Parallelization options (currently only used for bootstrapping):

parallel:

The type of parallel operation to be used (if any). If missing, the default is "no".

ncpus:

Integer: number of processes to be used in parallel operation: typically one would choose this to be the number of available CPUs. By default this is the number of cores (as detected by parallel::detectCores()) minus one.

cl:

An optional parallel or snow cluster for use if parallel = "snow". If not supplied, a cluster on the local machine is created for the duration of the bootstrapLavaan or bootstrapLRT call.

iseed:

An integer to set the seed. Or NULL if no reproducible results are needed. This works for both serial (non-parallel) and parallel settings. Internally, RNGkind() is set to "L'Ecuyer-CMRG" if parallel = "multicore". If parallel = "snow" (under Windows), parallel::clusterSetRNGStream() is called which automatically switches to "L'Ecuyer-CMRG". When iseed is not NULL, .Random.seed (if it exists) in the global environment is left untouched.

Categorical estimation options:

zero.add:

A numeric vector containing two values. These values affect the calculation of polychoric correlations when some frequencies in the bivariate table are zero. The first value only applies for 2x2 tables. The second value for larger tables. This value is added to the zero frequency in the bivariate table. If "default", the value is set depending on the "mimic" option. By default, lavaan uses zero.add = c(0.5, 0.0).

zero.keep.margins:

Logical. This argument only affects the computation of polychoric correlations for 2x2 tables with an empty cell, and where a value is added to the empty cell. If TRUE, the other values of the frequency table are adjusted so that all margins are unaffected. If "default", the value is set depending on the "mimic" option. The default is TRUE.
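The interplay of zero.add and zero.keep.margins for a 2x2 table can be sketched in base R. The compensation scheme used here (add to the empty cell and its diagonal partner, subtract from the other two cells) is an assumption about one way to keep the margins fixed, not a transcription of lavaan's internal code; the frequencies are made up.

```r
# A made-up 2x2 frequency table with one empty cell.
freq <- matrix(c(20,  0,
                 15, 10), nrow = 2, byrow = TRUE)
zero.add <- c(0.5, 0.0)  # first value: 2x2 tables; second: larger tables

# Without preserving margins: only the empty cell changes.
freq.add <- freq
freq.add[freq.add == 0] <- zero.add[1]

# Preserving margins (assumed scheme): +zero.add on the empty cell and
# its diagonal partner, -zero.add on the two remaining cells.
empty <- which(freq == 0, arr.ind = TRUE)
sign.mat <- matrix(-1, 2, 2)
sign.mat[empty[1, "row"], empty[1, "col"]] <- 1
sign.mat[3 - empty[1, "row"], 3 - empty[1, "col"]] <- 1
freq.keep <- freq + zero.add[1] * sign.mat

rbind(original = rowSums(freq), add.only = rowSums(freq.add),
      keep.margins = rowSums(freq.keep))
```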

zero.cell.warn:

Logical. Only used if some observed endogenous variables are categorical. If TRUE, give a warning if one or more cells of a bivariate frequency table are empty.

allow.empty.cell:

Logical. If TRUE, ignore situations where an ordinal variable has fewer categories than expected, or where a category is empty in a specific group.

Starting values options:

start:

If it is a character string, the two options are currently "simple" and "Mplus". In the first case, all parameter values are set to zero, except the factor loadings and (residual) variances, which are set to one. When start is "Mplus", the factor loadings are estimated using the fabin3 estimator (tsls) per factor. The residual variances of observed variables are set to half the observed variance, and all other (residual) variances are set to 0.05. The remaining parameters (regression coefficients, covariances) are set to zero. If start is a fitted object of class lavaan, the estimated values of the corresponding parameters will be extracted. If it is a parameter table, for example the output of the parameterEstimates() function, the values of the est or start or ustart column (whichever is found first) will be extracted.

rstarts:

Integer. The number of refits that lavaan should try with random starting values. Random starting values are computed by drawing random numbers from a uniform distribution. Correlations are drawn from the interval [-0.5, +0.5] and then converted to covariances. Lower and upper bounds for (residual) variances are computed just like the standard bounds in bounded estimation. Random starting values are not computed for regression coefficients (which are always zero) and factor loadings of higher-order constructs (which are always unity). From all the runs that converged, the final solution is the one that resulted in the smallest value for the discrepancy function.

Check options:

check.start:

Logical. If TRUE, the starting values are checked for possibly inconsistent values (for example values implying correlations larger than one). If needed, a warning is given.

check.gradient:

Logical. If TRUE, and the model converged, a warning is given if the optimizer decided that a (local) solution has been found, while not all elements of the (unscaled) gradient (as seen by the optimizer) are (near) zero, as they should be (the tolerance used is 0.001).

check.post:

Logical. If TRUE, and the model converged, a check is performed after (post) fitting, to verify if the solution is admissible. This implies that all variances are non-negative, and all the model-implied covariance matrices are positive (semi-)definite. For the latter test, we tolerate a tiny negative eigenvalue that is smaller than .Machine$double.eps^(3/4), treating it as being zero.
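The eigenvalue tolerance mentioned above can be illustrated in base R. The helper name is_admissible_cov is hypothetical, and the function is only a sketch of such a check, not lavaan's internal implementation.

```r
# Tolerance named in the text above.
tol <- .Machine$double.eps^(3/4)

# Hypothetical helper: TRUE if no eigenvalue is more negative than -tol.
is_admissible_cov <- function(Sigma, tol) {
  ev <- eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values
  all(ev > -tol)
}

ok  <- matrix(c(1, 0.5, 0.5, 1), 2, 2)  # correlation 0.5: admissible
bad <- matrix(c(1, 1.2, 1.2, 1), 2, 2)  # implied correlation 1.2: not admissible

is_admissible_cov(ok, tol)
is_admissible_cov(bad, tol)
```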

check.vcov:

Logical. If TRUE, and the model converged, we check if the variance-covariance matrix of the free parameters is positive definite. We take into account possible equality and active inequality constraints. If needed, a warning is given.

check.lv.names:

Logical. If TRUE, and latent variables are defined in the model, lavaan will stop with an error message if a latent variable name also occurs in the data (implying it is also an observed variable).

Verbosity options:

verbose:

If TRUE, show what lavaan is doing. During estimation, the function value is printed out during each iteration.

warn:

If FALSE, suppress all lavaan-specific warning messages.

debug:

If TRUE, debugging information is printed out.

Miscellaneous:

model.type:

Set the model type: possible values are "cfa", "sem" or "growth". This may affect how starting values are computed, and may be used to alter the terminology used in the summary output, or the layout of path diagrams that are based on a fitted lavaan object.

mimic:

If "Mplus", an attempt is made to mimic the Mplus program. If "EQS", an attempt is made to mimic the EQS program. If "default", the value is (currently) set to "lavaan", which is very close to "Mplus".

representation:

If "LISREL", the classical LISREL matrix representation is used to represent the model (using the all-y variant). No other options are available (for now).

implied:

Logical. If TRUE, compute the model-implied statistics, and store them in the implied slot.

h1:

Logical. If TRUE, compute the unrestricted model and store the unrestricted summary statistics (and perhaps a loglikelihood) in the h1 slot.

baseline:

Logical. If TRUE, compute a baseline model (currently always the independence model, assuming all variables are uncorrelated) and store the results in the baseline slot.

baseline.conditional.x.free.slopes:

Logical. If TRUE, and conditional.x = TRUE, the (default) baseline model will allow the slope structure to be unrestricted.

store.vcov:

Logical. If TRUE, and se= is not set to "none", store the full variance-covariance matrix of the model parameters in the vcov slot of the fitted lavaan object.

parser:

Character. If "new" (the default), the new parser is used to parse the model syntax. If "old", the original (pre 0.6-18) parser is used.

See Also

lavaan

Examples

lavOptions()
lavOptions("std.lv")
lavOptions(c("std.lv", "orthogonal"))

lavaan documentation built on Sept. 27, 2024, 9:07 a.m.