specifyModel: Specify a Structural Equation Model
In sem3: Structural Equation Models

Description Usage Arguments Details Value Author(s) See Also Examples

Create the RAM specification of a structural equation model.

specifyModel(file="", exog.variances=FALSE, endog.variances=TRUE, covs, 
	suffix="", quiet=FALSE)

specifyEquations(file="", ...)

cfa(file="", covs=paste(factors, collapse=","), reference.indicators=FALSE, ...)

multigroupModel(..., groups=names(models), allEqual=FALSE)

classifyVariables(model)

removeRedundantPaths(model, warn=TRUE)
## S3 method for class 'semmod'
combineModels(..., warn=TRUE)
## S3 method for class 'semmod'
update(object, file = "", ...)
## S3 method for class 'semmod'
edit(name, ...)

## S3 method for class 'semmod'
print(x, ...)
## S3 method for class 'semmodList'
print(x, ...)

`file`	The (quoted) file from which to read the model specification, including the path to the file if it is not in the current directory. If `""` (the default), then the specification is read from the standard input stream, and is terminated by a blank line.
`exog.variances`	If `TRUE` (the default is `FALSE`), free variance parameters are added for the exogenous variables that lack them.
`endog.variances`	If `TRUE` (the default), free error-variance parameters are added for the endogenous variables that lack them.
`covs`	optional: a character vector of one or more elements, with each element giving a string of variable names, separated by commas. Variances and covariances among all variables in each such string are added to the model. For confirmatory factor analysis models specified via `cfa`, `covs` defaults to all of the factors in the model, thus specifying all covariances among these factors.
`suffix`	a character string (defaulting to an empty string) to be appended to each parameter name; this can be convenient for specifying multiple-group models.
`reference.indicators`	if `FALSE`, the default, variances of factors are set to 1 by `cfa`; if `TRUE`, variances of factors are free parameters to estimate from the data, and instead the first factor loading for each factor is set to 1 to identify the model.
`quiet`	if `FALSE`, the default, then the number of input lines is reported.
`x, model, object, name`	An object of class `semmod` or `semmodList`, as produced by `specifyModel` or `multigroupModel`.
`warn`	print a warning if redundant paths are detected.
`...`	For `multigroupModel`, one or more optionally named arguments each of which is a `semmod` object produced, e.g., by `specifyModel`, `specifyEquations`, or `cfa`; if only one such model is given, then it will be used for all groups defined by the `groups` argument. If parameters have the same name in different groups, then they will be constrained to be equal. For `specifyEquations` and `cfa`, arguments (such as `covs`, in the case of `specifyEquations`) to be passed to `specifyModel`; for `combineModels`, `sem` objects; ignored in the `update` and `print` methods.
`groups`	a character vector of names for the groups in a multigroup model; taken by default from the names of the `...` arguments.
`allEqual`	if `FALSE` (the default), then if only one model object is given for a multigroup model, all corresponding parameters in the groups will be distinct; if `TRUE`, all corresponding parameters will be constrained to be equal.

The principal functions for model specification are specifyModel, to specify a model in RAM (path) format via single- and double-headed arrows; specifyEquations, to specify a model in equation format, which is then translated by the function into RAM format; and cfa, for compact specification of simple confirmatory factor analysis models.

specifyModel:

Each line of the RAM specification for specifyModel consists of three (unquoted) entries, separated by commas:

1. Arrow specification:: This is a simple formula, of the form A -> B or, equivalently, B <- A for a regression coefficient (i.e., a single-headed or directional arrow); A <-> A for a variance or A <-> B for a covariance (i.e., a double-headed or bidirectional arrow). Here, A and B are variable names in the model. If a name does not correspond to an observed variable, then it is assumed to be a latent variable. Spaces can appear freely in an arrow specification, and there can be any number of hyphens in the arrows, including zero: Thus, e.g., A->B, A --> B, and A>B are all legitimate and equivalent.
2. Parameter name:: The name of the regression coefficient, variance, or covariance specified by the arrow. Assigning the same name to two or more arrows results in an equality constraint. Specifying the parameter name as NA produces a fixed parameter.
3. Value:: start value for a free parameter or value of a fixed parameter. If given as NA (or simply omitted), sem will compute the start value.

Lines may end in a comment following \#.

specifyEquations:

For specifyEquations, each input line is either a regression equation or the specification of a variance or covariance. Regression equations are of the form

y = par1*x1 + par2*x2 + ... + park*xk

where y and the xs are variables in the model (either observed or latent), and the pars are parameters. If a parameter is given as a numeric value (e.g., 1) then it is treated as fixed. Note that no “error” variable is included in the equation; “error variances” are specified via either the covs argument, via V(y) = par (see immediately below), or are added automatically to the model when, as by default, endog.variances=TRUE.

Variances are specified in the form V(var) = par and covariances in the form C(var1, var2) = par, where the vars are variables (observed or unobserved) in the model. The symbols V and C may be in either lower- or upper-case. If par is a numeric value (e.g., 1) then it is treated as fixed. In conformity with the RAM model, a variance or covariance for an endogenous variable in the model is an “error” variance or covariance.

To set a start value for a free parameter, enclose the numeric start value in parentheses after the parameter name, as parameter(value).

cfa:

For cfa, each input line includes the names of the variables, separated by commas, that load on the corresponding factor; the name of the factor is given optionally at the beginning of the line, followed by a colon. If necessary, the variables that load on a factor may be continued across two or more input lines; in this case, each such line but the last must end in a comma. A variable may load on more than one factor (as long as the resulting model is identified, of course), but each factor may appear in only one input line (or set of input lines, if the variable list is continued onto the next line). If the argument reference.indicators=FALSE, the default, cfa will fix the variance of each factor to 1, and by default include covariances (i.e., correlations) among all pairs of factors. Alternatively, if reference.indicators=TRUE, then the factor variances are free parameters to be estimated from the data, and the first loading for each factor is set to 1 to identify the model. These two approaches produce equivalent models, with the same fit to the data, but alternative parametrizations. Specifying the argument covs=NULL implicitly fixes the factor intercorrelations to 0.

See sem and the examples for further details on model specification.

Other Functions:

classifyVariables classifies the variables in a model as endogenous or exogenous.

combineModels and removeRedundantPaths take semmod objects as arguments and do what their names imply.

The file input argument to the update method for semmod objects, which by default comes from standard input, is a set of update directives, one per line. There are five kinds of directives. In each case the directive begins with the directive name, followed by one or more fields separated by commas.

1. delete:: Remove a path from the model. Example: delete, RSES -> FGenAsp
2. add:: Add a path to the model. Example (the NA for the start value is optional): add, RSES -> FGenAsp, gam14, NA
3. replace:: Replace every occurrence of the first string with the second in the variables and parameters of the model. This directive may be used, for example, to change one variable to another or to rename a parameter. Example: replace, gam, gamma, substitutes the string "gamma" for "gam" wherever the latter appears, presumably in parameter names.
4. fix:: Fix a parameter that was formerly free. Example: fix, RGenAsp -> REdAsp, 1
5. free:: Free a parameter that was formerly fixed. Example (the NA for the start value is optional): free, RGenAsp -> ROccAsp, lam11, NA

The edit method for semmod objects opens the model in the R editor.

specifyModel, specifyEquations, cfa, removeRedundantPaths, combineModels, update, and edit return an object of class semmod, suitable as input for sem.

multigroupModel returns an object of class semmodList, also suitable as input for sem.

classifyVariables returns a list with two character vectors: endogenous, containing the names of endogenous variables in the model; and exogenous, containing the names of exogenous variables.

John Fox jfox@mcmaster.ca and Jarrett Byrnes

sem

# Note: These examples can't be run via example() because the default file
#  argument of specifyModel() requires that the model specification be entered
#  at the command prompt. The examples can be copied and run in the R console,
#  however.

    ## Not run: 
model.dhp <- specifyModel()
    RParAsp  -> RGenAsp, gam11,  NA
    RIQ      -> RGenAsp, gam12,  NA
    RSES     -> RGenAsp, gam13,  NA
    FSES     -> RGenAsp, gam14,  NA
    RSES     -> FGenAsp, gam23,  NA
    FSES     -> FGenAsp, gam24,  NA
    FIQ      -> FGenAsp, gam25,  NA
    FParAsp  -> FGenAsp, gam26,  NA
    FGenAsp  -> RGenAsp, beta12, NA
    RGenAsp  -> FGenAsp, beta21, NA
    RGenAsp  -> ROccAsp,  NA,     1
    RGenAsp  -> REdAsp,  lam21,  NA
    FGenAsp  -> FOccAsp,  NA,     1
    FGenAsp  -> FEdAsp,  lam42,  NA
    RGenAsp <-> RGenAsp, ps11,   NA
    FGenAsp <-> FGenAsp, ps22,   NA
    RGenAsp <-> FGenAsp, ps12,   NA
    ROccAsp <-> ROccAsp, theta1, NA
    REdAsp  <-> REdAsp,  theta2, NA
    FOccAsp <-> FOccAsp, theta3, NA
    FEdAsp  <-> FEdAsp,  theta4, NA
    
model.dhp
    
# an equivalent specification, allowing specifyModel() to generate
#  variance parameters for endogenous variables (and suppressing
#  the unnecessary trailing NAs):
 
model.dhp <- specifyModel()
RParAsp  -> RGenAsp, gam11
RIQ      -> RGenAsp, gam12
RSES     -> RGenAsp, gam13
FSES     -> RGenAsp, gam14
RSES     -> FGenAsp, gam23
FSES     -> FGenAsp, gam24
FIQ      -> FGenAsp, gam25
FParAsp  -> FGenAsp, gam26
FGenAsp  -> RGenAsp, beta12
RGenAsp  -> FGenAsp, beta21
RGenAsp  -> ROccAsp,  NA,     1
RGenAsp  -> REdAsp,  lam21
FGenAsp  -> FOccAsp,  NA,     1
FGenAsp  -> FEdAsp,  lam42
RGenAsp <-> FGenAsp, ps12

model.dhp

# Another equivalent specification, telling specifyModel to add paths for 
#   variances and covariance of RGenAsp and FGenAsp:
 
model.dhp <- specifyModel(covs="RGenAsp, FGenAsp")
RParAsp  -> RGenAsp, gam11
RIQ      -> RGenAsp, gam12
RSES     -> RGenAsp, gam13
FSES     -> RGenAsp, gam14
RSES     -> FGenAsp, gam23
FSES     -> FGenAsp, gam24
FIQ      -> FGenAsp, gam25
FParAsp  -> FGenAsp, gam26
FGenAsp  -> RGenAsp, beta12
RGenAsp  -> FGenAsp, beta21
RGenAsp  -> ROccAsp,  NA,     1
RGenAsp  -> REdAsp,  lam21
FGenAsp  -> FOccAsp,  NA,     1
FGenAsp  -> FEdAsp,  lam42

model.dhp

# The same model in equation format:

model.dhp.1 <- specifyEquations(covs="RGenAsp, FGenAsp")
RGenAsp = gam11*RParAsp + gam12*RIQ + gam13*RSES + gam14*FSES + beta12*FGenAsp
FGenAsp = gam23*RSES + gam24*FSES + gam25*FIQ + gam26*FParAsp + beta21*RGenAsp
ROccAsp = 1*RGenAsp
REdAsp = lam21(1)*RGenAsp  # to illustrate setting start values
FOccAsp = 1*FGenAsp
FEdAsp = lam42(1)*FGenAsp

model.dhp

classifyVariables(model.dhp)

# updating the model to impose equality constraints
#  and to rename the latent variables and gamma parameters

model.dhp.eq <- update(model.dhp)
delete, RSES -> FGenAsp
delete, FSES -> FGenAsp
delete, FIQ  -> FGenAsp
delete, FParAsp -> FGenAs
delete, RGenAsp  -> FGenAsp
add, RSES     -> FGenAsp, gam14,  NA
add, FSES     -> FGenAsp, gam13,  NA
add, FIQ      -> FGenAsp, gam12,  NA
add, FParAsp  -> FGenAsp, gam26,  NA
add, RGenAsp  -> FGenAsp, beta12, NA
replace, gam, gamma
replace, Gen, General

model.dhp.eq

# A three-factor CFA model for the Thurstone mental-tests data, specified three equivalent ways:

R.thur <- readMoments(diag=FALSE, names=c('Sentences','Vocabulary',
                                          'Sent.Completion','First.Letters','4.Letter.Words','Suffixes',
                                          'Letter.Series','Pedigrees', 'Letter.Group'))
.828                                              
.776   .779                                        
.439   .493    .46                                 
.432   .464    .425   .674                           
.447   .489    .443   .59    .541                    
.447   .432    .401   .381    .402   .288              
.541   .537    .534   .35    .367   .32   .555        
.38   .358    .359   .424    .446   .325   .598   .452

	#  (1a) in CFA format:

mod.cfa.thur.c <- cfa()
FA: Sentences, Vocabulary, Sent.Completion
FB: First.Letters, 4.Letter.Words, Suffixes
FC: Letter.Series, Pedigrees, Letter.Group

cfa.thur.c <- sem(mod.cfa.thur.c, R.thur, 213)
summary(cfa.thur.c)

	#  (1b) in CFA format, using reference indicators:
	
mod.cfa.thur.r <- cfa(reference.indicators=TRUE)
FA: Sentences, Vocabulary, Sent.Completion
FB: First.Letters, 4.Letter.Words, Suffixes
FC: Letter.Series, Pedigrees, Letter.Group

cfa.thur.r <- sem(mod.cfa.thur.r, R.thur, 213)
summary(cfa.thur.r)

	#  (2) in equation format:

mod.cfa.thur.e <- specifyEquations(covs="F1, F2, F3")
Sentences = lam11*F1
Vocabulary = lam21*F1
Sent.Completion = lam31*F1
First.Letters = lam42*F2
4.Letter.Words = lam52*F2
Suffixes = lam62*F2
Letter.Series = lam73*F3
Pedigrees = lam83*F3
Letter.Group = lam93*F3
V(F1) = 1
V(F2) = 1
V(F3) = 1

cfa.thur.e <- sem(mod.cfa.thur.e, R.thur, 213)
summary(cfa.thur.e)

	#  (3) in path format:

mod.cfa.thur.p <- specifyModel(covs="F1, F2, F3")
F1 -> Sentences,                      lam11
F1 -> Vocabulary,                     lam21
F1 -> Sent.Completion,                lam31
F2 -> First.Letters,                  lam41
F2 -> 4.Letter.Words,                 lam52
F2 -> Suffixes,                       lam62
F3 -> Letter.Series,                  lam73
F3 -> Pedigrees,                      lam83
F3 -> Letter.Group,                   lam93
F1 <-> F1,                            NA,     1
F2 <-> F2,                            NA,     1
F3 <-> F3,                            NA,     1

cfa.thur.p <- sem(mod.cfa.thur.p, R.thur, 213)
summary(cfa.thur.p)

# a multigroup CFA model fit to the Holzinger-Swineford
#   mental-tests data (from the MBESS package)

library(MBESS)
data(HS.data)

mod.hs <- cfa()
spatial: visual, cubes, paper, flags
verbal: general, paragrap, sentence, wordc, wordm
memory: wordr, numberr, figurer, object, numberf, figurew
math: deduct, numeric, problemr, series, arithmet

mod.mg <- multigroupModel(mod.hs, groups=c("Female", "Male")) 

sem.mg <- sem(mod.mg, data=HS.data, group="Gender",
              formula = ~ visual + cubes + paper + flags +
              general + paragrap + sentence + wordc + wordm +
              wordr + numberr + figurer + object + numberf + figurew +
              deduct + numeric + problemr + series + arithmet
              )
summary(sem.mg)

	# with cross-group equality constraints:
	
mod.mg.eq <- multigroupModel(mod.hs, groups=c("Female", "Male"), allEqual=TRUE)

sem.mg.eq <- sem(mod.mg.eq, data=HS.data, group="Gender",
              formula = ~ visual + cubes + paper + flags +
                general + paragrap + sentence + wordc + wordm +
                wordr + numberr + figurer + object + numberf + figurew +
                deduct + numeric + problemr + series + arithmet
              )
summary(sem.mg.eq)
    
## End(Not run)