```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  echo = FALSE,
  error = FALSE,
  message = FALSE,
  warning = FALSE
)
# Install locally
#  devtools::install_local( R'(C:\Users\James.Thorson\Desktop\Git\dsem)', force=TRUE )
# Build and PDF
#  setwd(R'(C:\Users\James.Thorson\Desktop\Git\dsem)'); devtools::build_rmd("vignettes/model-description.Rmd"); rmarkdown::render( "vignettes/model-description.Rmd", rmarkdown::pdf_document())
#
# Recommended workflow:
#  * Open in Rstudio and knit using button there
```
\newcommand{\B}{\mathbf}
\newcommand{\PL}{\mathrm}
\newcommand{\s}{\boldsymbol{s}}
dsem involves specifying a dynamic structural equation model (DSEM). This DSEM can be viewed either from the perspective of a software user (i.e., the interface) or from the perspective of a statistician (i.e., the equations and their interpretation). We introduce each perspective in turn.
To specify a DSEM, the user writes arrow-and-lag notation, based on the arrow notation derived from package `sem`. For example, specifying a first-order autoregressive process for variable $x$ involves:
```
x -> x, 1, ar1
x <-> x, 0, sd
```
This then estimates a single parameter representing first-order autoregression (represented with a one-headed arrow), as well as the Cholesky decomposition of the exogenous covariance of model variables (specified with two-headed arrows). See the Details section of `?make_dsem_ram` for more information about syntax.
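As a minimal sketch (assuming the package is installed; see `?make_dsem_ram` for the full argument list), this specification can be parsed without fitting a model:

```r
library(dsem)

# Arrow-and-lag specification for a first-order autoregressive process
sem = "
  x -> x, 1, ar1
  x <-> x, 0, sd
"

# Parse the specification into RAM form for four times (no estimation)
ram = make_dsem_ram( sem = sem, times = 1:4, variables = "x" )
```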
If there were four time intervals ($T=4$), this would then result in the path matrix:
$$ \B P_{\PL{joint}} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ \rho & 0 & 0 & 0 \\ 0 & \rho & 0 & 0 \\ 0 & 0 & \rho & 0 \end{pmatrix} $$

This joint path matrix then represents the partial effect of each variable and time (column) on each other variable and time (row).
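For concreteness, this joint path matrix can be reproduced in base R (the value $\rho = 0.8$ and the object names here are illustrative):

```r
rho = 0.8                            # illustrative AR(1) coefficient
n_times = 4                          # number of time intervals
# Effect of time t-1 (column) on time t (row)
P_joint = matrix(0, n_times, n_times)
P_joint[cbind(2:n_times, 1:(n_times-1))] = rho
P_joint
```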
DSEM interactions can be as complicated or simple as desired, and can include:

* values that are fixed a priori, where the `parameter_name` is provided as NA and the starting value that follows is the fixed value;
* values that are mirrored among path coefficients, where the same `parameter_name` is provided for multiple rows of the text file (examples of both are sketched below).
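For example, a hypothetical specification combining both features (the value 0.8 and the name `sd_shared` are illustrative) might look like:

```
# Fix the autoregression coefficient a priori (parameter_name is NA, fixed value follows)
x -> x, 1, NA, 0.8
# Mirror the exogenous standard deviation between x and y (same parameter_name on both rows)
x <-> x, 0, sd_shared
y <-> y, 0, sd_shared
```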
The user also specifies a distribution for measurement errors for each variable using argument `family`, and whether each time series starts from its stationary distribution or from some non-equilibrium initial condition using argument `estimate_delta0`. If the latter is specified, then variables will tend to converge back on the stationary distribution at a rate that is determined by estimated parameters.
The DSEM defines a generalized linear mixed model (GLMM) for a $T \times J$ matrix $\mathbf{Y}$, where $y_{tj}$ is the measurement in time $t$ for variable $j$. This measurement matrix can include missing values $y_{tj} = \PL{NA}$, and it will estimate a $T \times J$ matrix of latent states $\mathbf{X}$ for all modeled times and variables.
DSEM includes multiple distributions for measurement errors. For example, if the user specifies `family[j] = "fixed"` then:
$$
y_{tj} = x_{tj} + d_{tj}
$$
for all years. Alternatively, if the user specifies `family[j] = "normal"` then:
$$
y_{tj} \sim \PL{Normal}( x_{tj} + d_{tj}, {\sigma_j}^2)
$$
and ${\sigma_j}^2$ is then included as an estimated parameter. These expressions include the $T \times J$ matrix $\B D$ representing the ongoing impact of initial conditions $d_{tj}$ for each variable and year, as explained in detail below.
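As a brief sketch (the data and coefficient name are hypothetical), a two-variable model where $x$ is observed without error and $y$ has normally distributed measurement error could be specified as:

```r
library(dsem)

# Hypothetical data: x measured exactly, y with additive normal error
tsdata = ts(data.frame( x = rnorm(20), y = rnorm(20) ))
fit = dsem( tsdata = tsdata,
            sem = "x -> y, 0, b_xy",
            family = c("fixed", "normal"),
            control = dsem_control(run_model = FALSE) )
```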
The specified DSEM then results in a Gaussian Markov random field for the latent states:
$$ \PL{vec}(\B X) \sim \PL{GMRF}(\B 0, \B Q_{\PL{joint}}) $$

where $\B Q_{\PL{joint}}$ is a $TJ \times TJ$ precision matrix such that ${\B Q_{\PL{joint}}}^{-1}$ is the estimated covariance among latent states. This joint precision is itself constructed from a joint path matrix $\B P_{\PL{joint}}$ and a joint matrix of exogenous covariance $\B G_{\PL{joint}}$:
$$ \B Q_{\PL{joint}} = (\mathbf{I - P}_{\PL{joint}})^T \B G_{\PL{joint}}^{-1} \B G_{\PL{joint}}^{-T} (\mathbf{I - P}_{\PL{joint}}) $$
The joint path matrix is itself constructed by summing across lagged and simultaneous effects. Say we specify a model with $K=2$ one-headed arrows. For each one-headed arrow $k$, we define a $J \times J$ path matrix $\B P_k$ and a $T \times T$ lag matrix $\B L_k$. For example, in a model with $J=3$ variables $(A,B,C)$ and $T=4$ times, specifying $K=2$ one-headed arrows:
```
A -> B, 0, b_AB
B -> C, 1, b_BC
```
this then results in two path matrices:
$$ \B P_1 = \begin{pmatrix} 0 & 0 & 0 \\ b_{AB} & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} $$

and

$$ \B P_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & b_{BC} & 0 \end{pmatrix} $$

with corresponding lag matrices
$$ \B L_{1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} $$

and

$$ \B L_{2} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} $$

We then sum across the Kronecker product of these components to obtain the joint path matrix:
$$ \B P_{\PL{joint}} = \sum_{k=1}^{K}(\B L_k \otimes \B P_k) $$

where $\otimes$ is the Kronecker product. The exogenous covariance is similarly constructed from a Kronecker product, although we assume that all covariance is simultaneous (i.e., no lags are allowed for two-headed arrows):
$$ \B G_{\PL{joint}} = \B I \otimes \B G $$
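Continuing the worked example, the Kronecker-sum construction can be verified in base R (the coefficient values 0.5 and 0.3 are illustrative):

```r
J = 3; n_times = 4
b_AB = 0.5; b_BC = 0.3                  # illustrative path coefficients

# Path matrices: effect of column variable on row variable, ordered (A, B, C)
P1 = matrix(0, J, J); P1[2,1] = b_AB    # A -> B (simultaneous)
P2 = matrix(0, J, J); P2[3,2] = b_BC    # B -> C (lag 1)

# Lag matrices: identity for simultaneous effects, first subdiagonal for lag 1
L1 = diag(n_times)
L2 = matrix(0, n_times, n_times)
L2[cbind(2:n_times, 1:(n_times-1))] = 1

# Joint path matrix as a sum of Kronecker products
P_joint = kronecker(L1, P1) + kronecker(L2, P2)
dim(P_joint)                            # TJ x TJ = 12 x 12
```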
These matrices define a simultaneous equation model:
$$ \PL{vec}(\B X) = \B P_{\PL{joint}} \PL{vec}(\B X) + \B \epsilon \\ \B \epsilon \sim \PL{MVN}( \B 0, {\B G_{\PL{joint}}}^T \B G_{\PL{joint}} ) $$

and the precision matrix can be derived from this simultaneous equation: solving gives $\PL{vec}(\B X) = (\mathbf{I - P}_{\PL{joint}})^{-1} \B \epsilon$, so the covariance of states is $(\mathbf{I - P}_{\PL{joint}})^{-1} {\B G_{\PL{joint}}}^T \B G_{\PL{joint}} (\mathbf{I - P}_{\PL{joint}})^{-T}$, and inverting this covariance yields the precision $\B Q_{\PL{joint}}$ given above.
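This derivation can be checked numerically; here is a minimal sketch for a single AR(1) variable with $T=3$ (all values are illustrative):

```r
# Path matrix and exogenous-covariance factor for one AR(1) variable, T = 3
rho = 0.8
P = matrix(0, 3, 3); P[cbind(2:3, 1:2)] = rho
G = diag(0.5, 3)
I = diag(3)

# Covariance implied by the simultaneous equation model
Sigma = solve(I - P) %*% t(G) %*% G %*% t(solve(I - P))

# Precision constructed directly, as in the earlier expression
Q = t(I - P) %*% solve(G) %*% t(solve(G)) %*% (I - P)

# Agreement up to numerical error
max(abs(solve(Sigma) - Q))
```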
Imagine we have some exogenous intervention that caused a $T \times J$ matrix of changes $\B C$. The total effect of this exogenous intervention would then be $(\mathbf{I - P}_{\PL{joint}})^{-1} \PL{vec}(\B C)$, and we can calculate any total effect using this matrix inverse $(\mathbf{I - P}_{\PL{joint}})^{-1}$ (called the "Leontief matrix"). To see this, consider that the first-order effect of change $\B C$ is $\B P_{\PL{joint}} \PL{vec}(\B C)$, but this response then in turn causes a second-order effect $\B {P_{\PL{joint}}}^2 \PL{vec}(\B C)$, and so on. The total effect is therefore:
$$ \sum_{l=0}^{\infty} \B {P_{\PL{joint}}}^l = (\mathbf{I - P}_{\PL{joint}})^{-1} $$

where this power series of direct and indirect effects (including the identity term for the initial change itself) then results in the Leontief matrix (as long as $\mathbf{I - P}_{\PL{joint}}$ is invertible).
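This identity is easy to verify numerically for the AR(1) path matrix, where the lag structure makes $\B P_{\PL{joint}}$ nilpotent so the series terminates (a small illustrative sketch):

```r
rho = 0.8
P = matrix(0, 4, 4); P[cbind(2:4, 1:3)] = rho
I = diag(4)

# Accumulate I + P + P^2 + ... (terminates because P is nilpotent here)
total = I; Pl = P
for (l in 1:10) { total = total + Pl; Pl = Pl %*% P }

max(abs(total - solve(I - P)))       # essentially zero
```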
We can use this expression to calculate the matrix $\B D$ that represents the ongoing effect of initial conditions. It is constructed from a length-$J$ vector of estimated initial conditions $\B \delta_1$ for time $t=1$: we construct a $T \times J$ matrix $\B \Delta$ where the first row (corresponding to time $t=1$) is $\B \delta_1$ and all other elements are $0$. The ongoing effect of initial conditions can then be calculated as:
$$ \PL{vec}(\B D) = (\mathbf{I - P}_{\PL{joint}})^{-1} \PL{vec}(\B \Delta) $$

This is, in a sense, the total effect of $\B \delta_1$ in year $t=1$ on subsequent years. Calculating $\B D$ involves inverting $\mathbf{I - P}_{\PL{joint}}$, but this is computationally efficient using a sparse LU decomposition.
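A minimal sketch of this calculation using the Matrix package (the initial condition 0.5 is illustrative):

```r
library(Matrix)

# AR(1) example: J = 1 variable, T = 4 times, rho on the first subdiagonal
rho = 0.8
P_joint = sparseMatrix( i = 2:4, j = 1:3, x = rho, dims = c(4,4) )

# Delta is zero except for the initial condition in time t = 1
vecDelta = c(0.5, 0, 0, 0)

# Sparse LU solve rather than forming an explicit inverse
vecD = solve( Diagonal(4) - P_joint, vecDelta )
as.numeric(vecD)                     # 0.5, 0.4, 0.32, 0.256
```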
We have defined the joint precision for a GMRF based on a path matrix and a matrix of exogenous covariances. The exogenous (or conditional) variances are stationary for each variable over time, but some path matrices will result in a nonstationary marginal variance. To see this, consider a first-order autoregressive process:
dsem = " x -> x, 1, ar1, 0.8 x <-> x, 0, sd, 1 "
We can parse this DSEM and construct the precision using internal functions:
```r
# Load package
library(dsem)

# Call dsem without estimating parameters
out = dsem( tsdata = ts(data.frame( x = rep(1,10) )),
            sem = dsem,
            control = dsem_control( run_model = FALSE,
                                    quiet = TRUE ) )

# Extract covariance
Sigma1 = solve(as.matrix(out$obj$report()$Q_kk))
plot( x = 1:10, y = diag(Sigma1),
      xlab = "time", ylab = "Marginal variance", type = "l",
      ylim = c(0, max(diag(Sigma1))) )
```
where we can see that the diagonal of this covariance matrix is non-constant.
We therefore derive an alternative specification that preserves a stationary marginal variance by rescaling the exogenous (conditional) variance.
Specifically, we see that the marginal variance is:
$$ \PL{Var}(\PL{vec}(\B X)) = \PL{diag}(\B \Sigma) \\ \B \Sigma = \mathbf{L L}^T \\ \B L = (\mathbf{I - P}_{\PL{joint}})^{-1} \B G $$

Given the properties of the Hadamard (elementwise) product, this can be rewritten as:
$$ \PL{diag}(\B \Sigma) = \mathbf{(L \circ L) 1} $$ Now suppose we have a desired vector of length $TJ$ for the constant marginal variance $\B v$. We can solve for the exogenous covariance that would result in that marginal variance:
$$ \B u = \mathbf{(L \circ L)}^{-1} \B v $$
and we can then rescale the exogenous covariance:
$$ \PL{diag}(\B G^*) = \B u $$

and we then use this rescaled exogenous covariance when constructing the precision of the GMRF. We can see this again using our first-order autoregressive example:
```r
# Call dsem without estimating parameters
out = dsem( tsdata = ts(data.frame( x = rep(1,10) )),
            sem = dsem,
            control = dsem_control( run_model = FALSE,
                                    quiet = TRUE,
                                    constant_variance = "marginal" ) )

# Extract covariance
Sigma2 = solve(as.matrix(out$obj$report()$Q_kk))
plot( x = 1:10, y = diag(Sigma2),
      xlab = "time", ylab = "Marginal variance", type = "l",
      ylim = c(0, max(diag(Sigma1))) )
```
This shows that the corrected (nonstationary) exogenous variance results in a stationary marginal variance for the AR1 process. This correction can be done in two different ways that are identical when the exogenous covariance is diagonal (as it is in this simple example), but differ when some exogenous covariance is specified. However, we do not discuss this in detail here. Note that calculating this correction for a constant marginal variance requires the inverse of the squared values of the Leontief matrix (which is itself a matrix inverse). It is therefore computationally expensive for large models containing complicated dependencies.
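To make the rescaling concrete, here is a minimal base-R sketch for the AR(1) example (assuming a diagonal exogenous covariance, so that $\B u$ contains exogenous variances):

```r
# Leontief matrix for the AR(1) example
rho = 0.8; n_times = 10
P = matrix(0, n_times, n_times)
P[cbind(2:n_times, 1:(n_times-1))] = rho
L = solve(diag(n_times) - P)

# Solve for exogenous variances u that yield unit marginal variance
v = rep(1, n_times)
u = solve(L * L, v)                  # L * L is the elementwise square

# Check: marginal variances of the implied covariance are constant
Sigma = L %*% diag(u) %*% t(L)
diag(Sigma)
```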
We note that some DSEM specifications will result in a reduced-rank precision. This arises, for example, when specifying a dynamic factor analysis, where $J$ variables are explained by $F < J$ factors that each follow a random walk:
# dsem = " # Factor follows random walk with unit variance F <-> F, 0, NA, 1 F -> F, 1, NA, 1 # Loadings on two manifest variables F -> x, 0, b_x, 1 F -> y, 0, b_y, 1 # No residual variance for manifest variables x <-> x, 0, NA, 0 y <-> y, 0, NA, 0 " data = data.frame( x = rnorm(10), y = rnorm(10), F = rep(NA,10) ) # call dsem without estimating parameters out = dsem( tsdata = ts(data), sem = dsem, family = c("normal","normal","fixed"), control = dsem_control( run_model = FALSE, quiet = TRUE, gmrf_parameterization = "projection" ) )
We can extract the covariance and inspect the eigenvalues:
```r
# Extract precision components
library(Matrix)
IminusRho_kk = out$obj$report()$IminusRho_kk
G_kk = out$obj$report()$Gamma_kk
Q_kk = t(IminusRho_kk) %*% t(G_kk) %*% G_kk %*% IminusRho_kk

# Display eigenvalues
eigen(Q_kk)$values
```
where this shows that the precision has rank 10 while being a $30 \times 30$ matrix. We therefore cannot evaluate the probability density of the state matrix $\B X$ using this precision matrix (i.e., the log-determinant is not defined).
To address this circumstance, we can switch to using `gmrf_parameterization = "projection"`. This evaluates the probability density of a set of innovations $\B X^*$ that have unit variance:
$$ \PL{vec}(\B X)^* \sim \PL{GMRF}(\B 0, \B I) $$

and then projects these full-rank innovations to the reduced-rank states:
$$ \PL{vec}(\B X) = (\mathbf{I - P}_{\PL{joint}})^{-1} \B G_{\PL{joint}} \PL{vec}(\B X)^* $$
This parameterization allows us to fit DSEM using a rank-deficient structural model.
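As a hedged sketch, this projection can be applied by hand to the matrices reported above (reusing `IminusRho_kk` and `G_kk` from the eigenvalue example; the simulated innovations are illustrative):

```r
# Simulate unit-variance innovations and project to the reduced-rank states
set.seed(1)
x_star = rnorm( nrow(G_kk) )

# vec(X) = (I - P_joint)^{-1} G_joint vec(X*)
x = as.numeric( solve( IminusRho_kk, G_kk %*% x_star ) )
```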