fitSTVAR | R Documentation |
fitSTVAR
estimates a reduced form smooth transition VAR model in two phases
or three phases. In additional ML estimation, also penalized ML estimation is available.
fitSTVAR(
data,
p,
M,
weight_function = c("relative_dens", "logistic", "mlogit", "exponential", "threshold",
"exogenous"),
weightfun_pars = NULL,
cond_dist = c("Gaussian", "Student", "ind_Student", "ind_skewed_t"),
parametrization = c("intercept", "mean"),
AR_constraints = NULL,
mean_constraints = NULL,
weight_constraints = NULL,
estim_method,
penalized,
penalty_params = c(0.05, 0.2),
allow_unstab,
min_obs_coef = 3,
sparse_grid = FALSE,
h = 0.001,
nrounds,
ncores = 2,
maxit = 2000,
seeds = NULL,
print_res = TRUE,
use_parallel = TRUE,
calc_std_errors = TRUE,
...
)
data |
a matrix or class |
p |
a positive integer specifying the autoregressive order |
M |
a positive integer specifying the number of regimes |
weight_function |
What type of transition weights
See the vignette for more details about the weight functions. |
weightfun_pars |
|
cond_dist |
specifies the conditional distribution of the model as |
parametrization |
|
AR_constraints |
a size |
mean_constraints |
Restrict the mean parameters of some regimes to be identical? Provide a list of numeric vectors
such that each numeric vector contains the regimes that should share the common mean parameters. For instance, if
|
weight_constraints |
a list of two elements, |
estim_method |
either |
penalized |
should penalized log-likelihood function be used that penalizes the log-likelihood function when
the parameter values are close the boundary of the stability region or outside it? If |
penalty_params |
a numeric vector with two positive elements specifying the penalization parameters: the first element determined how far from the boundary of the stability region the penalization starts (a number between zero and one, smaller number starts penalization closer to the boundary) and the second element is a tuning parameter for the penalization (a positive real number, a higher value penalizes non-stability more). |
allow_unstab |
If |
min_obs_coef |
In the LS/NLS step of the three phase estimation, the smallest accepted number of observations
(times variables) from each regime relative to the number of parameters in the regime. For models with AR constraints,
the number of AR matrix parameters in each regimes is simplisticly assumed to be |
sparse_grid |
should the grid of weight function values in LS/NLS estimation be more sparse (speeding up the estimation)? |
h |
the strictly positive difference used in the finite difference approximation of the gradient used in numerical optimization. |
nrounds |
the number of estimation rounds that should be performed. The default is |
ncores |
the number CPU cores to be used in parallel computing. |
maxit |
the maximum number of iterations in the variable metric algorithm. |
seeds |
a length |
print_res |
should summaries of estimation results be printed? |
use_parallel |
employ parallel computing? If |
calc_std_errors |
Calculate approximate standard errors (based on standard asymptotics)? |
... |
additional settings passed to the function |
If you wish to estimate a structural model, estimate first the reduced form model and then use the
use the function fitSSTVAR
to create (and estimate if necessary) the structural model
based on the estimated reduced form model.
three-phase estimation. With estim_method="three-phase"
(not available for models with relative_dens
weight function), an extra phase is added to the beginning of the two-phase estimation procedure:
the autoregressive and weight function parameters are first estimated by the method of (penalized) least squares. Then,
the rest of the parameters are estimated by (penalized) ML with the genetic algorithm conditionally on the LS estimates.
Finally, all the parameters are estimated by (penalized) ML by initializing a gradient based variable metric algorithm
from initial estimates obtained from the first two phases. This allows to use substantially decrease the required
number of estimation rounds, and thereby typically speeds up the estimation substantially. On the other hand, the three-phase
procedure tends to produce estimates close to the initial (penalized) LS estimates, while the two-phase procedure explores
the parameter space more thoroughly (when a large enough number of estimation rounds is ran).
Penalized estimation. The penalized estimation (penalized=TRUE
) maximizes the penalized log-likelihood function
in which a penalty term is added. The penalty term becomes nonzero when the parameter values are close to the boundary of the
stability region or outside it, it increases in the modulus of the eigenvalues of the companion form AR matrices of the regimes.
With allow_unstab=TRUE
, allowing for unstable estimates, it allows the estimation algorithms to explore the parameter space
outside the stability region, but with high enough penalization, the optimization algorithm eventually converges back to the
stability region. By default, penalized estimation (with unstable estimates allow) is used for estim_method="three-phase"
and not used for estim_method="two-phase"
.
The rest concerns both two-phase and three-phase procedures.\
Because of complexity and high multimodality of the log-likelihood function, it is not certain
that the estimation algorithm will end up in the global maximum point. When estim_method="two-phase"
,
it is expected that many of the estimation rounds will end up in some local maximum or a saddle point instead.
Therefore, a (sometimes very large) number of estimation rounds is required for reliable results
(when estim_method="three-phase"
substantially smaller number should be sufficient). Due to
identification problems and high complexity of the surface of the log-likelihood function, the estimation may
fail especially in the cases where the number of regimes is chosen too large.
The estimation process is computationally heavy and it might take considerably long time for large models to
estimate, particularly if estim_method="two-phase"
. Note that reliable estimation of model with
cond_dist == "ind_Student"
or "ind_skewed_t"
is more difficult than with Gaussian or Student's t
models due to the increased complexity.
If the iteration limit maxit
in the variable metric algorithm is reached, one can continue the estimation by
iterating more with the function iterate_more
. Alternatively, one may use the found estimates as starting values
for the genetic algorithm and employ another round of estimation (see ??GAfit
how to set up an initial population
with the dot parameters).
If the estimation algorithm performs poorly, it usually helps to scale the individual series so that they vary roughly in the same scale. This makes it is easier to draw reasonable AR coefficients and (with some weight functions) weight parameter values in the genetic algorithm. Even if the estimation algorithm somewhat works, it should be preferred to scale the data so that most of the AR coefficients will not be very large, as the genetic algorithm works better with relatively small AR coefficients. If needed, another package can be used to fit linear VARs to the series to see which scaling of the series results in relatively small AR coefficients. You should avoid very small (or very high) variance in the data as well, so that the eigenvalues of the covariance matrices are in a reasonable range.
weight_constraints: If you are using weight constraints other than restricting some of the weight parameters to known constants, make sure the constraints are sensible. Otherwise, the estimation may fail due to the estimation algorithm not being able to generate reasonable random guesses for the values of the constrained weight parameters.
Filtering inappropriate estimates: fitSTVAR
automatically filters through estimates
that it deems "inappropriate". That is, estimates that are not likely solutions of interest.
Specifically, solutions that incorporate a near-singular error term covariance matrix (any eigenvalue less than 0.002
),
any modulus of the eigenvalues of the companion form AR matrices larger than $0.9985$ (indicating the necessary condition for
stationarity is close to break), or transition weights such that they are close to zero for almost all t
for at least
one regime. You can also always find the solutions of interest yourself by using the function alt_stvar
as well since
results from all estimation rounds are saved).
Returns an S3 object of class 'stvar'
defining a smooth transition VAR model. The returned list
contains the following components (some of which may be NULL
depending on the use case):
data |
The input time series data. |
model |
A list describing the model structure. |
params |
The parameters of the model. |
std_errors |
Approximate standard errors of the parameters, if calculated. |
transition_weights |
The transition weights of the model. |
regime_cmeans |
Conditional means of the regimes, if data is provided. |
total_cmeans |
Total conditional means of the model, if data is provided. |
total_ccovs |
Total conditional covariances of the model, if data is provided. |
uncond_moments |
A list of unconditional moments including regime autocovariances, variances, and means. |
residuals_raw |
Raw residuals, if data is provided. |
residuals_std |
Standardized residuals, if data is provided. |
structural_shocks |
Recovered structural shocks, if applicable. |
loglik |
Log-likelihood of the model, if data is provided. |
IC |
The values of the information criteria (AIC, HQIC, BIC) for the model, if data is provided. |
all_estimates |
The parameter estimates from all estimation rounds, if applicable. |
all_logliks |
The log-likelihood of the estimates from all estimation rounds, if applicable. |
which_converged |
Indicators of which estimation rounds converged, if applicable. |
which_round |
Indicators of which round of optimization each estimate belongs to, if applicable. |
seeds |
The seeds used in the estimation in |
LS_estimates |
The least squares estimates of the parameters in the form
|
The following S3 methods are supported for class 'stvar'
: logLik
, residuals
, print
, summary
,
predict
, simulate
, and plot
.
Anderson H., Vahid F. 1998. Testing multiple equation systems for common nonlinear components. Journal of Econometrics, 84:1, 1-36.
Hubrich K., Teräsvirta. T. 2013. Thresholds and Smooth Transitions in Vector Autoregressive Models. CREATES Research Paper 2013-18, Aarhus University.
Lanne M., Virolainen S. 2025. A Gaussian smooth transition vector autoregressive model: An application to the macroeconomic effects of severe weather shocks. Unpublished working paper, available as arXiv:2403.14216.
Kheifets I.L., Saikkonen P.J. 2020. Stationarity and ergodicity of Vector STAR models. Econometric Reviews, 39:4, 407-414.
Tsay R. 1998. Testing and Modeling Multivariate Threshold Models. Journal of the American Statistical Association, 93:443, 1188-1202.
Virolainen S. 2025. Identification by non-Gaussianity in structural threshold and smooth transition vector autoregressive models. Unpublished working paper, available as arXiv:2404.19707.
fitSSTVAR
, STVAR
, GAfit
, iterate_more
, filter_estimates
## These are long running examples. Running all the below examples will take
## approximately three minutes.
# When estimating the models in empirical applications, typically a large number
# of estimation rounds (set by the argument 'nrounds') should be used. These examples
# use only a small number of rounds to make the running time of the examples reasonable.
# The below examples make use of the two-variate dataset 'gdpdef' containing
# the the quarterly U.S. GDP and GDP deflator from 1947Q1 to 2019Q4.
# Estimate Gaussian STVAR model of autoregressive order p=3 and two regimes (M=2),
# with the weighted relative stationary densities of the regimes as the transition
# weight function. The estimation is performed with 2 rounds and 2 CPU cores, with
# the random number generator seeds set for reproducibility (two-phase estimation):
fit32 <- fitSTVAR(gdpdef, p=3, M=2, weight_function="relative_dens", cond_dist="Gaussian",
nrounds=2, ncores=2, seeds=1:2)
# Examine the results:
fit32 # Printout of the estimates
summary(fit32) # A more detailed summary printout
plot(fit32) # Plot the fitted transition weights
get_foc(fit32) # Gradient of the log-likelihood function about the estimate
get_soc(fit32) # Eigenvalues of the Hessian of the log-lik. fn. about the estimate
profile_logliks(fit32) # Profile log-likelihood functions about the estimate
# Estimate a two-regime Student's t STVAR p=5 model with logistic transition weights
# and the first lag of the second variable as the switching variable, only two
# estimation rounds using two CPU cores (three-phase estimation):
fitlogistict32 <- fitSTVAR(gdpdef, p=3, M=2, weight_function="logistic", weightfun_pars=c(2, 1),
cond_dist="Student", nrounds=2, ncores=2, seeds=1:2)
summary(fitlogistict32) # Summary printout of the estimates
# Estimate a two-regime threshold VAR p=3 model with independent skewed t shocks
# using the three-phase estimation procedure.
# The first lag of the the second variable is specified as the switching variable,
# and the threshold parameter constrained to the fixed value 1 (three-phase estimation):
fitthres32wit <- fitSTVAR(gdpdef, p=3, M=2, weight_function="threshold", weightfun_pars=c(2, 1),
cond_dist="ind_skewed_t", weight_constraints=list(R=0, r=1), nrounds=2, ncores=2, seeds=1:2)
plot(fitthres32wit) # Plot the fitted transition weights
# Estimate a two-regime STVAR p=1 model with exogenous transition weights defined as the indicator
# of NBER based U.S. recessions (source: St. Louis Fed database). Moreover, restrict the AR matrices
# to be identical across the regimes (i.e., allowing for time-variation in the intercepts and the
# covariance matrix only), (three-phase estimation):
# Step 1: Define transition weights of Regime 1 as the indicator of NBER based U.S. recessions
# (the start date of weights is start of data + p, since the first p values are used as the initial
# values):
tw1 <- c(0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
# Step 2: Define the transition weights of Regime 2 as one minus the weights of Regime 1, and
# combine the weights to matrix of transition weights:
twmat <- cbind(tw1, 1 - tw1)
# Step 3: Create the appropriate constraint matrix:
C_122 <- rbind(diag(1*2^2), diag(1*2^2))
# Step 4: Estimate the model by specifying the weights in the argument 'weightfun_pars'
# and the constraint matrix in the argument 'AR_constraints':
fitexo12cit <- fitSTVAR(gdpdef, p=1, M=2, weight_function="exogenous", weightfun_pars=twmat,
cond_dist="ind_Student", AR_constraints=C_122, nrounds=2, ncores=2, seeds=1:2)
plot(fitexo12cit) # Plot the transition weights
summary(fitexo12cit) # Summary printout of the estimates
# Estimate a two-regime Gaussian STVAR p=1 model with the weighted relative stationary densities
# of the regimes as the transition weight function; constrain AR matrices to be identical
# across the regimes and also constrain the off-diagonal elements of the AR matrices to be zero.
# Moreover, constrain the unconditional means of both regimes to be equal (two-phase estimation):
mat0 <- matrix(c(1, rep(0, 10), 1, rep(0, 8), 1, rep(0, 10), 1), nrow=2*2^2, byrow=FALSE)
C_222 <- rbind(mat0, mat0) # The constraint matrix
fit22cm <- fitSTVAR(gdpdef, p=2, M=2, weight_function="relative_dens", cond_dist="Gaussian",
parametrization="mean", AR_constraints=C_222, mean_constraints=list(1:2), nrounds=2, seeds=1:2)
fit22cm # Print the estimates
# Estimate a two-regime Student's t STVAR p=3 model with logistic transition weights and the
# first lag of the second variable as the switching variable. Constraint the location parameter
# to the fixed value 1 and leave the scale parameter unconstrained (three-phase estimation):
fitlogistic32w <- fitSTVAR(gdpdef, p=3, M=2, weight_function="logistic", weightfun_pars=c(2, 1),
cond_dist="Student", weight_constraints=list(R=matrix(c(0, 1), nrow=2), r=c(1, 0)), nrounds=2,
seeds=1:2)
plot(fitlogistic32w) # Plot the fitted transition weights
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.