fitGSMVAR: Two-phase maximum likelihood estimation of a GMVAR, StMVAR,...

View source: R/MAINest.R

fitGSMVARR Documentation

Two-phase maximum likelihood estimation of a GMVAR, StMVAR, or G-StMVAR model


fitGSMVAR estimates a GMVAR, StMVAR, or G-StMVAR model model in two phases: in the first phase it uses a genetic algorithm to find starting values for a gradient based variable metric algorithm, which it then uses to finalize the estimation in the second phase. Parallel computing is utilized to perform multiple rounds of estimations in parallel.


  model = c("GMVAR", "StMVAR", "G-StMVAR"),
  conditional = TRUE,
  parametrization = c("intercept", "mean"),
  constraints = NULL,
  same_means = NULL,
  weight_constraints = NULL,
  structural_pars = NULL,
  ncalls = (M + 1)^5,
  ncores = 2,
  maxit = 1000,
  seeds = NULL,
  print_res = TRUE,
  use_parallel = TRUE,
  filter_estimates = TRUE,
  calc_std_errors = TRUE,



a matrix or class 'ts' object with d>1 columns. Each column is taken to represent a univariate time series. NA values are not supported.


a positive integer specifying the autoregressive order of the model.

For GMVAR and StMVAR models:

a positive integer specifying the number of mixture components.

For G-StMVAR models:

a size (2x1) integer vector specifying the number of GMVAR type components M1 in the first element and StMVAR type components M2 in the second element. The total number of mixture components is M=M1+M2.


is "GMVAR", "StMVAR", or "G-StMVAR" model considered? In the G-StMVAR model, the first M1 components are GMVAR type and the rest M2 components are StMVAR type.


a logical argument specifying whether the conditional or exact log-likelihood function


"intercept" or "mean" determining whether the model is parametrized with intercept parameters \phi_{m,0} or regime means \mu_{m}, m=1,...,M.


a size (Mpd^2 x q) constraint matrix C specifying general linear constraints to the autoregressive parameters. We consider constraints of form (\phi_{1},...,\phi_{M}) = C \psi, where \phi_{m} = (vec(A_{m,1}),...,vec(A_{m,p}) (pd^2 x 1), m=1,...,M, contains the coefficient matrices and \psi (q x 1) contains the related parameters. For example, to restrict the AR-parameters to be the same for all regimes, set C= [I:...:I]' (Mpd^2 x pd^2) where I = diag(p*d^2). Ignore (or set to NULL) if linear constraints should not be employed.


Restrict the mean parameters of some regimes to be the same? Provide a list of numeric vectors such that each numeric vector contains the regimes that should share the common mean parameters. For instance, if M=3, the argument list(1, 2:3) restricts the mean parameters of the second and third regime to be the same but the first regime has freely estimated (unconditional) mean. Ignore or set to NULL if mean parameters should not be restricted to be the same among any regimes. This constraint is available only for mean parametrized models; that is, when parametrization="mean".


a numeric vector of length M-1 specifying fixed parameter values for the mixing weight parameters \alpha_m, \ m=1,...,M-1. Each element should be strictly between zero and one, and the sum of all the elements should be strictly less than one.


If NULL a reduced form model is considered. Reduced models can be used directly as recursively identified structural models. For a structural model identified by conditional heteroskedasticity, should be a list containing at least the first one of the following elements:

  • W - a (dxd) matrix with its entries imposing constraints on W: NA indicating that the element is unconstrained, a positive value indicating strict positive sign constraint, a negative value indicating strict negative sign constraint, and zero indicating that the element is constrained to zero.

  • C_lambda - a (d(M-1) x r) constraint matrix that satisfies (\lambda_{2},..., \lambda_{M}) = C_{\lambda} \gamma where \gamma is the new (r x 1) parameter subject to which the model is estimated (similarly to AR parameter constraints). The entries of C_lambda must be either positive or zero. Ignore (or set to NULL) if the eigenvalues \lambda_{mi} should not be constrained.

  • fixed_lambdas - a length d(M-1) numeric vector (\lambda_{2},..., \lambda_{M}) with elements strictly larger than zero specifying the fixed parameter values for the parameters \lambda_{mi} should be constrained to. This constraint is alternative C_lambda. Ignore (or set to NULL) if the eigenvalues \lambda_{mi} should not be constrained.

See Virolainen (forthcoming) for the conditions required to identify the shocks and for the B-matrix as well (it is W times a time-varying diagonal matrix with positive diagonal entries).


the number of estimation rounds that should be performed.


the number CPU cores to be used in parallel computing.


the maximum number of iterations in the variable metric algorithm.


a length ncalls vector containing the random number generator seed for each call to the genetic algorithm, or NULL for not initializing the seed. Exists for creating reproducible results.


should summaries of estimation results be printed?


employ parallel computing? If FALSE, no progression bar etc will be generated.


should the likely inappropriate estimates be filtered? See details.


calculate approximate standard errors for the estimates?


additional settings passed to the function GAfit employing the genetic algorithm.


If you wish to estimate a structural model without overidentifying constraints that is identified statistically, specify your W matrix is structural_pars to be such that it contains the same sign constraints in a single row (e.g. a row of ones) and leave the other elements as NA. In this way, the genetic algorithm works the best. The ordering and signs of the columns of the W matrix can be changed afterwards with the functions reorder_W_columns and swap_W_signs.

Because of complexity and high multimodality of the log-likelihood function, it's not certain that the estimation algorithms will end up in the global maximum point. It's expected that most of the estimation rounds will end up in some local maximum or saddle point instead. Therefore, a (sometimes large) number of estimation rounds is required for reliable results. Because of the nature of the model, the estimation may fail especially in the cases where the number of mixture components is chosen too large. With two regimes and couple hundred observations in a two-dimensional time series, 50 rounds is usually enough. Several hundred estimation rounds often suffices for reliably fitting two-regimes models to 3 or 4 dimensional time series. With more than two regimes and more than couple hundred observations, thousands of estimation rounds (or more) are often required to obtain reliable results.

The estimation process is computationally heavy and it might take considerably long time for large models with large number of observations. If the iteration limit maxit in the variable metric algorithm is reached, one can continue the estimation by iterating more with the function iterate_more. Alternatively, one may use the found estimates as starting values for the genetic algorithm and and employ another round of estimation (see ?GAfit how to set up an initial population with the dot parameters).

If the estimation algorithm fails to create an initial population for the genetic algorithm, it usually helps to scale the individual series so that the AR coefficients (of a VAR model) will be relative small, preferably less than one. Even if one is able to create an initial population, it should be preferred to scale the series so that most of the AR coefficients will not be very large, as the estimation algorithm works better with relatively small AR coefficients. If needed, another package can be used to fit linear VARs to the series to see which scaling of the series results in relatively small AR coefficients.

The code of the genetic algorithm is mostly based on the description by Dorsey and Mayer (1995) but it includes some extra features that were found useful for this particular estimation problem. For instance, the genetic algorithm uses a slightly modified version of the individually adaptive crossover and mutation rates described by Patnaik and Srinivas (1994) and employs (50%) fitness inheritance discussed by Smith, Dike and Stegmann (1995).

The gradient based variable metric algorithm used in the second phase is implemented with function optim from the package stats.

Note that the structural models are even more difficult to estimate than the reduced form models due to the different parametrization of the covariance matrices, so larger number of estimation rounds should be considered. Also, be aware that if the lambda parameters are constrained in any other way than by restricting some of them to be identical, the parameter "lambda_scale" of the genetic algorithm (see ?GAfit) needs to be carefully adjusted accordingly. When estimating a structural model that imposes overidentifiying constraints to a time series with d>3, it is highly recommended to create an initial population based on the estimates of a statistically identified model (when M=2). This is because currently obtaining the ML estimate reliably to such a structural model seems difficult in many application.

Finally, the function fails to calculate approximate standard errors and the parameter estimates are near the border of the parameter space, it might help to use smaller numerical tolerance for the stationarity and positive definiteness conditions. The numerical tolerance of an existing model can be changed with the function update_numtols.

Filtering inappropriate estimates: If filter_estimates == TRUE, the function will automatically filter out estimates that it deems "inappropriate". That is, estimates that are not likely solutions of interest. Specifically, solutions that incorporate a near-singular error term covariance matrix (any eigenvalue less than 0.002), mixing weights such that they are close to zero for almost all t for at least one regime, or mixing weight parameter estimate close to zero (or one). It also filters out estimates with any modulus "bold A" eigenvalues larger than 0.9985, as the solution is near the boundary of the stationarity region and likely not a local maximum. You can also set filter_estimates=FALSE and find the solutions of interest yourself by using the function alt_gsmvar.


Returns an object of class 'gsmvar' defining the estimated (reduced form or structural) GMVAR, StMVAR, or G-StMVAR model. Multivariate quantile residuals (Kalliovirta and Saikkonen 2010) are also computed and included in the returned object. In addition, the returned object contains the estimates and log-likelihood values from all the estimation rounds performed. The estimated parameter vector can be obtained at gsmvar$params (and corresponding approximate standard errors at gsmvar$std_errors). See ?GSMVAR for the form of the parameter vector, if needed.

Remark that the first autocovariance/correlation matrix in $uncond_moments is for the lag zero, the second one for the lag one, etc.

S3 methods

The following S3 methods are supported for class 'gsmvar': logLik, residuals, print, summary, predict, simulate, and plot.


  • Dorsey R. E. and Mayer W. J. 1995. Genetic algorithms for estimation problems with multiple optima, nondifferentiability, and other irregular features. Journal of Business & Economic Statistics, 13, 53-66.

  • Kalliovirta L., Meitz M. and Saikkonen P. 2016. Gaussian mixture vector autoregression. Journal of Econometrics, 192, 485-498.

  • Patnaik L.M. and Srinivas M. 1994. Adaptive Probabilities of Crossover and Mutation in Genetic Algorithms. Transactions on Systems, Man and Cybernetics 24, 656-667.

  • Smith R.E., Dike B.A., Stegmann S.A. 1995. Fitness inheritance in genetic algorithms. Proceedings of the 1995 ACM Symposium on Applied Computing, 345-350.

  • Virolainen S. (forthcoming). A statistically identified structural vector autoregression with endogenously switching volatility regime. Journal of Business & Economic Statistics.

  • Virolainen S. 2022. Gaussian and Student's t mixture vector autoregressive model with application to the asymmetric effects of monetary policy shocks in the Euro area. Unpublished working paper, available as arXiv:2109.13648.

See Also

GSMVAR, iterate_more, stmvar_to_gstmvar, predict.gsmvar, profile_logliks, simulate.gsmvar, quantile_residual_tests, print_std_errors, swap_parametrization, get_gradient, GIRF, GFEVD, LR_test, Wald_test, gsmvar_to_sgsmvar, stmvar_to_gstmvar, reorder_W_columns, swap_W_signs, cond_moment_plot, update_numtols


## These are long running examples that use parallel computing!
# Running all the below examples will take approximately 3-4 minutes.

# GMVAR(1,2) model: 10 estimation rounds with seeds set
# for reproducibility
fit12 <- fitGSMVAR(gdpdef, p=1, M=2, ncalls=10, seeds=1:10)

# The rest of the examples only use a single estimation round with a given
# seed that produces the MLE to reduce running time of the examples. When
# estimating models for empirical applications, a large number of estimation
# rounds (ncalls = a large number) should be performed to ensure reliability
# of the estimates (see the section details).

# StMVAR(2, 2) model
fit22t <- fitGSMVAR(gdpdef, p=2, M=2, model="StMVAR", ncalls=1, seeds=1)
fit22t # Overly large degrees of freedom estimate in the 2nd regime!
fit22gs <- stmvar_to_gstmvar(fit22t) # So switch it to GMVAR type!
fit22gs # This is the appropriate G-StMVAR model based on the above StMVAR model.
fit22gss <- gsmvar_to_sgsmvar(fit22gs) # Switch to structural model
fit22gss # This is the implied statistically identified structural model.

# Structural GMVAR(1,2) model identified with sign
# constraints.
W_122 <- matrix(c(1, 1, -1, 1), nrow=2)
fit12s <- fitGSMVAR(gdpdef, p=1, M=2, structural_pars=list(W=W_122),
  ncalls=1, seeds=1)
# A statistically identified structural model can also be obtained with
# gsmvar_to_sgsmvar(fit12)

# GMVAR(2,2) model with autoregressive parameters restricted
# to be the same for both regimes
C_mat <- rbind(diag(2*2^2), diag(2*2^2))
fit22c <- fitGSMVAR(gdpdef, p=2, M=2, constraints=C_mat, ncalls=1, seeds=1)

# G-StMVAR(2, 1, 1) model with autoregressive parameters and unconditional means restricted
# to be the same for both regimes:
fit22gscm <- fitGSMVAR(gdpdef, p=2, M=c(1, 1), model="G-StMVAR", constraints=C_mat,
   parametrization="mean", same_means=list(1:2), ncalls=1, seeds=1)

# GMVAR(2,2) model with autoregressive parameters restricted
# to be the same for both regimes and non-diagonal elements
# the coefficient matrices constrained to zero.
tmp <- matrix(c(1, rep(0, 10), 1, rep(0, 8), 1, rep(0, 10), 1),
 nrow=2*2^2, byrow=FALSE)
C_mat2 <- rbind(tmp, tmp)
fit22c2 <- fitGSMVAR(gdpdef, p=2, M=2, constraints=C_mat2, ncalls=1,

gmvarkit documentation built on May 29, 2024, 10:46 a.m.