mvProbit: Estimation of Multivariate Probit Models
In mvProbit: Multivariate Probit Models

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/mvProbit.R

Estimating multivariate probit models by the maximum likelihood method.

WARNING: this function is experimental and extremely (perhaps even unusably) slow!

mvProbit( formula, data, start = NULL, startSigma = NULL, 
   method = "BHHH", finalHessian = "BHHH", 
   algorithm = "GHK", nGHK = 1000, 
   intGrad = TRUE, oneSidedGrad = FALSE, eps = 1e-6, 
   random.seed = 123, ... )

## S3 method for class 'mvProbit'
print( x, digits = 4, ... )

`formula`	a `"formula"`: a symbolic description of the model (currently, all binary outcome variables must have the same regressors).
`data`	a `data.frame` containing the data.
`start`	an optional numeric vector specifying the starting values for the model coefficients; if argument `startSigma` is not specified, this vector can also include the correlation coefficients; the order of elements is explained in the section “details”; if this argument is not specified, coefficients estimated by univariate probit models are used as starting values for the model coefficients.
`startSigma`	optional starting values for the covariance/correlation matrix of the residuals (must be symmetric and have ones on its diagonal); if this argument is not specified and the starting values for the correlation coefficients are not included in argument `start`, the correlation matrix of the ‘response’ residuals, i.e. y - pnorm( X' beta ), is used as starting values for sigma.
`method`	maximisation method / algorithm (see `maxLik`).
`finalHessian`	Calculation of the final Hessian: either `FALSE` (no calculation of Hessian), `TRUE` (finite-distance calculation of Hessian), or `"BHHH"` (calculation based on information equality approach and finite-distance gradients, the default).
`algorithm`	algorithm for computing integrals of the multivariate normal distribution, either function `GenzBretz()`, `Miwa()`, or `TVPACK()` (see documentation of `pmvnorm`) or character string `"GHK"` (see documentation of `ghkvec`).
`nGHK`	numeric value specifying the number of simulation draws of the GHK algorithm for computing integrals of the multivariate normal distribution.
`intGrad`	logical. If `TRUE`, the computation of the gradients with respect to the estimated parameters is done internally in function `mvProbitLogLik` when it computes the log-likelihood values. If the optimization method requires gradients and this argument is `FALSE`, `maxLik` computes the gradients by `numericGradient`, which is usually slower than the calculation in function `mvProbitLogLik`. This argument should be set to `FALSE` if an optimisation algorithm is used that is not based on gradients.
`oneSidedGrad`	logical. If this argument and argument `intGrad` are both `TRUE`, the gradients of the log-likelihood function with respect to the estimated parameters are obtained by one-sided numeric finit-difference differentiation, which is faster but less precise than two-sided numeric finit-difference differentiation.
`eps`	numeric. The step size for the one-sided numeric finit-distance differentiation. Unfortunately, it is currently not possible to set the step size for the two-sided numeric finit-distance differentiation.
`random.seed`	an integer used to seed R's random number generator; this is to ensure replicability when computing (cumulative) probabilities of the multivariate normal distribution which is required to calculate the log likelihood values; `set.seed( random.seed )` is called each time before a (cumulative) probability of the multivariate normal distribution is computed; defaults to 123.
`x`	object of class `mvProbit` (returned by `mvProbit`).
`digits`	positive integer specifiying the minimum number of significant digits to be printed (see `print.default`).
`...`	additional arguments to `mvProbit` are passed to `maxLik` and `pmvnorm`; additional arguments to `print.mvProbit` are currently ignored.

It is possible to specify starting values (a) both for the model coefficients and the correlation coefficients (using argument start alone or arguments start and startSigma together), (b) only for the model coefficients (using argument start alone), or (c) only for the correlation coefficients (using argument startSigma alone).

If the model has n dependent variables (equations) and k explanatory variables in each equation, the order of the starting values in argument start must be as follows: b_{1,1}, ..., b_{1,k}, b_{2,1}, ..., b_{2,k}, ..., b_{n,1}, ..., b_{n,k}, where b_{i,j} is the coefficient of the jth explanatory variable in the ith equation. If argument startSigma is not specified, argument start can additionally include following elements: R_{1,2}, R_{1,3}, R_{1,4}, ..., R_{1,n}, R_{2,3}, R_{2,4}, ..., R_{2,n}, ..., R_{n-1,n}, where R_{i,j} is the correlation coefficient corresponding to the ith and jth equation.

The ‘state’ (or ‘seed’) of R's random number generator is saved at the beginning of the mvProbit function and restored at the end of this function so that this function does not affect the generation of random numbers outside this function although the random seed is set to argument random.seed and the calculation of the (cumulative) multivariate normal distribution uses random numbers.

mvProbit returns an object of class "mvProbit" inheriting from class "maxLik". The returned object contains the same components as objects returned by maxLik and additionally the following components:

`call`	the matched call.
`start`	the vector of starting values.
`nDep`	the number of dependent variables.
`nReg`	the number of explanatory variables (regressors).
`nObs`	the number of observations.
`dummyVars`	vector of character strings indicating the names of explanatory variables that contain only zeros and ones or only `TRUE` and `FALSE`. It is `NULL`, if no explanatory variable is indentified as a dummy variable.

Arne Henningsen

Greene, W.H. (1996): Marginal Effects in the Bivariate Probit Model, NYU Working Paper No. EC-96-11. Available at http://ssrn.com/abstract=1293106.

mvProbitLogLik, mvProbitMargEff, probit, glm

## generate a simulated data set
set.seed( 123 )
# number of observations
nObs <- 50

# generate explanatory variables
xMat <- cbind( 
   const = rep( 1, nObs ),
   x1 = as.numeric( rnorm( nObs ) > 0 ),
   x2 = rnorm( nObs ) )

# model coefficients
beta <- cbind( c(  0.8,  1.2, -0.8 ),
               c( -0.6,  1.0, -1.6 ),
               c(  0.5, -0.6,  1.2 ) )

# covariance matrix of error terms
library( miscTools )
sigma <- symMatrix( c( 1, 0.2, 0.4, 1, -0.1, 1 ) )

# generate dependent variables
yMatLin <- xMat %*% beta 
yMat <- ( yMatLin + rmvnorm( nObs, sigma = sigma ) ) > 0
colnames( yMat ) <- paste( "y", 1:3, sep = "" )

# estimation (BHHH optimizer and GHK algorithm)
estResult <- mvProbit( cbind( y1, y2, y3 ) ~ x1 + x2,
   data = as.data.frame( cbind( xMat, yMat ) ), iterlim = 1, nGHK = 50 )
summary( estResult )

# same estimation with user-defined starting values
estResultStart <- mvProbit( cbind( y1, y2, y3 ) ~ x1 + x2,
   start = c( beta ), startSigma = sigma, 
   data = as.data.frame( cbind( xMat, yMat ) ), iterlim = 1, nGHK = 50 )
summary( estResultStart )