knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE, widtht = 65) options(width = 65)
The multinomial probit is obtained with the same modeling that we used while presenting the random utility model. The utility of an alternative is still the sum of two components : $U_j = V_j + \epsilon_j$.
but the joint distribution of the error terms is now a multivariate normal with mean 0 and with a matrix of covariance denoted $\Omega$^[see [@HAUS:WISE:78] and [@DAGA:79]].
Alternative $l$ is chosen if : $$ \left{ \begin{array}{rcl} U_1-U_l&=&(V_1-V_l)+(\epsilon_1-\epsilon_l)<0\ U_2-U_l&=&(V_2-V_l)+(\epsilon_2-\epsilon_l)<0\ & \vdots & \ U_J-U_l&=&(V_J-V_l)+(\epsilon_J-\epsilon_l)<0\ \end{array} \right. $$
wich implies, denoting $V^l_j=V_j-V_l$ :
$$ \left{ \begin{array}{rclrcl} \epsilon^l_1 &=& (\epsilon_1-\epsilon_l) &<& - V^l_1\ \epsilon^l_2 &=& (\epsilon_2-\epsilon_l) &<& - V^l_2\ &\vdots & & \vdots & \ \epsilon^l_J &=& (\epsilon_J-\epsilon_l) &<& - V^l_J\ \end{array} \right. $$
The initial vector of errors $\epsilon$ are transformed using the following transformation :
$$\epsilon^l = M^l \epsilon$$
where the transformation matrix $M^l$ is a $(J-1) \times J$ matrix obtained by inserting in an identity matrix a $l^{\mbox{th}}$ column of $-1$. For example, if $J = 4$ and $l = 3$ :
$$ M^3 = \left( \begin{array}{cccc} 1 & 0 & -1 & 0 \ 0 & 1 & -1 & 0 \ 0 & 0 & -1 & 1 \ \end{array} \right) $$
The covariance matrix of the error differences is obtained using the following matrix :
$$ \mbox{V}\left(\epsilon^l\right)=\mbox{V}\left(M^l\epsilon\right) = M^l\mbox{V}\left(\epsilon\right){M^l}^{\top} = M^l\Omega{M^l}^{\top} $$
The probability of choosing $l$ is then :
\begin{equation} P_l =\mbox{P}(\epsilon^l_1<-V_1^l \;\&\; \epsilon^l_2<-V_2^l \;\&\; ... \; \epsilon^l_J<-V_J^l) \end{equation}
with the hypothesis of distribution, this writes :
\begin{equation} P_l = \int_{-\infty}^{-V_1^l}\int_{-\infty}^{-V_2^l}...\int_{-\infty}^{-V_J^l}\phi(\epsilon^l) d\epsilon^l_1 d\epsilon^l_2... d^l_J \end{equation}
with :
\begin{equation} \phi\left(\epsilon^l\right)=\frac{1}{(2\pi)^{(J-1)/2}\mid\Omega^l\mid^{1/2}} e^{-\frac{1}{2}\epsilon^l{\Omega^l}^{-1}\epsilon^l} \end{equation}
Two problems arise with this model :
The meaning-full parameters are those of the covariance matrix of the error $\Omega$. For example, with $J = 3$ :
$$ \Omega = \left( \begin{array}{ccc} \sigma_{11} & \sigma_{12} & \sigma_{13} \ \sigma_{21} & \sigma_{22} & \sigma_{23} \ \sigma_{31} & \sigma_{32} & \sigma_{33} \ \end{array} \right) $$
$$ \Omega^1 = M^1 \Omega {M^1}^{\top}= \left( \begin{array}{cc} \sigma_{11}+\sigma_{22}-2\sigma_{12} & \sigma_{11} + \sigma_{23} - \sigma_{12} -\sigma_{13} \ \sigma_{11}+\sigma_{23}- \sigma_{12} - \sigma_{13} & \sigma_{11} + \sigma_{33} - 2 \sigma_{13} \ \end{array} \right) $$
The overall scale of utility being unidentified, one has to impose the value of one of the variance, for example the first one is fixed to 1. We then have :
$$ \Omega^1 = \left( \begin{array}{cc} 1 & \frac{\sigma_{11}+ \sigma_{23} - \sigma_{12} -\sigma_{13}}{\sigma_{11}+\sigma_{22}-2\sigma_{12}} \ \frac{\sigma_{11}+\sigma_{23}- \sigma_{12} - \sigma_{13}}{\sigma_{11}+\sigma_{22}-2\sigma_{12}} & \frac{\sigma_{11} + \sigma_{33} - 2 \sigma_{13}}{\sigma_{11}+\sigma_{22}-2\sigma_{12}} \ \end{array} \right) $$
Therefore, out the 6 structural parameters of the covariance matrix, only 3 can be identified. Moreover, it's almost impossible to interpret these parameters.
More generally, with $J$ alternatives, the number of the parameters of the covariance matrix is $(J+1)\times J/2$ and the number of identified parameters is $J\times(J-1)/2-1$.
Let $L^l$ be the Choleski decomposition of the covariance matrix of the error differences :
$$ \Omega^l=L^l {L^l}^{\top} $$
This matrix is a lower triangular matrix of dimension $(J-1)$ :
$$ L^l= \left( \begin{array}{ccccc} l_{11} & 0 & 0 &... & 0 \ l_{21} & l_{22} & 0 & ... & 0 \ l_{31} & l_{32} & l_{33} & ... & 0 \ \vdots & \vdots & \vdots & \ddots & \vdots \ l_{(J-1)1} & l_{(J-1)2} & l_{(J-1)3} & ... & l_{(J-1)(J-1)} \ \end{array} \right) $$
Let $\eta$ be a vector of standard normal deviates :
$$ \eta \sim N(0, I) $$
Therefore, we have :
$$ \mbox{V}\left(L^l\eta\right)=L^lV(\eta){L^l}^{\top}=L^lI{L^l}^{\top}=\Omega^l $$
Therefore, if we draw a vector of standard normal deviates $\eta$ and apply to it this transformation, we get a realization of $\epsilon^l$.
This joint probability can be written as a product of conditional and marginal probabilities :
$$ \begin{array}{rcl} P_l &=& \mbox{P}(\epsilon^l_1<- V_1^l \;\&\; \epsilon^l_2<-V_2^l \;\&\; ... \;\&\; \epsilon^l_J<-V_J^l))\ &=& \mbox{P}(\epsilon^l_1<- V_1^l))\ &\times&\mbox{P}(\epsilon^l_2<-V_2^l \mid \epsilon^l_1<-V_1^l) \ &\times&\mbox{P}(\epsilon^l_3<-V_3^l \mid \epsilon^l_1<-V_1^l \;\&\; \epsilon^l_2<-V_2^l) \ & \vdots & \ &\times&\mbox{P}(\epsilon^l_J<-V_J^l \mid \epsilon^l_1<-V_1^l \;\&\; ... \;\&\; \epsilon^l_{J-1}<-V_{J-1}^l)) \ \end{array} $$
The vector of error differences deviates is :
$$ \left( \begin{array}{c} \epsilon^l_1 \ \epsilon^l_2 \ \epsilon^l_3 \ \vdots \ \epsilon^l_J \end{array} \right) = \left( \begin{array}{ccccc} l_{11} & 0 & 0 &... & 0 \ l_{21} & l_{22} & 0 & ... & 0 \ l_{31} & l_{32} & l_{33} & ... & 0 \ \vdots & \vdots & \vdots & \ddots & \vdots \ l_{(J-1)1} & l_{(J-1)2} & l_{(J-1)3} & ... & l_{(J-1)(J-1)} \ \end{array} \right) \times \left( \begin{array}{c} \eta_1 \ \eta_2 \ \eta_3 \ \vdots \ \eta_J \end{array} \right) $$
$$ \left( \begin{array}{c} \epsilon^l_1 \ \epsilon^l_2 \ \epsilon^l_3 \ \vdots \ \epsilon^l_J \end{array} \right) = \left( \begin{array}{l} l_{11}\eta_1 \ l_{21}\eta_1+l_{22}\eta_2 \ l_{31}\eta_1+l_{32}\eta_2 + l_{33}\eta_3\ \vdots \ l_{(J-1)1}\eta_1+l_{(J-1)2}\eta_2+...+l_{(J-1)(J-1)}\eta_{J-1} \end{array} \right) $$
Let's now investigate the marginal and conditional probabilities :
This probabilities can easily be simulated by drawing numbers from a truncated normal distribution.
This so called GHK algorithm^[see for example [@GEWE:KEAN:RUNK:94].] (for Geweke, Hajivassiliou and Keane who developed this algorithm) can be described as follow :
\begin{enumerate} - compute $\Phi\left(-\frac{V_1^l}{l_{11}}\right)$ - draw a number called $\eta_1^r$ from a standard normal distribution upper-truncated at $-\frac{V_1^l}{l_{11}}$ and compute $\Phi\left(-\frac{V^l_2+l_{21}\eta_1^r}{l_{22}}\right)$ - draw a number called $\eta_2^r$ from a standard normal distribution upper-truncated at $-\frac{V^l_2+l_{21}\eta_1^r}{l_{22}}$ and compute $\Phi\left(-\frac{V^l_3+l_{31}\eta_1^r+l_{32}\eta_2^r}{l_{33}}\right)$ - $...$ draw a number called $\eta_{J-1}^r$ from a standard normal distribution upper-truncated at $-\frac{V^l_{J-1}+l_{(J-1)1}\eta_1^r+... V^l_{J-1}+l_{(J-1)(J-2)}\eta_{J-2}^r}{l_{(J-1)(J-1)}}$ - multiply all these probabilities and get a realization of the probability called $P^r_l$. - repeat all these steps many times and average all these probabilities ; this average is an estimation of the probability : $\bar{P}l = \sum{r=1}^R P^r_l/R$. \end{enumerate}
Several points should be noted concerning this algorithm :
We use again the Fishing
data frame, with only a subset of three
alternatives used. The multinomial probit model is estimated using
mlogit
with the probit
argument equal to TRUE
.
library("mlogit") data("Fishing", package = "mlogit") Fish <- dfidx(Fishing, varying = 2:9, choice = "mode", idnames = c("chid", "alt"))
Fish.mprobit <- mlogit(mode~price | income | catch, Fish, probit = TRUE, alt.subset=c('beach', 'boat','pier'))
summary(Fish.mprobit)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.