mlogit: Multinomial logit model
In mlogit: Multinomial Logit Models

Estimation by maximum likelihood of the multinomial logit model, with alternative-specific and/or individual-specific variables.

mlogit(
  formula,
  data,
  subset,
  weights,
  na.action,
  start = NULL,
  alt.subset = NULL,
  reflevel = NULL,
  nests = NULL,
  un.nest.el = FALSE,
  unscaled = FALSE,
  heterosc = FALSE,
  rpar = NULL,
  probit = FALSE,
  R = 40,
  correlation = FALSE,
  halton = NULL,
  random.nb = NULL,
  panel = FALSE,
  estimate = TRUE,
  seed = 10,
  ...
)

`formula`	a symbolic description of the model to be estimated,
`data`	the data: an `mlogit.data` object or an ordinary `data.frame`,
`subset`	an optional vector specifying a subset of observations for `mlogit`,
`weights`	an optional vector of weights,
`na.action`	a function which indicates what should happen when the data contains `NA`s,
`start`	a vector of starting values,
`alt.subset`	a vector of character strings containing the subset of alternatives on which the model should be estimated,
`reflevel`	the base alternative (the one for which the coefficients of individual-specific variables are normalized to 0),
`nests`	a named list of character vectors, each name being a nest and the corresponding vector being the set of alternatives that belong to this nest,
`un.nest.el`	a boolean, if `TRUE`, the hypothesis of unique elasticity is imposed for nested logit models,
`unscaled`	a boolean, if `TRUE`, the unscaled version of the nested logit model is estimated,
`heterosc`	a boolean, if `TRUE`, the heteroscedastic logit model is estimated,
`rpar`	a named vector whose names are the random parameters and whose values are the distributions: `'n'` for normal, `'l'` for log-normal, `'t'` for truncated normal, `'u'` for uniform,
`probit`	if `TRUE`, a multinomial probit model is estimated,
`R`	the number of function evaluations for the Gaussian quadrature method used if `heterosc = TRUE`, or the number of draws of pseudo-random numbers if `rpar` is not `NULL`,
`correlation`	only relevant if `rpar` is not `NULL`; if `TRUE`, the correlation between random parameters is taken into account,
`halton`	only relevant if `rpar` is not `NULL`; if not `NULL`, Halton sequences are used instead of pseudo-random numbers. If `halton = NA`, default values are used for the primes of the sequences (the primes are used in order) and for the number of elements dropped. Otherwise, `halton` should be a list with elements `prime` (the primes used) and `drop` (the number of elements dropped),
`random.nb`	only relevant if `rpar` is not `NULL`; a user-supplied matrix of random numbers,
`panel`	only relevant if `rpar` is not `NULL` and if the data are repeated observations of the same unit; if `TRUE`, the mixed-logit model is estimated using panel techniques,
`estimate`	a boolean indicating whether the model should be estimated or not: if not, the `model.frame` is returned,
`seed`	the seed to use for random numbers (for mixed logit and probit models),
`...`	further arguments passed to `mlogit.data` or `mlogit.optim`.
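
To illustrate what the `halton` argument controls, the following base R sketch generates a Halton sequence for a given prime, discards some initial elements, and maps the result to standard normal draws. The helper name `halton_seq` and the choice of dropping 10 elements are assumptions made for this example; this is not the code that `mlogit` runs internally.

```r
# Hypothetical helper: n elements of the Halton sequence in base `prime`,
# after discarding the first `drop` elements.
halton_seq <- function(n, prime, drop = 0) {
  idx <- seq_len(n) + drop
  vapply(idx, function(i) {
    f <- 1
    h <- 0
    # Radical inverse: reverse the base-`prime` digits of i around the point.
    while (i > 0) {
      f <- f / prime
      h <- h + f * (i %% prime)
      i <- i %/% prime
    }
    h
  }, numeric(1))
}

# First elements for prime 2: 0.5, 0.25, 0.75, 0.125, 0.625
first_five <- halton_seq(5, prime = 2)

# Quasi-random standard normal draws from the sequence for prime 3
# (drop = 10 is an arbitrary value chosen for this sketch).
u <- halton_seq(100, prime = 3, drop = 10)
z <- qnorm(u)
```

Using a different prime for each random parameter keeps the sequences from being perfectly correlated, which is why `halton` takes a vector of primes.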

For how to use the formula argument, see mFormula().

The data argument may be an ordinary data.frame. In this case, some supplementary arguments should be provided and are passed to mlogit.data(). Note that it is not necessary to indicate the choice argument as it is deduced from the formula.

The model is estimated using the mlogit.optim() function.

The basic multinomial logit model and three important extensions of this model may be estimated.
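
As a minimal sketch of the basic model, the choice probability of alternative j is exp(V_j) / sum_k exp(V_k), where V_j is the systematic utility. Only utility differences matter, which is why the coefficients of individual-specific variables are normalized to 0 for the reference alternative. The base R code below uses the coefficients printed for the income-only fit in the example output on this page; the income value of 4000 and the function name `mnl_prob` are assumptions chosen for illustration.

```r
# Coefficients of the income-only fit (beach is the reference, so its values are 0).
intercept <- c(beach = 0, boat = 0.73892, charter = 1.3413, pier = 0.81415)
slope     <- c(beach = 0, boat = 9.1906e-05, charter = -3.1640e-05, pier = -1.4340e-04)

# Hypothetical helper: multinomial logit probabilities for one individual.
mnl_prob <- function(income, intercept, slope) {
  v <- intercept + slope * income   # systematic utility of each alternative
  exp(v) / sum(exp(v))
}

income <- 4000   # hypothetical income value
p_beach_ref <- mnl_prob(income, intercept, slope)

# Same model expressed with charter as the reference alternative:
# subtract charter's coefficients from every alternative.
p_charter_ref <- mnl_prob(income,
                          intercept - intercept["charter"],
                          slope - slope["charter"])
```

Both parameterizations give the same probabilities, so changing `reflevel` changes how the coefficients are reported but not the fitted model.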

If heterosc=TRUE, the heteroscedastic logit model is estimated. J - 1 extra coefficients are estimated; they represent the scale parameters of J - 1 alternatives, the scale parameter of the reference alternative being normalized to 1. The probabilities do not have a closed form and are computed using a Gaussian quadrature method.
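
The sketch below shows why numerical integration is needed. It assumes independent Gumbel errors with alternative-specific scale parameters, so that the probability of choosing alternative i is a one-dimensional integral over the error of that alternative. For simplicity it uses base R's adaptive `integrate()` rather than the Gaussian quadrature rule mentioned above, and the utilities, scale values and the name `hev_prob` are hypothetical.

```r
# Hypothetical helper: choice probability of alternative i when the error of
# alternative j is Gumbel with scale theta[j].
hev_prob <- function(V, theta, i) {
  integrand <- function(w) {
    dens <- exp(-w - exp(-w))              # standard Gumbel density at w
    for (j in seq_along(V)[-i]) {
      z <- (V[i] - V[j] + theta[i] * w) / theta[j]
      dens <- dens * exp(-exp(-z))         # Gumbel CDF for alternative j
    }
    dens
  }
  integrate(integrand, lower = -Inf, upper = Inf)$value
}

V <- c(0.5, -0.2, 1.0)                     # hypothetical systematic utilities

# With all scales equal to 1 the model collapses to the multinomial logit.
p_homo <- sapply(seq_along(V), function(i) hev_prob(V, c(1, 1, 1), i))
p_mnl  <- exp(V) / sum(exp(V))

# Unequal scales (the first is the normalized reference scale).
p_hetero <- sapply(seq_along(V), function(i) hev_prob(V, c(1, 0.5, 2), i))
```

Comparing `p_homo` with `p_mnl` is a quick sanity check of the integral, and `p_hetero` should still sum to 1.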

If nests is not NULL, the nested logit model is estimated.

If rpar is not NULL, the random parameter model is estimated. The probabilities are approximated by simulation with R draws, and Halton sequences are used if halton is not NULL. Pseudo-random numbers are drawn from a standard normal distribution and the relevant transformations are performed to obtain numbers drawn from a normal, log-normal, censored-normal or uniform distribution. If correlation = TRUE, the correlation between the random parameters is taken into account by estimating the components of the Cholesky decomposition of the covariance matrix. With G random parameters, G standard deviations are estimated without correlation, and G * (G + 1) / 2 coefficients are estimated with correlation.
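
As a worked illustration of the Cholesky parameterization, the base R sketch below rebuilds the covariance and correlation matrices of the two random parameters from the three `chol.*` coefficients printed in the mixed logit example output on this page. The placement of the coefficients in a lower-triangular matrix is an assumption of this sketch; it reproduces the covariance values printed in that output.

```r
G <- 2                       # number of random parameters (price, catch)
n_chol <- G * (G + 1) / 2    # number of Cholesky coefficients: 3

# Lower-triangular factor filled with the rounded estimates from the output.
L <- matrix(0, G, G, dimnames = list(c("price", "catch"), c("price", "catch")))
L["price", "price"] <- 2.4597e-02    # chol.price:price
L["catch", "price"] <- 8.5960e-01    # chol.price:catch
L["catch", "catch"] <- -1.9487e-01   # chol.catch:catch

Sigma <- L %*% t(L)          # covariance matrix of the random parameters
Rho   <- cov2cor(Sigma)      # implied correlation matrix
sds   <- sqrt(diag(Sigma))   # implied standard deviations
```

The result matches the printed covariance matrix up to rounding (for instance a standard deviation of about 0.881 for `catch` and a correlation of about 0.975).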

An object of class "mlogit", a list with elements:

coefficients: the named vector of coefficients,
logLik: the value of the log-likelihood,
hessian: the hessian of the log-likelihood at convergence,
gradient: the gradient of the log-likelihood at convergence,
call: the matched call,
est.stat: some information about the estimation (time used, optimisation method),
freq: the frequency of choice,
residuals: the residuals,
fitted.values: the fitted values,
formula: the formula (an mFormula object),
expanded.formula: the formula (a formula object),
model: the model frame used,
index: the index of the choice and of the alternatives.
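
To show how the `logLik` and `freq` elements relate to the statistics printed by `summary()`, the base R sketch below recomputes the likelihood ratio test and McFadden R^2 of the first example (`mode ~ price + catch`) from the values in the example output on this page. It assumes that the null model contains only intercepts and that the sample has 1182 individuals; the latter is not stated on this page but is consistent with the `multinom` initial value of 1638.599935, which equals 1182 * log(4).

```r
# Values copied from the printed output of mlogit(mode ~ price + catch, data = Fish).
freq <- c(beach = 0.11337, boat = 0.35364, charter = 0.38240, pier = 0.15059)
loglik_fit <- -1230.8
n <- 1182                    # assumed number of individuals

# Log-likelihood of an intercept-only model: every individual is assigned
# the observed choice frequencies.
loglik_null <- n * sum(freq * log(freq))

lr_stat     <- 2 * (loglik_fit - loglik_null)   # close to the printed 533.88
mcfadden_r2 <- 1 - loglik_fit / loglik_null     # close to the printed 0.17823

# Two coefficients (price, catch) are added to the three intercepts.
p_value <- pchisq(lr_stat, df = 2, lower.tail = FALSE)
```

The small discrepancies with the printed values come from the rounding of the inputs.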

Yves Croissant

mlogit.data() to shape the data. nnet::multinom() from package nnet estimates the multinomial logit model with individual-specific variables. mlogit.optim() for details about the optimization function.

## Cameron and Trivedi's Microeconometrics p.493. There are two
## alternative-specific variables (price and catch), one
## individual-specific variable (income) and four fishing modes: beach,
## pier, boat, charter

data("Fishing", package = "mlogit")
Fish <- mlogit.data(Fishing, varying = c(2:9), shape = "wide", choice = "mode")

## a pure "conditional" model
summary(mlogit(mode ~ price + catch, data = Fish))

## a pure "multinomial model"
summary(mlogit(mode ~ 0 | income, data = Fish))

## which can also be estimated using multinom (package nnet)
library("nnet")
summary(multinom(mode ~ income, data = Fishing))

## a "mixed" model
m <- mlogit(mode ~ price + catch | income, data = Fish)
summary(m)

## same model with charter as the reference level
m <- mlogit(mode ~ price + catch | income, data = Fish, reflevel = "charter")

## same model with a subset of alternatives: charter, pier, beach
m <- mlogit(mode ~ price + catch | income, data = Fish,
            alt.subset = c("charter", "pier", "beach"))

## model on unbalanced data i.e. for some observations, some
## alternatives are missing
# a data.frame in wide format with two missing prices
Fishing2 <- Fishing
Fishing2[1, "price.pier"] <- Fishing2[3, "price.beach"] <- NA
mlogit(mode~price+catch|income, Fishing2, shape="wide", choice="mode", varying = 2:9)

# a data.frame in long format with three missing lines
data("TravelMode", package = "AER")
Tr2 <- TravelMode[-c(2, 7, 9),]
mlogit(choice~wait+gcost|income+size, Tr2, shape = "long",
       chid.var = "individual", alt.var="mode", choice = "choice")

## A heteroscedastic logit model
data("TravelMode", package = "AER")
hl <- mlogit(choice ~ wait + travel + vcost, TravelMode,
             shape = "long", chid.var = "individual", alt.var = "mode",
             method = "bfgs", heterosc = TRUE, tol = 10)

## A nested logit model
TravelMode$avincome <- with(TravelMode, income * (mode == "air"))
TravelMode$time <- with(TravelMode, travel + wait)/60
TravelMode$timeair <- with(TravelMode, time * I(mode == "air"))
TravelMode$income <- with(TravelMode, income / 10)
# Hensher and Greene (2002), table 1 p.8-9 model 5
TravelMode$incomeother <- with(TravelMode, ifelse(mode %in% c('air', 'car'), income, 0))
nl <- mlogit(choice~gcost+wait+incomeother, TravelMode,
             shape='long', alt.var='mode',
             nests=list(public=c('train', 'bus'), other=c('car','air')))
# same with a common nest elasticity (model 1)
nl2 <- update(nl, un.nest.el = TRUE)

## a probit model
## Not run: 
pr <- mlogit(choice ~ wait + travel + vcost, TravelMode,
             shape = "long", chid.var = "individual", alt.var = "mode",
             probit = TRUE)

## End(Not run)

## a mixed logit model
## Not run: 
rpl <- mlogit(mode ~ price+ catch | income, Fishing, varying = 2:9,
              shape = 'wide', rpar = c(price= 'n', catch = 'n'),
              correlation = TRUE, halton = NA,
              R = 10, tol = 10, print.level = 0)
summary(rpl)
rpar(rpl)
cor.mlogit(rpl)
cov.mlogit(rpl)
rpar(rpl, "catch")
summary(rpar(rpl, "catch"))

## End(Not run)

# a rank-ordered model
data("Game", package = "mlogit")
g <- mlogit(ch~own|hours, Game, choice='ch', varying = 1:12,
            ranked=TRUE, shape="wide", reflevel="PC")

Loading required package: Formula
Loading required package: zoo

Attaching package: 'zoo'

The following objects are masked from 'package:base':

    as.Date, as.Date.numeric

Loading required package: lmtest

Call:
mlogit(formula = mode ~ price + catch, data = Fish, method = "nr")

Frequencies of alternatives:
  beach    boat charter    pier 
0.11337 0.35364 0.38240 0.15059 

nr method
7 iterations, 0h:0m:0s 
g'(-H)^-1g = 6.22E-06 
successive function values within tolerance limits 

Coefficients :
                      Estimate Std. Error  z-value  Pr(>|z|)    
boat:(intercept)     0.8713749  0.1140428   7.6408 2.154e-14 ***
charter:(intercept)  1.4988884  0.1329328  11.2755 < 2.2e-16 ***
pier:(intercept)     0.3070552  0.1145738   2.6800 0.0073627 ** 
price               -0.0247896  0.0017044 -14.5444 < 2.2e-16 ***
catch                0.3771689  0.1099707   3.4297 0.0006042 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log-Likelihood: -1230.8
McFadden R^2:  0.17823 
Likelihood ratio test : chisq = 533.88 (p.value = < 2.22e-16)

Call:
mlogit(formula = mode ~ 0 | income, data = Fish, method = "nr")

Frequencies of alternatives:
  beach    boat charter    pier 
0.11337 0.35364 0.38240 0.15059 

nr method
4 iterations, 0h:0m:0s 
g'(-H)^-1g = 8.32E-07 
gradient close to zero 

Coefficients :
                       Estimate  Std. Error z-value  Pr(>|z|)    
boat:(intercept)     7.3892e-01  1.9673e-01  3.7560 0.0001727 ***
charter:(intercept)  1.3413e+00  1.9452e-01  6.8955 5.367e-12 ***
pier:(intercept)     8.1415e-01  2.2863e-01  3.5610 0.0003695 ***
boat:income          9.1906e-05  4.0664e-05  2.2602 0.0238116 *  
charter:income      -3.1640e-05  4.1846e-05 -0.7561 0.4495908    
pier:income         -1.4340e-04  5.3288e-05 -2.6911 0.0071223 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log-Likelihood: -1477.2
McFadden R^2:  0.013736 
Likelihood ratio test : chisq = 41.145 (p.value = 6.0931e-09)
# weights:  12 (6 variable)
initial  value 1638.599935 
iter  10 value 1477.150646
final  value 1477.150569 
converged
Call:
multinom(formula = mode ~ income, data = Fishing)

Coefficients:
        (Intercept)        income
pier      0.8141506 -1.434028e-04
boat      0.7389178  9.190824e-05
charter   1.3412901 -3.163844e-05

Std. Errors:
         (Intercept)       income
pier    5.816490e-09 2.668383e-05
boat    3.209473e-09 2.057825e-05
charter 3.921689e-09 2.116425e-05

Residual Deviance: 2954.301 
AIC: 2966.301 

Call:
mlogit(formula = mode ~ price + catch | income, data = Fish, 
    method = "nr")

Frequencies of alternatives:
  beach    boat charter    pier 
0.11337 0.35364 0.38240 0.15059 

nr method
7 iterations, 0h:0m:0s 
g'(-H)^-1g = 1.37E-05 
successive function values within tolerance limits 

Coefficients :
                       Estimate  Std. Error  z-value  Pr(>|z|)    
boat:(intercept)     5.2728e-01  2.2279e-01   2.3667 0.0179485 *  
charter:(intercept)  1.6944e+00  2.2405e-01   7.5624 3.952e-14 ***
pier:(intercept)     7.7796e-01  2.2049e-01   3.5283 0.0004183 ***
price               -2.5117e-02  1.7317e-03 -14.5042 < 2.2e-16 ***
catch                3.5778e-01  1.0977e-01   3.2593 0.0011170 ** 
boat:income          8.9440e-05  5.0067e-05   1.7864 0.0740345 .  
charter:income      -3.3292e-05  5.0341e-05  -0.6613 0.5084031    
pier:income         -1.2758e-04  5.0640e-05  -2.5193 0.0117582 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log-Likelihood: -1215.1
McFadden R^2:  0.18868 
Likelihood ratio test : chisq = 565.17 (p.value = < 2.22e-16)

Call:
mlogit(formula = mode ~ price + catch | income, data = Fishing2,     shape = "wide", choice = "mode", varying = 2:9, method = "nr")

Coefficients:
   boat:(intercept)  charter:(intercept)     pier:(intercept)  
         5.2790e-01           1.6948e+00           7.7663e-01  
              price                catch          boat:income  
        -2.5110e-02           3.5768e-01           8.9122e-05  
     charter:income          pier:income  
        -3.3611e-05          -1.2700e-04  


Call:
mlogit(formula = choice ~ wait + gcost | income + size, data = Tr2,     shape = "long", chid.var = "individual", alt.var = "mode",     choice = "choice", method = "nr")

Coefficients:
train:(intercept)    bus:(intercept)    car:(intercept)               wait  
       -2.3115942         -3.4504941         -7.8913907         -0.1013180  
            gcost       train:income         bus:income         car:income  
       -0.0197064         -0.0589804         -0.0277037         -0.0041153  
       train:size           bus:size           car:size  
        1.3289497          1.0090796          1.0392585  


Call:
mlogit(formula = mode ~ price + catch | income, data = Fishing, 
    rpar = c(price = "n", catch = "n"), R = 10, correlation = TRUE, 
    halton = NA, varying = 2:9, shape = "wide", tol = 10, print.level = 0)

Frequencies of alternatives:
  beach    boat charter    pier 
0.11337 0.35364 0.38240 0.15059 

bfgs method
8 iterations, 0h:0m:4s 
g'(-H)^-1g =  1.38 
gradient close to zero 

Coefficients :
                       Estimate  Std. Error z-value  Pr(>|z|)    
boat:(intercept)     3.3352e-01  2.6789e-01  1.2450 0.2131383    
charter:(intercept)  2.1865e+00  3.0234e-01  7.2320 4.758e-13 ***
pier:(intercept)     7.4683e-01  2.1257e-01  3.5134 0.0004425 ***
price               -4.5563e-02  4.9365e-03 -9.2298 < 2.2e-16 ***
catch                2.9951e-01  1.6457e-01  1.8200 0.0687647 .  
boat:income          1.1049e-04  5.8810e-05  1.8787 0.0602850 .  
charter:income      -3.5671e-05  5.9329e-05 -0.6012 0.5476818    
pier:income         -1.2063e-04  4.7303e-05 -2.5501 0.0107696 *  
chol.price:price     2.4597e-02  4.0954e-03  6.0059 1.903e-09 ***
chol.price:catch     8.5960e-01  3.7819e-01  2.2729 0.0230295 *  
chol.catch:catch    -1.9487e-01  6.6237e-01 -0.2942 0.7686068    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log-Likelihood: -1187.7
McFadden R^2:  0.20702 
Likelihood ratio test : chisq = 620.11 (p.value = < 2.22e-16)

random coefficients
      Min.     1st Qu.     Median       Mean     3rd Qu. Max.
price -Inf -0.06215297 -0.0455627 -0.0455627 -0.02897243  Inf
catch -Inf -0.29499868  0.2995051  0.2995051  0.89400894  Inf
$price
normal distribution with parameters -0.046 (mean) and 0.025 (sd)

$catch
normal distribution with parameters 0.3 (mean) and 0.881 (sd)

          price     catch
price 1.0000000 0.9752541
catch 0.9752541 1.0000000
            price      catch
price 0.000605001 0.02114342
catch 0.021143415 0.77688829
normal distribution with parameters 0.3 (mean) and 0.881 (sd)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
      -Inf -0.2949987  0.2995051  0.2995051  0.8940089        Inf