# mlogit: Multinomial logit model

## Description

Estimation by maximum likelihood of the multinomial logit model, with alternative-specific and/or individual-specific variables.

## Usage

```
mlogit(
  formula,
  data,
  subset,
  weights,
  na.action,
  start = NULL,
  alt.subset = NULL,
  reflevel = NULL,
  nests = NULL,
  un.nest.el = FALSE,
  unscaled = FALSE,
  heterosc = FALSE,
  rpar = NULL,
  probit = FALSE,
  R = 40,
  correlation = FALSE,
  halton = NULL,
  random.nb = NULL,
  panel = FALSE,
  estimate = TRUE,
  seed = 10,
  ...
)
```

## Arguments

• `formula`: a symbolic description of the model to be estimated,

• `data`: the data: an `mlogit.data` object or an ordinary `data.frame`,

• `subset`: an optional vector specifying a subset of observations for `mlogit`,

• `weights`: an optional vector of weights,

• `na.action`: a function which indicates what should happen when the data contains `NA`s,

• `start`: a vector of starting values,

• `alt.subset`: a vector of character strings containing the subset of alternatives on which the model should be estimated,

• `reflevel`: the base alternative (the one for which the coefficients of individual-specific variables are normalized to 0),

• `nests`: a named list of character vectors, each name being a nest and the corresponding vector being the set of alternatives that belong to that nest,

• `un.nest.el`: a boolean, if `TRUE`, the hypothesis of unique elasticity is imposed for nested logit models,

• `unscaled`: a boolean, if `TRUE`, the unscaled version of the nested logit model is estimated,

• `heterosc`: a boolean, if `TRUE`, the heteroscedastic logit model is estimated,

• `rpar`: a named vector whose names are the random parameters and whose values are the distributions: `'n'` for normal, `'l'` for log-normal, `'t'` for truncated normal, `'u'` for uniform,

• `probit`: if `TRUE`, a multinomial probit model is estimated,

• `R`: the number of function evaluations for the Gaussian quadrature method used if `heterosc = TRUE`, or the number of draws of pseudo-random numbers if `rpar` is not `NULL`,

• `correlation`: only relevant if `rpar` is not `NULL`; if `TRUE`, the correlation between random parameters is taken into account,

• `halton`: only relevant if `rpar` is not `NULL`; if not `NULL`, Halton sequences are used instead of pseudo-random numbers. If `halton = NA`, some default values are used for the primes of the sequence (actually, the primes are used in order) and for the number of elements dropped. Otherwise, `halton` should be a list with elements `prime` (the primes used) and `drop` (the number of elements dropped),

• `random.nb`: only relevant if `rpar` is not `NULL`, a user-supplied matrix of random numbers,

• `panel`: only relevant if `rpar` is not `NULL` and if the data are repeated observations of the same unit; if `TRUE`, the mixed-logit model is estimated using panel techniques,

• `estimate`: a boolean indicating whether the model should be estimated or not: if not, the `model.frame` is returned,

• `seed`: the seed to use for random numbers (for mixed logit and probit models),

• `...`: further arguments passed to `mlogit.data` or `mlogit.optim`.

## Details

For how to use the formula argument, see `Formula()`.

The `data` argument may be an ordinary `data.frame`. In this case, some supplementary arguments should be provided and are passed to `mlogit.data()`. Note that it is not necessary to indicate the choice argument as it is deduced from the formula.

The model is estimated using the `mlogit.optim()` function.

The basic multinomial logit model and three important extensions of this model may be estimated.
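For orientation, the choice probabilities of the basic model can be sketched in a few lines of base R. This is only an illustration of the standard logit formula, not the package's internal code: the utility matrix `V` and the helper `mnl_prob()` are hypothetical names introduced here.

```r
# Hypothetical systematic utilities V[i, j] for 2 individuals and 3 alternatives
V <- matrix(c(0.0, 1.0, -0.5,
              0.2, 0.2,  0.2),
            nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("beach", "boat", "pier")))

# Multinomial logit probabilities: P[i, j] = exp(V[i, j]) / sum_k exp(V[i, k])
mnl_prob <- function(V) {
  # subtract each row's maximum before exponentiating, for numerical stability
  eV <- exp(V - apply(V, 1, max))
  eV / rowSums(eV)
}

P <- mnl_prob(V)
P            # one row of probabilities per individual
rowSums(P)   # each row sums to 1
```

Because only utility differences matter, subtracting the row maximum leaves the probabilities unchanged; the same invariance is the reason the coefficients of individual-specific variables must be normalized to 0 for one alternative (`reflevel`).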

If `heterosc = TRUE`, the heteroscedastic logit model is estimated. `J - 1` extra coefficients are estimated that represent the scale parameters for `J - 1` alternatives, the scale parameter for the reference alternative being normalized to 1. The probabilities do not have a closed form; they are evaluated using a Gaussian quadrature method.
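To see why quadrature is needed, here is a sketch of the integral involved, under the assumption that the errors are independent Gumbel variables with alternative-specific scale parameters. The notation (`V_j` for the systematic utility, `\theta_j` for the scale of alternative `j`) is introduced here for illustration, and the choice of a Gauss-Laguerre rule with nodes `t_r` and weights `w_r` is an assumption suggested by the `e^{-t}` weight, not something this page states.

```latex
P_l = \int_{0}^{+\infty} \prod_{j \neq l}
      \exp\!\left[-\exp\!\left(-\frac{V_l - V_j - \theta_l \ln t}{\theta_j}\right)\right]
      e^{-t}\, dt
\;\approx\; \sum_{r=1}^{R} w_r \prod_{j \neq l}
      \exp\!\left[-\exp\!\left(-\frac{V_l - V_j - \theta_l \ln t_r}{\theta_j}\right)\right]
```

The integral follows from the change of variable `t = \exp(-\varepsilon_l / \theta_l)` in the probability that alternative `l` has the highest utility; `R` then plays the role of the number of function evaluations.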

If `nests` is not `NULL`, the nested logit model is estimated.

If `rpar` is not `NULL`, the random parameter model is estimated. The probabilities are approximated using simulations with `R` draws, and Halton sequences are used if `halton` is not `NULL`. Pseudo-random numbers are drawn from a standard normal distribution and the relevant transformations are performed to obtain numbers drawn from a normal, log-normal, censored-normal or uniform distribution. If `correlation = TRUE`, the correlation between the random parameters is taken into account by estimating the components of the Cholesky decomposition of the covariance matrix. With G random parameters, G standard deviations are estimated without correlation and G * (G + 1) / 2 coefficients are estimated with correlation.
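The mechanics described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: `halton_seq()` is a hypothetical helper, and the means, covariance matrix, primes and number of dropped elements are made-up values.

```r
# Hypothetical helper: Halton sequence for one prime, skipping the first `drop` elements
halton_seq <- function(n, prime = 2, drop = 0) {
  vapply(seq_len(n) + drop, function(i) {
    f <- 1
    h <- 0
    while (i > 0) {
      f <- f / prime
      h <- h + f * (i %% prime)
      i <- i %/% prime
    }
    h
  }, numeric(1))
}

R <- 50                                         # number of draws
u <- cbind(halton_seq(R, prime = 2, drop = 10), # one prime per random parameter
           halton_seq(R, prime = 3, drop = 10))
z <- qnorm(u)                                   # standard normal draws

# Made-up means and covariance matrix for G = 2 random parameters
mu    <- c(price = -0.5, catch = 1.0)
Sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2)
L <- t(chol(Sigma))                             # lower triangular, L %*% t(L) == Sigma

# Correlated normal draws: beta_r = mu + L %*% z_r, one row per draw
beta <- sweep(z %*% t(L), 2, mu, "+")
colnames(beta) <- names(mu)

# A log-normal parameter is the exponential of a normal draw
beta_lognormal <- exp(mu[["catch"]] + sqrt(Sigma[2, 2]) * z[, 2])

# Number of estimated Cholesky components: G * (G + 1) / 2
G <- length(mu)
n_chol <- sum(lower.tri(L, diag = TRUE))
```

With `G = 2`, `n_chol` is 3: two diagonal terms and one off-diagonal term, against two standard deviations when the parameters are assumed uncorrelated.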

## Value

An object of class `"mlogit"`, a list with elements:

• coefficients: the named vector of coefficients,

• logLik: the value of the log-likelihood,

• hessian: the hessian of the log-likelihood at convergence,

• call: the matched call,

• est.stat: some information about the estimation (time used, optimisation method),

• freq: the frequency of choice,

• residuals: the residuals,

• fitted.values: the fitted values,

• formula: the formula (a `Formula` object),

• expanded.formula: the formula (a `formula` object),

• model: the model frame used,

• index: the index of the choice and of the alternatives.
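As an illustration of how these elements are typically reached, the sketch below fits one of the models from the examples on this page and inspects the result. It assumes the `mlogit` package is installed and is skipped otherwise; the extractor functions shown are the usual generics for fitted models.

```r
if (requireNamespace("mlogit", quietly = TRUE)) {
  library(mlogit)
  data("Fishing", package = "mlogit")
  Fish <- dfidx(Fishing, varying = 2:9, shape = "wide", choice = "mode")
  m <- mlogit(mode ~ price + catch | income, data = Fish)

  coef(m)          # named vector of coefficients
  logLik(m)        # value of the log-likelihood
  m$freq           # frequency of each chosen alternative
  head(fitted(m))  # fitted values
  m$est.stat       # information about the estimation
}
```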

## Author(s)

Yves Croissant

## See Also

`mlogit.data()` to shape the data. `nnet::multinom()` from package `nnet` performs the estimation of the multinomial logit model with individual-specific variables. `mlogit.optim()` for details about the optimization function.

## Examples

```
library(mlogit)

## Cameron and Trivedi's Microeconometrics p.493. There are two
## alternative-specific variables (price and catch), one
## individual-specific variable (income) and four fishing modes: beach,
## pier, boat, charter
data("Fishing", package = "mlogit")
Fish <- dfidx(Fishing, varying = 2:9, shape = "wide", choice = "mode")

## a pure "conditional" model
summary(mlogit(mode ~ price + catch, data = Fish))

## a pure "multinomial model"
summary(mlogit(mode ~ 0 | income, data = Fish))

## which can also be estimated using multinom (package nnet)
summary(nnet::multinom(mode ~ income, data = Fishing))

## a "mixed" model
m <- mlogit(mode ~ price + catch | income, data = Fish)
summary(m)

## same model with charter as the reference level
m <- mlogit(mode ~ price + catch | income, data = Fish, reflevel = "charter")

## same model with a subset of alternatives: charter, pier, beach
m <- mlogit(mode ~ price + catch | income, data = Fish,
            alt.subset = c("charter", "pier", "beach"))

## model on unbalanced data, i.e. for some observations, some
## alternatives are missing
# a data.frame in wide format with two missing prices
Fishing2 <- Fishing
Fishing2[1, "price.pier"] <- Fishing2[3, "price.beach"] <- NA
mlogit(mode ~ price + catch | income, Fishing2, shape = "wide", varying = 2:9)

# a data.frame in long format with three missing lines
data("TravelMode", package = "AER")
Tr2 <- TravelMode[-c(2, 7, 9),]
mlogit(choice ~ wait + gcost | income + size, Tr2)

## A heteroscedastic logit model
data("TravelMode", package = "AER")
hl <- mlogit(choice ~ wait + travel + vcost, TravelMode, heterosc = TRUE)

## A nested logit model
TravelMode$avincome <- with(TravelMode, income * (mode == "air"))
TravelMode$time <- with(TravelMode, travel + wait)/60
TravelMode$timeair <- with(TravelMode, time * I(mode == "air"))
TravelMode$income <- with(TravelMode, income / 10)
# Hensher and Greene (2002), table 1 p.8-9 model 5
TravelMode$incomeother <- with(TravelMode,
                               ifelse(mode %in% c('air', 'car'), income, 0))
nl <- mlogit(choice ~ gcost + wait + incomeother, TravelMode,
             nests = list(public = c('train', 'bus'),
                          other = c('car','air')))
# same with a common nest elasticity (model 1)
nl2 <- update(nl, un.nest.el = TRUE)

## a probit model
## Not run:
pr <- mlogit(choice ~ wait + travel + vcost, TravelMode, probit = TRUE)

## End(Not run)

## a mixed logit model
## Not run:
rpl <- mlogit(mode ~ price + catch | income, Fishing, varying = 2:9,
              rpar = c(price = 'n', catch = 'n'),
              correlation = TRUE, halton = NA, R = 50)
summary(rpl)
rpar(rpl)
cor.mlogit(rpl)
cov.mlogit(rpl)
rpar(rpl, "catch")
summary(rpar(rpl, "catch"))

## End(Not run)

# a ranked ordered model
data("Game", package = "mlogit")
g <- mlogit(ch ~ own | hours, Game, varying = 1:12, ranked = TRUE,
            reflevel = "PC", idnames = c("chid", "alt"))
```