# mlogit: Multinomial logit model

## Description

Estimation by maximum likelihood of the multinomial logit model, with alternative-specific and/or individual-specific variables.

## Usage

```r
mlogit(formula, data, subset, weights, na.action, start = NULL,
       alt.subset = NULL, reflevel = NULL,
       nests = NULL, un.nest.el = FALSE, unscaled = FALSE,
       heterosc = FALSE, rpar = NULL, probit = FALSE,
       R = 40, correlation = FALSE, halton = NULL,
       random.nb = NULL, panel = FALSE, estimate = TRUE,
       seed = 10, ...)

## S3 method for class 'mlogit'
print(x, digits = max(3, getOption("digits") - 2),
      width = getOption("width"), ...)

## S3 method for class 'mlogit'
summary(object, ...)

## S3 method for class 'summary.mlogit'
print(x, digits = max(3, getOption("digits") - 2),
      width = getOption("width"), ...)

## S3 method for class 'mlogit'
logLik(object, ...)

## S3 method for class 'mlogit'
residuals(object, outcome = TRUE, ...)

## S3 method for class 'mlogit'
fitted(object, type = c("outcome", "probabilities", "linpred", "parameters"),
       outcome = NULL, ...)

## S3 method for class 'mlogit'
predict(object, newdata, returnData = FALSE, ...)

## S3 method for class 'mlogit'
df.residual(object, ...)

## S3 method for class 'mlogit'
terms(x, ...)

## S3 method for class 'mlogit'
model.matrix(object, ...)

## S3 method for class 'mlogit'
update(object, new, ...)

## S3 method for class 'mlogit'
coef(object, fixed = FALSE, ...)

## S3 method for class 'summary.mlogit'
coef(object, ...)
```

## Arguments

| Argument | Description |
| --- | --- |
| `x, object` | an object of class `mlogit` |
| `formula` | a symbolic description of the model to be estimated |
| `new` | an updated formula for the `update` method |
| `newdata` | a `data.frame` for the `predict` method |
| `returnData` | if `TRUE`, the data is returned as an attribute |
| `data` | the data: an `mlogit.data` object or an ordinary `data.frame` |
| `subset` | an optional vector specifying a subset of observations |
| `weights` | an optional vector of weights |
| `na.action` | a function which indicates what should happen when the data contain `NA`s |
| `start` | a vector of starting values |
| `alt.subset` | a vector of character strings containing the subset of alternatives on which the model should be estimated |
| `reflevel` | the base alternative (the one for which the coefficients of individual-specific variables are normalized to 0) |
| `nests` | a named list of character vectors, each name being a nest and the corresponding vector being the set of alternatives that belong to this nest |
| `un.nest.el` | a boolean; if `TRUE`, the hypothesis of unique elasticity is imposed for nested logit models |
| `unscaled` | a boolean; if `TRUE`, the unscaled version of the nested logit model is estimated |
| `heterosc` | a boolean; if `TRUE`, the heteroscedastic logit model is estimated |
| `rpar` | a named vector whose names are the random parameters and whose values are the distributions: `'n'` for normal, `'l'` for log-normal, `'t'` for truncated normal, `'u'` for uniform |
| `probit` | if `TRUE`, a multinomial probit model is estimated |
| `R` | the number of function evaluations for the Gaussian quadrature method used if `heterosc=TRUE`, or the number of draws of pseudo-random numbers if `rpar` is not `NULL` |
| `correlation` | only relevant if `rpar` is not `NULL`; if `TRUE`, the correlation between random parameters is taken into account |
| `halton` | only relevant if `rpar` is not `NULL`; if not `NULL`, a Halton sequence is used instead of pseudo-random numbers. If `halton=NA`, some default values are used for the primes of the sequence (actually, the primes are used in order) and for the number of elements dropped. Otherwise, `halton` should be a list with elements `prime` (the primes used) and `drop` (the number of elements dropped) |
| `random.nb` | only relevant if `rpar` is not `NULL`; a user-supplied matrix of random numbers |
| `panel` | only relevant if `rpar` is not `NULL` and if the data are repeated observations of the same unit; if `TRUE`, the mixed logit model is estimated using panel techniques |
| `estimate` | a boolean indicating whether the model should be estimated or not: if not, the `model.frame` is returned |
| `seed` | the seed for the random number generator |
| `digits` | the number of digits |
| `width` | the width of the printing |
| `outcome` | a boolean which indicates, for the `fitted` and the `residuals` methods, whether a matrix (for each choice, one value for each alternative) or a vector (for each choice, only a value for the alternative chosen) should be returned |
| `type` | one of `outcome` (probability of the chosen alternative), `probabilities` (probabilities for all the alternatives), `linpred` (linear predictors), `parameters` (individual-level random parameters) |
| `fixed` | if `FALSE` (the default), constant coefficients are not returned |
| `...` | further arguments passed to `mlogit.data` or `mlogit.optim` |
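
To make the `prime` and `drop` elements of `halton` concrete, here is a minimal base-R sketch of how a Halton sequence can be built by the radical-inverse construction. The helper `halton_seq` is hypothetical and written for illustration only; it is not the function used internally by `mlogit`, whose defaults and implementation may differ.

```r
# Hypothetical helper (not part of mlogit): the first `n` elements of the
# Halton sequence in base `prime`, after skipping the first `drop` elements.
halton_seq <- function(n, prime = 2, drop = 0) {
  idx <- seq_len(n) + drop
  vapply(idx, function(i) {
    h <- 0
    f <- 1 / prime
    while (i > 0) {
      h <- h + f * (i %% prime)   # add the next base-`prime` digit, mirrored
      i <- i %/% prime
      f <- f / prime
    }
    h
  }, numeric(1))
}

halton_seq(3, prime = 2)            # 1/2, 1/4, 3/4
halton_seq(3, prime = 3, drop = 2)  # 1/9, 4/9, 7/9
```

Each random parameter would typically get its own prime, which is consistent with the remark above that the primes are used in order.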

## Details

For how to use the formula argument, see `mFormula`.

The `data` argument may be an ordinary `data.frame`. In this case, some supplementary arguments should be provided and are passed to `mlogit.data`. Note that it is not necessary to indicate the choice argument as it is deduced from the formula.

The model is estimated using the `mlogit.optim` function.

The basic multinomial logit model and three important extensions of this model may be estimated.
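
For orientation, in the basic model the probability of choosing alternative `j` is `exp(V_j) / sum_k exp(V_k)`, where `V_j` is the linear predictor of alternative `j`. The following base-R sketch computes these probabilities from a matrix of linear predictors; `mnl_prob` and the numeric values are hypothetical and are not taken from the package.

```r
# Hypothetical helper (not part of mlogit): multinomial logit probabilities.
# V has one row per choice situation and one column per alternative.
mnl_prob <- function(V) {
  V <- as.matrix(V)
  eV <- exp(V - apply(V, 1, max))   # subtract row maxima for numerical stability
  eV / rowSums(eV)
}

# Made-up linear predictors for two choice situations and three alternatives
V <- rbind(c(beach = 0, pier = 0, boat = 0),
           c(beach = 1, pier = 0, boat = -1))
P <- mnl_prob(V)
P
rowSums(P)   # each row sums to 1
```

Adding a constant to every column of a row leaves the probabilities unchanged, which is why the coefficients of individual-specific variables must be normalized for one alternative (see `reflevel`).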

If `heterosc=TRUE`, the heteroscedastic logit model is estimated. `J-1` extra coefficients are estimated, representing the scale parameters of `J-1` alternatives; the scale parameter of the reference alternative is normalized to 1. The probabilities have no closed form and are computed using a Gaussian quadrature method.

If `nests` is not `NULL`, the nested logit model is estimated.

If `rpar` is not `NULL`, the random parameter model is estimated. The probabilities are approximated using simulations with `R` draws, and Halton sequences are used if `halton` is not `NULL`. Pseudo-random numbers are drawn from a standard normal distribution and the relevant transformations are performed to obtain numbers drawn from a normal, log-normal, censored-normal or uniform distribution. If `correlation=TRUE`, the correlation between the random parameters is taken into account by estimating the components of the Cholesky decomposition of the covariance matrix. With G random parameters, G standard deviations are estimated without correlation, and G * (G + 1) / 2 coefficients are estimated with correlation.
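
The role of the Cholesky factor can be illustrated with a short base-R sketch. It only shows the general technique (transforming standard normal draws into correlated normal draws); the means, the factor values and the ordering of its elements are assumptions made for the example and need not match the parametrization used by `mlogit`.

```r
set.seed(10)
G <- 2                                   # number of random parameters
R <- 1000                                # number of draws
mu <- c(price = -0.5, catch = 1)         # hypothetical means

# Lower-triangular Cholesky factor with G * (G + 1) / 2 free elements
L <- matrix(0, G, G)
L[lower.tri(L, diag = TRUE)] <- c(0.3, 0.2, 0.5)   # hypothetical values
n_free <- sum(lower.tri(L, diag = TRUE))            # 3 when G = 2

z <- matrix(rnorm(R * G), R, G)          # standard normal draws
beta <- sweep(z %*% t(L), 2, mu, "+")    # correlated normal draws
colnames(beta) <- names(mu)

Sigma <- L %*% t(L)                      # covariance matrix implied by L

# A log-normal parameter is obtained by exponentiating a normal draw
beta_lognormal <- exp(beta[, "catch"])
```

Without correlation, `L` would be diagonal and only the G standard deviations on its diagonal would be estimated.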

## Value

An object of class `"mlogit"`, a list with elements:

| Element | Description |
| --- | --- |
| `coefficients` | the named vector of coefficients |
| `logLik` | the value of the log-likelihood |
| `hessian` | the Hessian of the log-likelihood at convergence |
| `gradient` | the gradient of the log-likelihood at convergence |
| `call` | the matched call |
| `est.stat` | some information about the estimation (time used, optimisation method) |
| `freq` | the frequency of choice |
| `residuals` | the residuals |
| `fitted.values` | the fitted values |
| `formula` | the formula (a `mFormula` object) |
| `expanded.formula` | the formula (a `formula` object) |
| `model` | the model frame used |
| `index` | the index of the choice and of the alternatives |
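
As a general property of maximum likelihood estimation (not something specific to this package), standard errors can be derived from the Hessian of the log-likelihood at convergence: the estimated covariance matrix is the inverse of the negative Hessian. The sketch below uses a made-up Hessian `H` rather than a fitted model, so all numbers and names are illustrative assumptions.

```r
# Hypothetical Hessian of the log-likelihood at the optimum (negative definite)
H <- matrix(c(-4,   0,
               0, -25),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("price", "catch"), c("price", "catch")))

vc <- solve(-H)        # estimated covariance matrix of the coefficients
se <- sqrt(diag(vc))   # standard errors
se
```

With a fitted object, the same computation would start from its `hessian` element in place of `H`.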

## Author(s)

Yves Croissant

## References

McFadden, D. (1973) “Conditional Logit Analysis of Qualitative Choice Behavior”, in P. Zarembka ed., Frontiers in Econometrics, New York: Academic Press.

McFadden, D. (1974) “The Measurement of Urban Travel Demand”, Journal of Public Economics, 3, pp. 303-328.

Train, K. (2003) Discrete Choice Methods with Simulation, Cambridge University Press.

## See Also

`mlogit.data` to shape the data. `multinom` from package `nnet` performs the estimation of the multinomial logit model with individual-specific variables. `mlogit.optim` for details about the optimization function.

## Examples

```r
## Cameron and Trivedi's Microeconometrics p.493. There are two
## alternative specific variables: price and catch, one individual
## specific variable (income) and four fishing modes: beach, pier, boat,
## charter

data("Fishing", package = "mlogit")
Fish <- mlogit.data(Fishing, varying = c(2:9), shape = "wide", choice = "mode")

## a pure "conditional" model

summary(mlogit(mode ~ price + catch, data = Fish))

## a pure "multinomial model"

summary(mlogit(mode ~ 0 | income, data = Fish))

## which can also be estimated using multinom (package nnet)

library("nnet")
summary(multinom(mode ~ income, data = Fishing))

## a "mixed" model

m <- mlogit(mode ~ price + catch | income, data = Fish)
summary(m)

## same model with charter as the reference level

m <- mlogit(mode ~ price + catch | income, data = Fish, reflevel = "charter")

## same model with a subset of alternatives: charter, pier, beach

m <- mlogit(mode ~ price + catch | income, data = Fish,
            alt.subset = c("charter", "pier", "beach"))

## model on unbalanced data, i.e. for some observations, some
## alternatives are missing

# a data.frame in wide format with two missing prices
Fishing2 <- Fishing
Fishing2[1, "price.pier"] <- Fishing2[3, "price.beach"] <- NA
mlogit(mode ~ price + catch | income, Fishing2, shape = "wide",
       choice = "mode", varying = 2:9)

# a data.frame in long format with three missing lines
data("TravelMode", package = "AER")
Tr2 <- TravelMode[-c(2, 7, 9), ]
mlogit(choice ~ wait + gcost | income + size, Tr2, shape = "long",
       chid.var = "individual", alt.var = "mode", choice = "choice")

## A heteroscedastic logit model

data("TravelMode", package = "AER")
hl <- mlogit(choice ~ wait + travel + vcost, TravelMode,
             shape = "long", chid.var = "individual", alt.var = "mode",
             method = "bfgs", heterosc = TRUE, tol = 10)

## A nested logit model

TravelMode$avincome <- with(TravelMode, income * (mode == "air"))
TravelMode$time <- with(TravelMode, travel + wait) / 60
TravelMode$timeair <- with(TravelMode, time * I(mode == "air"))
TravelMode$income <- with(TravelMode, income / 10)

# Hensher and Greene (2002), table 1 p.8-9 model 5
TravelMode$incomeother <- with(TravelMode,
                               ifelse(mode %in% c('air', 'car'), income, 0))
nl <- mlogit(choice ~ gcost + wait + incomeother, TravelMode,
             shape = 'long', alt.var = 'mode',
             nests = list(public = c('train', 'bus'), other = c('car', 'air')))

# same with a common nest elasticity (model 1)
nl2 <- update(nl, un.nest.el = TRUE)

## a probit model
## Not run: 
pr <- mlogit(choice ~ wait + travel + vcost, TravelMode,
             shape = "long", chid.var = "individual", alt.var = "mode",
             probit = TRUE)

## End(Not run)

## a mixed logit model
## Not run: 
rpl <- mlogit(mode ~ price + catch | income, Fishing, varying = 2:9,
              shape = 'wide', rpar = c(price = 'n', catch = 'n'),
              correlation = TRUE, halton = NA,
              R = 10, tol = 10, print.level = 0)
summary(rpl)
rpar(rpl)
cor.mlogit(rpl)
cov.mlogit(rpl)
rpar(rpl, "catch")
summary(rpar(rpl, "catch"))

## End(Not run)

# a ranked ordered model
data("Game", package = "mlogit")
g <- mlogit(ch ~ own | hours, Game, choice = 'ch', varying = 1:12,
            ranked = TRUE, shape = "wide", reflevel = "PC")
```

mlogit documentation built on April 20, 2018, 5:03 p.m.