gmnl: Estimate Multinomial Logit Models with Observed and...

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/gmnl.R

Description

Estimate different types of multinomial logit models with observed and unobserved individual heterogneity, such as MIXL, S-MNL, G-MNL, LC and MM-MNL models. These models are estimated using Maximum Simulated Likelihood. It supports both cross-sectional and panel data.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
gmnl(
  formula,
  data,
  subset,
  weights,
  na.action,
  model = c("mnl", "mixl", "smnl", "gmnl", "lc", "mm"),
  start = NULL,
  ranp = NULL,
  R = 40,
  Q = 2,
  haltons = NA,
  mvar = NULL,
  seed = 12345,
  correlation = FALSE,
  bound.err = 2,
  panel = FALSE,
  hgamma = c("direct", "indirect"),
  reflevel = NULL,
  init.tau = 0.1,
  init.gamma = 0.1,
  notscale = NULL,
  print.init = FALSE,
  gradient = TRUE,
  typeR = TRUE,
  ...
)

## S3 method for class 'gmnl'
print(
  x,
  digits = max(3, getOption("digits") - 3),
  width = getOption("width"),
  ...
)

## S3 method for class 'gmnl'
summary(object, ...)

## S3 method for class 'summary.gmnl'
print(
  x,
  digits = max(3, getOption("digits") - 2),
  width = getOption("width"),
  ...
)

## S3 method for class 'gmnl'
update(object, new, ...)

## S3 method for class 'gmnl'
coef(object, ...)

## S3 method for class 'gmnl'
model.matrix(object, ...)

## S3 method for class 'gmnl'
residuals(object, outcome = TRUE, ...)

## S3 method for class 'gmnl'
df.residual(object, ...)

## S3 method for class 'gmnl'
fitted(object, outcome = TRUE, ...)

## S3 method for class 'gmnl'
logLik(object, ...)

## S3 method for class 'gmnl'
nObs(x, ...)

Arguments

formula

a symbolic description of the model to be estimated. The formula is divided in five parts, each of them separated by the symbol |. The first part is reserved for alternative-specific variables with a generic coefficient. The second part corresponds to individual-specific variables with an alternative specific coefficients. The third part corresponds to alternative-specific variables with an alternative-specific coefficident. The fourth part is reserved for time-invariant variables that modify the mean of the random parameters. Finally, the fifth part is reserved for time-invariant variables that enter in the scale coefficient or in the probability assignment in models with latent classes.

data

the data of class mlogit.data.

subset

an optional vector specifying a subset of observations.

weights

an optional vector of weights. Default to 1.

na.action

a function wich indicated what should happen when the data contains NA's.

model

a string indicating which model is estimated. The options are "mnl" for the Multinomial Logit Model, "mixl" for the Mixed Logit Model, "smnl" for the Scaled Multinomial Logit Model, "gmnl" for the Generalized Multinomial Logit Model, "lc" for the Latent Class Multinomial Logit Model, and "mm" for the Mixed-Mixed Multinomial Logit Model.

start

a vector of starting values.

ranp

a named vector whose names are the random parameters and values the distribution: "n" for normal, "ln" for log-normal, "cn" for truncated normal, "u" for uniform, "t" for triangular, "sb" for Sb Johnson.

R

the number of draws of pseudo-random numbers if ranp is not NULL.

Q

number of classes for LC or MM-MNL models.

haltons

only relevant if ranp is not NULL. If haltons = NULL, pseudo-random numbers are used instead of Halton sequences. If haltons=NA, the first K primes are used to generates the Halton draws, where K is the number of random parameters, and 15 of the initial sequence of elements are dropped. Otherwise, haltons should be a list with elements prime and drop.

mvar

only valid if ranp is not NULL. This is a named list, where the names correspond to the variables with random parameters, and the values correspond to the variables that enter in the mean of each random parameters.

seed

seed for the random number generator. Default is seed = 12345.

correlation

only relevant if ranp is not NULL. If true, the correlation across random parameters is taken into account.

bound.err

only relevenat if model is smnl or gmnl. It indicates at which values the draws for the scale parameter are truncated. By default bound.err = 2, therefore a truncated normal distribution with truncation at -2 and +2 is used.

panel

if TRUE a panel data model is estimated.

hgamma

a string indicated how to estimate the parameter gamma. If "direct", then γ is estimated directly, if "indirect" then γ ^* is estimated, where γ = \exp(γ^*)/(1 + \exp(γ^*)).

reflevel

the base alternative.

init.tau

initial value for the τ parameter.

init.gamma

initial value for γ.

notscale

only relevant if model is smnl or gmnl. It is a vector indicating which variables should not be scaled.

print.init

if TRUE, the initial values for the optimization procedure are printed.

gradient

if TRUE, analytical gradients are used for the optimization procedure.

typeR

if TRUE, truncated normal draws are used for the scale parameter, if FALSE the procedure suggested by Greene (2010) is used.

...

additional arguments to be passed to maxLik, which depend in the maximization routine.

x, object

and object of class gmnl.

digits

the number of digits.

width

width.

new

an updated formula for the update method.

outcome

if TRUE, then the fitted and residuals methods return a vector that corresponds to the chosen alternative, otherwise it returns a matrix where each column corresponds to each alternative.

Details

Let the utility to person i from choosing alternative j on choice occasion t be:

U_{ijt} = β_{i}x_{ijt} + ε_{ijt}

where ε_{ijt} is i.i.d extreme value, and β_i vary across individuals. Each model estimated by gmnl depends on how β_i is specified. The options are the following:

  1. S-MNL if β_i=σ_iβ, where the scale σ_i varies across individuals.

  2. MIXL if β_i=β + sη_i, where η_i is a draw from some distribution. For example, if β_i\sim N(β, s^2), then η_i\sim N(0, 1).

  3. GMNL if β_i=σ_iβ + γ sη_i + σ_i(1-γ)sη_i, where σ_i is the scale parameter, and γ is a parameter that controls how the variance of residual taste heterogeneity varies with scale.

  4. LC if β_i=β_q with probability w_{iq} for q = 1,...,Q, where Q is the total number of classes.

  5. MM-MIXL if β_i\sim f(β_q, Σ_q) with probability w_{iq} for q = 1,...,Q, where Q is the total number of classes.

Observed heterogeneity can be also accommodated in the random parameters when the MIXL is estimated by including individual-specific covariates. Specifically, the vector of random coefficients is

β_i=β +Π z_i + Lη_i

where z_i is a set of characteristics of individual i that influence the mean of the taste parameters; and Π is matrix of parameters. To estimate this model, the fourth part of the formula should be specified along with the mvar argument.

One can also allow the mean of the scale to differ across individuals by including individual-specific characteristics. Thus, the scale parameters can be written as

\exp(\bar{σ} + δ h_i + τ υ_i)

where h_i is a vector of attributes of individual i. To estimate this model, the fifth part of the formula should include the variables that enter h_i.

For models with latent classes, the class assignment is modeled as a semi-parametric multinomial logit format

w_{iq}= \frac{\exp(γ_q)}{∑_{q=1}^Q\exp(γ_q)}

for q = 1,...,Q, γ_1 = 0. Latent class models (LC and MM-MIXL) requires at least that a constant should be specified in the fifth part of the formula. If the class assignment, w_{iq}, is also determined by socio-economic characteristics, these variables can be also included in the fifth part.

Models that involve random parameters are estimated using Maximum Simulated Likelihood using the maxLik function of maxLik package.

Value

An object of class “gmnl” with the following elements

coefficients

the named vector of coefficients,

logLik

a set of values of the maximum likelihood procedure,

mf

the model framed used,

formula

the formula (a gFormula object),

time

proc.time() minus the start time,

freq

frequency of dependent variable,

draws

type of draws used,

model

the fitted model,

R

number of draws used,

ranp

vector indicating the variables with random parameters and their distribution,

residuals

the residuals,

correlation

whether the model is fitted assuming that the random parameters are correlated,

bi

matrix of conditional expectation of random parameters,

Q

number of classes,

call

the matched call.

Author(s)

Mauricio Sarrias msarrias86@gmail.com

References

See Also

mlogit, mlogit.data, maxLik, Rchoice

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
## Examples using the Fishing data set from the AER package
data("TravelMode", package = "AER")
library(mlogit)
TM <- mlogit.data(TravelMode, choice = "choice", shape = "long", 
                 alt.levels = c("air", "train", "bus", "car"), chid.var = "individual")
## Not run: 
## S-MNL model, ASCs not scaled
smnl <- gmnl(choice ~ wait + vcost + travel + gcost| 1, data = TM, 
             model = "smnl", R = 100, 
             notscale = c(1, 1, 1, rep(0, 4)))
summary(smnl)

## MIXL model with observed heterogeneity
mixl.hier <- gmnl(choice ~ vcost + gcost + travel + wait | 1 | 0 | income + size - 1,
                 data = TM,
                 model = "mixl",
                 ranp = c(travel = "t", wait = "n"),
                 mvar = list(travel = c("income","size"), wait = c("income")),
                 R = 30,
                 haltons = list("primes"= c(2, 17), "drop" = rep(19, 2)))
summary(mixl.hier)

## Examples using the Electricity data set from the mlogit package
data("Electricity", package = "mlogit")
Electr <- mlogit.data(Electricity, id.var = "id", choice = "choice",
                     varying = 3:26, shape = "wide", sep = "")
                     
## Estimate a MIXL model with correlated random parameters
Elec.cor <- gmnl(choice ~ pf + cl + loc + wk + tod + seas| 0, data = Electr,
                 subset = 1:3000,
                 model = 'mixl',
                 R = 10,
                 panel = TRUE,
                 ranp = c(cl = "n", loc = "n", wk = "n", tod = "n", seas = "n"),
                 correlation = TRUE)
summary(Elec.cor)
cov.gmnl(Elec.cor)
se.cov.gmnl(Elec.cor)
se.cov.gmnl(Elec.cor, sd = TRUE)
cor.gmnl(Elec.cor)

## Estimate a G-MNL model, where ASCs are also random
Electr$asc2 <- as.numeric(Electr$alt == 2)
Electr$asc3 <- as.numeric(Electr$alt == 3)
Electr$asc4 <- as.numeric(Electr$alt == 4)

Elec.gmnl <- gmnl(choice ~ pf + cl + loc + wk + tod + seas + asc2 + asc3 + asc4 | 0,
                 data = Electr,
                 subset = 1:3000,
                 model = 'gmnl',
                 R = 30,
                 panel = TRUE,
                 notscale = c(rep(0, 6), 1, 1, 1),
                 ranp = c(cl = "n", loc = "n", wk = "n", tod = "n", seas = "n",
                 asc2 = "n", asc3 = "n", asc4 = "n"))
summary(Elec.gmnl)

## Estimate a LC model with 2 classes
Elec.lc <- gmnl(choice ~ pf + cl + loc + wk + tod + seas| 0 | 0 | 0 | 1,
               data = Electr,
               subset = 1:3000,
               model = 'lc',
               panel = TRUE,
               Q = 2)
summary(Elec.lc)

## Estimate a MM-MIXL model
Elec.mm <- gmnl(choice ~ pf + cl + loc + wk + tod + seas| 0 | 0 | 0 | 1,
                 data = Electr,
                 subset = 1:3000,
                 model = 'mm',
                 R = 30,
                 panel = TRUE,
                 ranp = c(pf = "n", cl = "n", loc = "n", wk = "n", tod = "n",
                 seas = "n"),
                 Q = 2,
                 iterlim = 500)
summary(Elec.mm)

## End(Not run)

gmnl documentation built on July 1, 2020, 6:01 p.m.

Related to gmnl in gmnl...