View source: R/family.categorical.R

multinomial | R Documentation |

Fits a multinomial logit model (MLM) to a (preferably unordered) factor response.

```
multinomial(zero = NULL, parallel = FALSE, nointercept = NULL,
refLevel = "(Last)", imethod = 1, imu = NULL,
byrow.arg = FALSE, whitespace = FALSE)
```

`zero` |
Can be an integer-valued vector specifying which
linear/additive predictors are modelled as intercepts only.
Any values must be from the set {1,2,..., |

`parallel` |
A logical, or formula specifying which terms have equal/unequal coefficients. |

`nointercept, whitespace` |
See |

`imu, byrow.arg` |
See |

`refLevel` |
Either a (1) single positive integer or (2) a value of
the factor or (3) a character string.
If inputted as an integer then it specifies which
column of the response matrix is the reference or baseline level.
The default is the |

`imethod` |
Choosing 2 will use the mean sample proportions of each
column of the response matrix, which corresponds to
the MLEs for intercept-only models.
See |

In this help file the response `Y`

is assumed to be
a factor with unordered values `1,2,\dots,M+1`

, so
that `M`

is the number of linear/additive predictors
`\eta_j`

.

The default model can be written

`\eta_j = \log(P[Y=j]/ P[Y=M+1])`

where `\eta_j`

is the `j`

th
linear/additive predictor.
Here, `j=1,\ldots,M`

, and `\eta_{M+1}`

is 0 by definition. That is, the last level of the factor,
or last column of the response matrix, is taken as the
reference level or baseline—this is for identifiability
of the parameters. The reference or baseline level can
be changed with the `refLevel`

argument.

In almost all the literature, the constraint matrices
associated with this family of models are known.
For example, setting `parallel = TRUE`

will make all
constraint matrices
(including the intercept)
equal to a vector of `M`

1's;
to suppress the intercepts from being parallel then
set `parallel = FALSE ~ 1`

.
If the constraint matrices are
unknown and to be estimated, then this can be achieved by
fitting the model as a reduced-rank vector generalized
linear model (RR-VGLM; see `rrvglm`

).
In particular, a multinomial logit model with unknown
constraint matrices is known as a *stereotype*
model (Anderson, 1984), and can be fitted with
`rrvglm`

.

The above details correspond to the ordinary MLM where
all the levels are *altered* (in the terminology
of GAITD regression).

An object of class `"vglmff"`

(see `vglmff-class`

).
The object is used by modelling functions
such as `vglm`

,
`rrvglm`

and `vgam`

.

No check is made to verify that the response is nominal.

See `CommonVGAMffArguments`

for more warnings.

The response should be either a matrix of counts
(with row sums that are all positive), or a
factor. In both cases, the `y`

slot returned by
`vglm`

/`vgam`

/`rrvglm`

is the matrix of sample proportions.

The multinomial logit model is more appropriate for a nominal
(unordered) factor response than for an
ordinal (ordered) factor
response.
Models more suited for the latter include those based on
cumulative probabilities, e.g., `cumulative`

.

`multinomial`

is prone to numerical difficulties if
the groups are separable and/or the fitted probabilities
are close to 0 or 1. The fitted values returned
are estimates of the probabilities `P[Y=j]`

for
`j=1,\ldots,M+1`

. See safeBinaryRegression
for the logistic regression case.

Here is an example of the usage of the `parallel`

argument. If there are covariates `x2`

, `x3`

and `x4`

, then ```
parallel = TRUE ~ x2 + x3 -
1
```

and `parallel = FALSE ~ x4`

are equivalent. This
would constrain the regression coefficients for `x2`

and `x3`

to be equal; those of the intercepts and
`x4`

would be different.

In Example 4 below, a conditional logit model is
fitted to an artificial data set that explores how
cost and travel time affect people's decision about
how to travel to work. Walking is the baseline group.
The variable `Cost.car`

is the difference between
the cost of travel to work by car and walking, etc. The
variable `Time.car`

is the difference between
the travel duration/time to work by car and walking,
etc. For other details about the `xij`

argument see
`vglm.control`

and `fill1`

.

The `multinom`

function in the
nnet package uses the first level of the factor as
baseline, whereas the last level of the factor is used
here. Consequently the estimated regression coefficients
differ.

Thomas W. Yee

Agresti, A. (2013).
*Categorical Data Analysis*,
3rd ed. Hoboken, NJ, USA: Wiley.

Anderson, J. A. (1984).
Regression and ordered categorical variables.
*Journal of the Royal Statistical Society, Series B,
Methodological*,
**46**, 1–30.

Hastie, T. J., Tibshirani, R. J. and Friedman, J. H. (2009).
*The Elements of Statistical Learning: Data Mining,
Inference and Prediction*,
2nd ed.
New York, USA: Springer-Verlag.

McCullagh, P. and Nelder, J. A. (1989).
*Generalized Linear Models*, 2nd ed.
London: Chapman & Hall.

Tutz, G. (2012).
*Regression for Categorical Data*,
Cambridge: Cambridge University Press.

Yee, T. W. and Hastie, T. J. (2003).
Reduced-rank vector generalized linear models.
*Statistical Modelling*,
**3**, 15–41.

Yee, T. W. (2010).
The VGAM package for categorical data analysis.
*Journal of Statistical Software*,
**32**, 1–34.
\Sexpr[results=rd]{tools:::Rd_expr_doi("10.18637/jss.v032.i10")}.

Yee, T. W. and Ma, C. (2023)
Generally altered, inflated, truncated and deflated regression.
*In preparation*.

`multilogitlink`

,
`margeff`

,
`cumulative`

,
`acat`

,
`cratio`

,
`sratio`

,
`dirichlet`

,
`dirmultinomial`

,
`rrvglm`

,
`fill1`

,
`Multinomial`

,
`gaitdpoisson`

,
`Gaitdpois`

,
`iris`

.

```
# Example 1: fit a MLM to Edgar Anderson's iris data
data(iris)
## Not run: fit <- vglm(Species ~ ., multinomial, iris)
coef(fit, matrix = TRUE)
## End(Not run)
# Example 2a: a simple example
ycounts <- t(rmultinom(10, size = 20, prob = c(0.1, 0.2, 0.8)))
fit <- vglm(ycounts ~ 1, multinomial)
head(fitted(fit)) # Proportions
fit@prior.weights # NOT recommended for the prior weights
weights(fit, type = "prior", matrix = FALSE) # The better method
depvar(fit) # Sample proportions; same as fit@y
constraints(fit) # Constraint matrices
# Example 2b: Different reference level used as the baseline
fit2 <- vglm(ycounts ~ 1, multinomial(refLevel = 2))
coef(fit2, matrix = TRUE)
coef(fit , matrix = TRUE) # Easy to reconcile this output with fit2
# Example 3: The response is a factor.
nn <- 10
dframe3 <- data.frame(yfac = gl(3, nn, labels = c("Ctrl",
"Trt1", "Trt2")),
x2 = runif(3 * nn))
myrefLevel <- with(dframe3, yfac[12])
fit3a <- vglm(yfac ~ x2, multinomial(refLevel = myrefLevel), dframe3)
fit3b <- vglm(yfac ~ x2, multinomial(refLevel = 2), dframe3)
coef(fit3a, matrix = TRUE) # "Trt1" is the reference level
coef(fit3b, matrix = TRUE) # "Trt1" is the reference level
margeff(fit3b)
# Example 4: Fit a rank-1 stereotype model
fit4 <- rrvglm(Country ~ Width + Height + HP, multinomial, car.all)
coef(fit4) # Contains the C matrix
constraints(fit4)$HP # The A matrix
coef(fit4, matrix = TRUE) # The B matrix
Coef(fit4)@C # The C matrix
concoef(fit4) # Better to get the C matrix this way
Coef(fit4)@A # The A matrix
svd(coef(fit4, matrix = TRUE)[-1, ])$d # Has rank 1; = C %*% t(A)
# Classification (but watch out for NAs in some of the variables):
apply(fitted(fit4), 1, which.max) # Classification
# Classification:
colnames(fitted(fit4))[apply(fitted(fit4), 1, which.max)]
apply(predict(fit4, car.all, type = "response"),
1, which.max) # Ditto
# Example 5: Using the xij argument (aka conditional logit model)
set.seed(111)
nn <- 100 # Number of people who travel to work
M <- 3 # There are M+1 models of transport to go to work
ycounts <- matrix(0, nn, M+1)
ycounts[cbind(1:nn, sample(x = M+1, size = nn, replace = TRUE))] = 1
dimnames(ycounts) <- list(NULL, c("bus","train","car","walk"))
gotowork <- data.frame(cost.bus = runif(nn), time.bus = runif(nn),
cost.train= runif(nn), time.train= runif(nn),
cost.car = runif(nn), time.car = runif(nn),
cost.walk = runif(nn), time.walk = runif(nn))
gotowork <- round(gotowork, digits = 2) # For convenience
gotowork <- transform(gotowork,
Cost.bus = cost.bus - cost.walk,
Cost.car = cost.car - cost.walk,
Cost.train = cost.train - cost.walk,
Cost = cost.train - cost.walk, # for labelling
Time.bus = time.bus - time.walk,
Time.car = time.car - time.walk,
Time.train = time.train - time.walk,
Time = time.train - time.walk) # for labelling
fit <- vglm(ycounts ~ Cost + Time,
multinomial(parall = TRUE ~ Cost + Time - 1),
xij = list(Cost ~ Cost.bus + Cost.train + Cost.car,
Time ~ Time.bus + Time.train + Time.car),
form2 = ~ Cost + Cost.bus + Cost.train + Cost.car +
Time + Time.bus + Time.train + Time.car,
data = gotowork, trace = TRUE)
head(model.matrix(fit, type = "lm")) # LM model matrix
head(model.matrix(fit, type = "vlm")) # Big VLM model matrix
coef(fit)
coef(fit, matrix = TRUE)
constraints(fit)
summary(fit)
max(abs(predict(fit) - predict(fit, new = gotowork))) # Should be 0
```

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.