Description Usage Arguments Details Value Author(s) References Examples
multinomRob
fits the overdispersed multinomial regression model
for grouped count data using the hyperbolic tangent (tanh) and least quartile
difference (LQD) robust estimators.
1 2 3 4 |
model |
The regression model specification. This is a list of formulas, with one formula for each category of outcomes for which counts have been measured for each observation. For example, in the following,
the outcome variables containing counts are The set of outcome alternatives may be specified to vary over observations, by putting in a negative value for alternatives that do not exist for particular observations. If the value of an outcome variable is negative for an observation, then that outcome is considered not available for that observation. The predicted counts for that observation are defined only for the available observations and are based on the linear predictors for the available observations. The same set of coefficient parameter values are used for all observations. Any observation for which fewer than two outcomes are available is omitted. Observations with missing data ( In a model that has the same regressors for every category, except for
one category for which there are no regressors in order to identify the
model (the reference category), the |
data |
The dataframe that contains all the variables referenced in the
|
starting.values |
Starting values for the regression coefficient parameters, as a vector.
The parameter ordering matches the ordering of the formulas in the
|
equality |
List of equality constraints. This is a list of lists of formulas. Each formula has the same format as in the model specification, and must include only a subset of the outcomes and regressors used in the model specification formulas. All the coefficients specified by the formulas in each list will be constrained to have the same value during estimation. For example, in the following,
the model to be estimated is
and the coefficients of x1 and x2 are constrained equal by
In the equality formulas it is necessary to say |
genoud.parms |
List of named arguments used to control the rgenoud optimizer, which is used to compute the LQD estimator. |
print.level |
Specify 0 for minimal printing, 1 to print more detailed information about LQD and other intermediate computations, 2 to print details about the tanh computations, or 3 to print details about starting values computations. |
iter |
|
maxiter |
The maximum number of iterations to be done between LQD and tanh estimation steps. |
multinom.t |
If the |
multinom.t.df |
|
MLEonly |
If |
The tanh estimator is a redescending
M-estimator, and the LQD estimator is a generalized S-estimator. The LQD
is used to estimate the scale of the overdispersion. Given that scale
estimate, the tanh estimator is used to estimate the coefficient parameters
of the linear predictors of the multinomial regression model.
If starting values are not supplied, they are computed using a multinomial multivariate-t model. The program also computes and reports nonrobust maximum likelihood estimates for the multinomial regression model, reporting sandwich estimates for the standard errors that are adjusted for a nonrobust estimate of the error dispersion.
multinomRob returns a list of 15 objects. The returned objects are:
coefficients |
The tanh coefficient estimates in matrix format. The matrix has one
column for each formula specified in the |
se |
The tanh coefficient estimate standard errors in matrix format. The
format and labelling used for the matrix is the same as is used for the
|
LQDsigma2 |
The LQD dispersion (variance) parameter estimate. This is the LQD estimate of the scale value, squared. |
TANHsigma2 |
The tanh dispersion parameter estimate. |
weights |
The matrix of tanh weights for the orthogonalized residuals. The matrix
has one row for each observation in the data and as many columns as
there are formulas specified in the If an observation has negative values specified for some outcome variables,
indicating that those outcome alternatives are not available for that
observation, then values of |
Hdiag |
Weights used to fully studentize the orthogonalized residuals. The matrix
has one row for each observation in the data and as many columns as
there are formulas specified in the If an observation has negative values specified for some outcome variables, indicating that those outcome alternatives are not available for that observation, then values of 0 appear in the weights matrix for that observation, as many 0 values as there are unavailable alternatives. Values of 0 that are created for this reason will be the last values in the affected row of the weights matrix, regardless of which outcome alternatives were unavailable for the observation. |
prob |
The matrix of predicted probabilities for each category for each observation based on the tanh coefficient estimates. |
residuals.rotate |
Matrix of studentized residuals which have been made comparable by rotating each choice category to the first position. These residuals, unlike the student and standard residuals below, are no longer orthogonalized because of the rotation. These are the residuals displayed in Table 6 of the reference article. |
residuals.student |
Matrix of fully studentized orthogonalized residuals. |
residuals.standard |
Matrix of orthogonalized residuals, standardized by dividing by the overdispersion scale. |
mnl |
List of nonrobust maximum likelihood estimation results from function
|
multinomT |
List of multinomial multivariate-t estimation results from function
|
genoud |
List of LQD estimation results obtained by rgenoud optimization, from
function |
mtanh |
List of tanh estimation results from function |
error |
Exit error code, usually from function |
iter |
Number of LQD-tanh iterations. |
Walter R. Mebane, Jr., University of Michigan,
wmebane@umich.edu, http://www-personal.umich.edu/~wmebane
Jasjeet S. Sekhon, UC Berkeley, sekhon@berkeley.edu, http://sekhon.berkeley.edu/
Walter R. Mebane, Jr. and Jasjeet Singh Sekhon. 2004. “Robust Estimation and Outlier Detection for Overdispersed Multinomial Models of Count Data.” American Journal of Political Science 48 (April): 391–410. http://sekhon.berkeley.edu/multinom.pdf
For additional documentation please visit http://sekhon.berkeley.edu/robust/.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | # make some multinomial data
x1 <- rnorm(50);
x2 <- rnorm(50);
p1 <- exp(x1)/(1+exp(x1)+exp(x2));
p2 <- exp(x2)/(1+exp(x1)+exp(x2));
p3 <- 1 - (p1 + p2);
y <- matrix(0, 50, 3);
for (i in 1:50) {
y[i,] <- rmultinomial(1000, c(p1[i], p2[i], p3[i]));
}
# perturb the first 5 observations
y[1:5,c(1,2,3)] <- y[1:5,c(3,1,2)];
y1 <- y[,1];
y2 <- y[,2];
y3 <- y[,3];
# put data into a dataframe
dtf <- data.frame(x1, x2, y1, y2, y3);
## Set parameters for Genoud
## Not run:
## For production, use these kinds of parameters
zz.genoud.parms <- list( pop.size = 1000,
wait.generations = 10,
max.generations = 100,
scale.domains = 5,
print.level = 0
)
## End(Not run)
## For testing, we are setting the parmeters to run quickly. Don't use these for production
zz.genoud.parms <- list( pop.size = 10,
wait.generations = 1,
max.generations = 1,
scale.domains = 5,
print.level = 0
)
# estimate a model, with "y3" being the reference category
# true coefficient values are: (Intercept) = 0, x = 1
# impose an equality constraint
# equality constraint: coefficients of x1 and x2 are equal
mulrobE <- multinomRob(list(y1 ~ x1, y2 ~ x2, y3 ~ 0),
dtf,
equality = list(list(y1 ~ x1 + 0, y2 ~ x2 + 0)),
genoud.parms = zz.genoud.parms,
print.level = 3, iter=FALSE);
summary(mulrobE, weights=TRUE);
#Do only MLE estimation. The following model is NOT identified if we
#try to estimate the overdispersed MNL.
dtf <- data.frame(y1=c(1,1),y2=c(2,1),y3=c(1,2),x=c(0,1))
summary(multinomRob(list(y1 ~ 0, y2 ~ x, y3 ~ x), data=dtf, MLEonly=TRUE))
|
Loading required package: rgenoud
## rgenoud (Version 5.8-3.0, Build Date: 2019-01-22)
## See http://sekhon.berkeley.edu/rgenoud for additional documentation.
## Please cite software as:
## Walter Mebane, Jr. and Jasjeet S. Sekhon. 2011.
## ``Genetic Optimization Using Derivatives: The rgenoud package for R.''
## Journal of Statistical Software, 42(11): 1-26.
##
Loading required package: MASS
Loading required package: mvtnorm
##
## multinomRob (Version 1.8-6.1, Build Date: 2013/02/15)
## See http://sekhon.berkeley.edu/robust for additional documentation.
## Please cite as: Walter R. Mebane, Jr. and Jasjeet S. Sekhon. "Robust Estimation
## and Outlier Detection for Overdispersed Multinomial Models of Count Data".
## American Journal of Political Science, 48 (April): 391-410. 2004.
##
Equality constraints among parameters (after consolidation):
Equality constrained set 1
outcome y1 regressor x1
outcome y2 regressor x2
Your Model (xvec):
y1 y2 y3
(Intercept)/(Intercept)/NA 1 1 0
x1/x2/NA 2 2 0
multinomRob(): Grouped MNL Estimation
[1] "multinomMLE: hessian determinant: 973275091573.95"
[1] "multinomMLE: OPG determinant: 9175113261183712"
MNL LQD Fit: 2.903442
MNL Estimates:
y1 y2 y3
(Intercept)/(Intercept)/NA -0.07429061 -0.1130331 0
x1/x2/NA 0.86051068 0.8605107 0
MNL SEs:
y1 y2 y3
(Intercept)/(Intercept)/NA 0.011917062 0.011556630 0
x1/x2/NA 0.008477768 0.008477768 0
multinomRob(): Calculating multinomial-t starting values.
Multinom-T LQD Fit (step 0 ): 1.238336
Multinom-T Beta Estimates (step 0 ):
y1 y2 y3
(Intercept)/(Intercept)/NA -0.01209041 0.000718553 0
x1/x2/NA 1.00238612 1.002386121 0
Multinom-T Beta SEs (step 0 ):
y1 y2 y3
(Intercept)/(Intercept)/NA 0.012473124 0.012371458 0
x1/x2/NA 0.009725931 0.009725931 0
Multinom-T Omega Estimates (step 0 ):
y1 y2
y1 0.004353550 0.001643915
y2 0.001643915 0.004403493
Multinom-T DF (step 0 ): 1.270218
multinomRob(): Using multinomial-t estimates as starting values.
Multinom-T LQD Fit: 1.238336
Multinom-T Estimates:
y1 y2 y3
(Intercept)/(Intercept)/NA -0.01209041 0.000718553 0
x1/x2/NA 1.00238612 1.002386121 0
Multinom-T DF: 1.270218
multinomRob(): Starting Values
y1 y2 y3
(Intercept)/(Intercept)/NA -0.01209041 0.000718553 0
x1/x2/NA 1.00238612 1.002386121 0
multinomRob(): starting fit = 1.238336
LQD Results:
y1 y2 y3
(Intercept)/(Intercept)/NA -0.01209041 0.000718553 0
x1/x2/NA 1.00238612 1.002386121 0
LQD sigma: 1.238336
(multinomTanh):
[1] "mGNtanh: number of Newton iterations 3"
[1] "mGNtanh: number of Newton iterations 1"
[1] "mGNtanh: hessian determinant: 589955666969.14"
[1] "mGNtanh: OPG determinant: 326208933849.713"
[1] "mGNtanh: tanh sigma^2: 0.884399205724957"
Tanh Estimates
y1 y2 y3
(Intercept)/(Intercept)/NA -0.01092858 -0.001112076 0
x1/x2/NA 0.99845470 0.998454702 0
Tanh Sandwich SEs
y1 y2 y3
(Intercept)/(Intercept)/NA 0.010971825 0.011502659 0
x1/x2/NA 0.008788404 0.008788404 0
TANH sigma: 0.940425
Warning messages:
1: In fn(par, ...) : value out of range in 'lgamma'
2: In fn(par, ...) : value out of range in 'lgamma'
3: In fn(par, ...) : value out of range in 'lgamma'
Choice 1 : y1 Estimates and SE:
Est SE.Sand t.val.Sand
(Intercept) -0.0109 0.01100 -0.996
x1 0.9980 0.00879 114.000
Choice 2 : y2 Estimates and SE:
Est SE.Sand t.val.Sand
(Intercept) -0.00111 0.01150 -0.0967
x2 0.99800 0.00879 114.0000
Choice 3 : y3 Estimates and SE:
Est SE.Sand t.val.Sand
NA 0 0 NaN
NA 0 0 NaN
LQD sigma: 1.238336
TANH sigma: 0.940425
Number of Observations: 50
Number of observations with at least one zero weight: 5
Number of zero weights: 9
TANH: weights
name weights:y1 weights:y2
1 1 0.000000 0.000000
2 2 0.000000 0.899619
3 3 0.000000 0.000000
4 4 0.000000 0.000000
5 5 0.000000 0.000000
6 6 1.000000 1.000000
7 7 1.000000 1.000000
8 8 1.000000 1.000000
9 9 1.000000 1.000000
10 10 1.000000 1.000000
11 11 1.000000 1.000000
12 12 1.000000 1.000000
13 13 1.000000 1.000000
14 14 1.000000 1.000000
15 15 1.000000 1.000000
16 16 1.000000 1.000000
17 17 1.000000 1.000000
18 18 1.000000 1.000000
19 19 1.000000 1.000000
20 20 1.000000 1.000000
21 21 1.000000 1.000000
22 22 1.000000 1.000000
23 23 1.000000 1.000000
24 24 1.000000 1.000000
25 25 1.000000 1.000000
26 26 1.000000 1.000000
27 27 1.000000 1.000000
28 28 1.000000 1.000000
29 29 1.000000 1.000000
30 30 1.000000 0.590792
31 31 1.000000 1.000000
32 32 1.000000 1.000000
33 33 1.000000 1.000000
34 34 1.000000 1.000000
35 35 1.000000 1.000000
36 36 1.000000 1.000000
37 37 1.000000 1.000000
38 38 1.000000 1.000000
39 39 1.000000 1.000000
40 40 1.000000 1.000000
41 41 1.000000 1.000000
42 42 1.000000 1.000000
43 43 0.936756 1.000000
44 44 1.000000 1.000000
45 45 1.000000 1.000000
46 46 1.000000 1.000000
47 47 1.000000 1.000000
48 48 1.000000 1.000000
49 49 1.000000 1.000000
50 50 1.000000 1.000000
Choice 1 : y1 Estimates and SE:
Est SE.Sand t.val.Sand
NA 0 0 NaN
NA 0 0 NaN
Choice 2 : y2 Estimates and SE:
Est SE.Sand t.val.Sand
(Intercept) 0.693 1.22 0.566
x -0.693 1.87 -0.371
Choice 3 : y3 Estimates and SE:
Est SE.Sand t.val.Sand
(Intercept) 2.60e-16 1.41 1.84e-16
x 6.93e-01 1.87 3.71e-01
Residual Deviance: 16.63553
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.