Ordered Logistic or Probit Regression
Description
Fits a logistic or probit regression model to an ordered factor response. The default logistic case is proportional odds logistic regression, after which the function is named.
Usage
1 2 3 
Arguments
formula 
a formula expression as for regression models, of the form

data 
an optional data frame in which to interpret the variables occurring
in 
weights 
optional case weights in fitting. Default to 1. 
start 
initial values for the parameters. This is in the format

... 
additional arguments to be passed to 
subset 
expression saying which subset of the rows of the data should be used in the fit. All observations are included by default. 
na.action 
a function to filter missing data. 
contrasts 
a list of contrasts to be used for some or all of the factors appearing as variables in the model formula. 
Hess 
logical for whether the Hessian (the observed information matrix)
should be returned. Use this if you intend to call 
model 
logical for whether the model matrix should be returned. 
method 
logistic or probit or (complementary) loglog or cauchit (corresponding to a Cauchy latent variable). 
Details
This model is what Agresti (2002) calls a cumulative link model. The basic interpretation is as a coarsened version of a latent variable Y_i which has a logistic or normal or extremevalue or Cauchy distribution with scale parameter one and a linear model for the mean. The ordered factor which is observed is which bin Y_i falls into with breakpoints
zeta_0 = Inf < zeta_1 < … < zeta_K = Inf
This leads to the model
logit P(Y <= k  x) = zeta_k  eta
with logit replaced by probit for a normal latent
variable, and eta being the linear predictor, a linear
function of the explanatory variables (with no intercept). Note
that it is quite common for other software to use the opposite sign
for eta (and hence the coefficients beta
).
In the logistic case, the lefthand side of the last display is the log odds of category k or less, and since these are log odds which differ only by a constant for different k, the odds are proportional. Hence the term proportional odds logistic regression.
The loglog and complementary loglog links are the increasing functions F^1(p) = log(log(p)) and F^1(p) = log(log(1p)); some call the first the ‘negative loglog’ link. These correspond to a latent variable with the extremevalue distribution for the maximum and minimum respectively.
A proportional hazards model for grouped survival times can be obtained by using the complementary loglog link with grouping ordered by increasing times.
predict
, summary
, vcov
,
anova
, model.frame
and an
extractAIC
method for use with stepAIC
(and
step
). There are also profile
and
confint
methods.
Value
A object of class "polr"
. This has components
coefficients 
the coefficients of the linear predictor, which has no intercept. 
zeta 
the intercepts for the class boundaries. 
deviance 
the residual deviance. 
fitted.values 
a matrix, with a column for each level of the response. 
lev 
the names of the response levels. 
terms 
the 
df.residual 
the number of residual degrees of freedoms, calculated using the weights. 
edf 
the (effective) number of degrees of freedom used by the model 
n, nobs 
the (effective) number of observations, calculated using the
weights. ( 
call 
the matched call. 
method 
the matched method used. 
convergence 
the convergence code returned by 
niter 
the number of function and gradient evaluations used by

lp 
the linear predictor (including any offset). 
Hessian 
(if 
model 
(if 
Note
The vcov
method uses the approximate Hessian: for
reliable results the model matrix should be sensibly scaled with all
columns having range the order of one.
Prior to version 7.332, method = "cloglog"
confusingly gave
the loglog link, implicitly assuming the first response level was the
‘best’.
References
Agresti, A. (2002) Categorical Data. Second edition. Wiley.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
See Also
optim
, glm
, multinom
.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21  options(contrasts = c("contr.treatment", "contr.poly"))
house.plr < polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
house.plr
summary(house.plr, digits = 3)
## slightly worse fit from
summary(update(house.plr, method = "probit", Hess = TRUE), digits = 3)
## although it is not really appropriate, can fit
summary(update(house.plr, method = "loglog", Hess = TRUE), digits = 3)
summary(update(house.plr, method = "cloglog", Hess = TRUE), digits = 3)
predict(house.plr, housing, type = "p")
addterm(house.plr, ~.^2, test = "Chisq")
house.plr2 < stepAIC(house.plr, ~.^2)
house.plr2$anova
anova(house.plr, house.plr2)
house.plr < update(house.plr, Hess=TRUE)
pr < profile(house.plr)
confint(pr)
plot(pr)
pairs(pr)
