# clm: Cumulative Link Models (ordinal: Regression Models for Ordinal Data)

## Description

Fits cumulative link models (CLMs) such as the proportional odds model. The model allows for various link functions and for structured thresholds that restrict the thresholds or cut-points to be, e.g., equidistant or symmetrically arranged around the central threshold(s). Nominal effects (partial proportional odds with the logit link) are also allowed. A modified Newton algorithm is used to optimize the likelihood function.
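To make the model structure concrete, the following base R sketch computes the category probabilities implied by a cumulative logit model for one observation. It assumes the common parameterization in which the cumulative probability of category `j` is the logistic function of `theta_j - x * beta`. The values of `theta`, `beta` and `x` are hypothetical and chosen only for illustration; they are not estimates from any data set.

```r
## Hypothetical values for illustration only (not estimates from real data)
theta <- c(-1.5, 0, 1.2, 2.5)  # thresholds for a response with 5 ordered levels
beta  <- 0.8                   # a single regression coefficient
x     <- 1                     # predictor value for one observation

## Cumulative probabilities P(Y <= j) under the logit link,
## assuming the linear predictor enters as theta_j - x * beta
cum_prob <- plogis(theta - x * beta)

## Category probabilities are successive differences of c(0, cum_prob, 1)
cat_prob <- diff(c(0, cum_prob, 1))

cat_prob
sum(cat_prob)  # the category probabilities sum to one
```

Replacing `plogis` with `pnorm` gives the corresponding sketch for a probit link.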

## Usage

```r
clm(formula, scale, nominal, data, weights, start, subset, doFit = TRUE,
    na.action, contrasts, model = TRUE, control = list(),
    link = c("logit", "probit", "cloglog", "loglog", "cauchit",
             "Aranda-Ordaz", "log-gamma"),
    threshold = c("flexible", "symmetric", "symmetric2", "equidistant"), ...)
```

## Arguments

`formula` a formula expression as for regression models, of the form `response ~ predictors`. The response should be a factor (preferably an ordered factor), which will be interpreted as an ordinal response with levels ordered as in the factor. The model must have an intercept: attempts to remove one lead to a warning and are ignored. An offset may be used. See the documentation of `formula` for other details.

`scale` an optional formula expression, of the form `~ predictors`, i.e. with an empty left hand side. An offset may be used. Variables included here will have multiplicative effects and can be interpreted as effects on the scale (or dispersion) of a latent distribution.

`nominal` an optional formula of the form `~ predictors`, i.e. with an empty left hand side. The effects of the predictors in this formula are assumed to be nominal rather than ordinal; this corresponds to the so-called partial proportional odds (with the logit link).

`data` an optional data frame in which to interpret the variables occurring in the formulas.

`weights` optional case weights in fitting. Defaults to 1. Negative weights are not allowed.

`start` initial values for the parameters in the format `c(alpha, beta, zeta)`, where `alpha` are the threshold parameters (adjusted for potential nominal effects), `beta` are the regression parameters and `zeta` are the scale parameters.

`subset` expression saying which subset of the rows of the data should be used in the fit. All observations are included by default.

`doFit` logical for whether the model should be fitted or the model environment should be returned.

`na.action` a function to filter missing data. Applies to terms in all three formulae.

`contrasts` a list of contrasts to be used for some or all of the factors appearing as variables in the model formula.

`model` logical for whether the model frame should be part of the returned object.

`control` a list of control parameters passed on to `clm.control`.

`link` link function, i.e., the type of location-scale distribution assumed for the latent distribution. The default `"logit"` link gives the proportional odds model.

`threshold` specifies a potential structure for the thresholds (cut-points). `"flexible"` provides the standard unstructured thresholds. `"symmetric"` restricts the distance between the thresholds to be symmetric around the central one or two thresholds for odd or even numbers of thresholds respectively. `"symmetric2"` restricts the latent mean in the reference group to zero; this means that the central threshold (even number of response levels) is zero or that the two central thresholds are equal apart from their sign (uneven number of response levels). `"equidistant"` restricts the distance between consecutive thresholds to be of the same size.

`...` additional arguments are passed on to `clm.control`.
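As a sketch of how the `scale` and `nominal` arguments fit into a call, the example below fits three models to the `wine` data that ships with the package and compares them by likelihood ratio tests. The object names (`fm_loc`, `fm_sca`, `fm_nom`) are illustrative, and the choice of which predictor goes into which formula is an assumption made for demonstration rather than a modelling recommendation.

```r
library(ordinal)  # provides clm() and the wine data

## Location-only model: temp and contact shift the latent mean
fm_loc <- clm(rating ~ temp + contact, data = wine)

## Scale effects: the latent dispersion is allowed to differ between temp levels
fm_sca <- clm(rating ~ temp + contact, scale = ~ temp, data = wine)

## Nominal effects: contact gets a separate effect at each threshold
## (partial proportional odds under the default logit link)
fm_nom <- clm(rating ~ temp, nominal = ~ contact, data = wine)

## Likelihood ratio tests against the location-only model
anova(fm_loc, fm_sca)
anova(fm_loc, fm_nom)
```

A predictor should appear in either `formula` or `nominal`, not both, since the nominal effects already describe its effect at every threshold.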

## Details

This is a new (as of August 2011) improved implementation of CLMs. The old implementation is available in `clm2`, but will probably be removed at some point.

There are methods for the standard model-fitting functions, including `summary`, `anova`, `model.frame`, `model.matrix`, `drop1`, `dropterm`, `step`, `stepAIC`, `extractAIC`, `AIC`, `coef`, `nobs`, `profile`, `confint`, `vcov` and `slice`.

## Value

If `doFit = FALSE` the result is an environment representing the model ready to be optimized. If `doFit = TRUE` the result is an object of class `"clm"` with the components listed below.

Note that some components are only present if `scale` or `nominal` is used.

`aliased` list of length 3 or less with components `alpha`, `beta` and `zeta`, each being a logical vector containing alias information for the parameters of the same name.

`alpha` a vector of threshold parameters.

`alpha.mat` (where relevant) a table (`data.frame`) of threshold parameters where each row corresponds to an effect in the `nominal` formula.

`beta` (where relevant) a vector of regression parameters.

`call` the matched call.

`coefficients` a vector of coefficients of the form `c(alpha, beta, zeta)`.

`cond.H` condition number of the Hessian matrix at the optimum (i.e. the ratio of the largest to the smallest eigenvalue).

`contrasts` (where relevant) the contrasts used for the `formula` part of the model.

`control` list of control parameters as generated by `clm.control`.

`convergence` convergence code where 0 indicates successful convergence and negative values indicate convergence failure; 1 indicates successful convergence to a non-unique optimum.

`edf` the estimated degrees of freedom, i.e., the number of parameters in the model fit.

`fitted.values` the fitted probabilities.

`gradient` a vector of gradients for the coefficients at the estimated optimum.

`Hessian` the Hessian matrix for the parameters at the estimated optimum.

`info` a table of basic model information for printing.

`link` character, the link function used.

`logLik` the value of the log-likelihood at the estimated optimum.

`maxGradient` the maximum absolute gradient, i.e., `max(abs(gradient))`.

`model` if requested (the default), the `model.frame` containing variables from the `formula`, `scale` and `nominal` parts.

`n` the number of observations counted as `nrow(X)`, where `X` is the design matrix.

`na.action` (where relevant) information returned by `model.frame` on the special handling of `NA`s.

`nobs` the number of observations counted as `sum(weights)`.

`nom.contrasts` (where relevant) the contrasts used for the `nominal` part of the model.

`nom.terms` (where relevant) the terms object for the `nominal` part.

`nom.xlevels` (where relevant) a record of the levels of the factors used in fitting for the `nominal` part.

`start` the parameter values at which the optimization has started. An attribute `start.iter` gives the number of iterations to obtain starting values for models where `scale` is specified or where the `cauchit` link is chosen.

`S.contrasts` (where relevant) the contrasts used for the `scale` part of the model.

`S.terms` (where relevant) the terms object for the `scale` part.

`S.xlevels` (where relevant) a record of the levels of the factors used in fitting for the `scale` part.

`terms` the terms object for the `formula` part.

`Theta` (where relevant) a table (`data.frame`) of thresholds for all combinations of levels of factors in the `nominal` formula.

`threshold` character, the threshold structure used.

`tJac` the transpose of the Jacobian for the threshold structure.

`xlevels` (where relevant) a record of the levels of the factors used in fitting for the `formula` part.

`y.levels` the levels of the response variable after removing levels for which all weights are zero.

`zeta` (where relevant) a vector of scale regression parameters.
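The components described above can be inspected directly on a fitted object. The following is a minimal sketch, assuming the `wine` data from the package; the object names `env` and `fm` are illustrative. It shows the two return types controlled by `doFit` and a few convergence diagnostics.

```r
library(ordinal)  # provides clm() and the wine data

## With doFit = FALSE the unfitted model environment is returned
env <- clm(rating ~ temp + contact, data = wine, doFit = FALSE)
is.environment(env)

## With the default doFit = TRUE an object of class "clm" is returned
fm <- clm(rating ~ temp + contact, data = wine)
class(fm)

## Convergence diagnostics stored on the fit
fm$convergence   # 0 indicates successful convergence
fm$cond.H        # condition number of the Hessian
fm$maxGradient   # documented as max(abs(gradient))
all.equal(fm$maxGradient, max(abs(fm$gradient)))

## Parameter estimates, thresholds first and regression parameters after
fm$coefficients
```

A very large `cond.H` or a `maxGradient` far from zero suggests that the estimates should be interpreted with care.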

## Author(s)

Rune Haubo B Christensen

## Examples

```r
library(ordinal) ## provides clm() and the wine and soup data

fm1 <- clm(rating ~ temp * contact, data = wine)
fm1 ## print method
summary(fm1)
fm2 <- update(fm1, ~.-temp:contact)
anova(fm1, fm2)

drop1(fm1, test = "Chi")
add1(fm1, ~.+judge, test = "Chi")

fm2 <- step(fm1)
summary(fm2)

coef(fm1)
vcov(fm1)
AIC(fm1)
extractAIC(fm1)
logLik(fm1)
fitted(fm1)

confint(fm1) ## type = "profile"
confint(fm1, type = "Wald")
pr1 <- profile(fm1)
confint(pr1)

## plotting the profiles:
par(mfrow = c(2, 2))
plot(pr1, root = TRUE) ## check for linearity
par(mfrow = c(2, 2))
plot(pr1)
par(mfrow = c(2, 2))
plot(pr1, approx = TRUE)
par(mfrow = c(2, 2))
plot(pr1, Log = TRUE)
par(mfrow = c(2, 2))
plot(pr1, Log = TRUE, relative = FALSE)

## other link functions:
fm4.lgt <- update(fm1, link = "logit") ## default
fm4.prt <- update(fm1, link = "probit")
fm4.ll <- update(fm1, link = "loglog")
fm4.cll <- update(fm1, link = "cloglog")
fm4.cct <- update(fm1, link = "cauchit")
anova(fm4.lgt, fm4.prt, fm4.ll, fm4.cll, fm4.cct)

## structured thresholds:
fm5 <- update(fm1, threshold = "symmetric")
fm6 <- update(fm1, threshold = "equidistant")
anova(fm1, fm5, fm6)

## the slice methods:
slice.fm1 <- slice(fm1)
par(mfrow = c(3, 3))
plot(slice.fm1)
## see more at '?slice.clm'

## Another example:
fm.soup <- clm(SURENESS ~ PRODID, data = soup)
summary(fm.soup)

if(require(MASS)) { ## dropterm, addterm, stepAIC, housing
    fm1 <- clm(rating ~ temp * contact, data = wine)
    dropterm(fm1, test = "Chi")
    addterm(fm1, ~.+judge, test = "Chi")
    fm3 <- stepAIC(fm1)
    summary(fm3)

    ## Example from MASS::polr:
    fm1 <- clm(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
    summary(fm1)
}
```

