# QvarUC: Quasi-variances Preprocessing Function In VGAM: Vector Generalized Linear and Additive Models

## Description

Takes a `vglm` fit or a variance-covariance matrix, and preprocesses it for `rcim` and `uninormal` so that quasi-variances can be computed.

## Usage

 ```1 2 3``` ```Qvar(object, factorname = NULL, which.linpred = 1, coef.indices = NULL, labels = NULL, dispersion = NULL, reference.name = "(reference)", estimates = NULL) ```

## Arguments

 `object` A `"vglm"` object or a variance-covariance matrix, e.g., `vcov(vglm.object)`. The former is preferred since it contains all the information needed. If a matrix then `factorname` and/or `coef.indices` should be specified to identify the factor. `which.linpred` A single integer from the set `1:M`. Specifies which linear predictor to use. Let the value of `which.linpred` be called j. Then the factor should appear in that linear predictor, hence the jth row of the constraint matrix corresponding to the factor should have at least one nonzero value. Currently the jth row must have exactly one nonzero value because programming it for more than one nonzero value is difficult. `factorname` Character. If the `vglm` object contains more than one factor as explanatory variable then this argument should be the name of the factor of interest. If `object` is a variance-covariance matrix then this argument should also be specified. `labels` Character. Optional, for labelling the variance-covariance matrix. `dispersion` Numeric. Optional, passed into `vcov()` with the same argument name. `reference.name` Character. Label for for the reference level. `coef.indices` Optional numeric vector of length at least 3 specifying the indices of the factor from the variance-covariance matrix. `estimates` an optional vector of estimated coefficients (redundant if `object` is a model).

## Details

Suppose a factor with L levels is an explanatory variable in a regression model. By default, R treats the first level as baseline so that its coefficient is set to zero. It estimates the other L-1 coefficients, and with its associated standard errors, this is the conventional output. From the complete variance-covariance matrix one can compute L quasi-variances based on all pairwise difference of the coefficients. They are based on an approximation, and can be treated as uncorrelated. In minimizing the relative (not absolute) errors it is not hard to see that the estimation involves a RCIM (`rcim`) with an exponential link function (`explink`).

If `object` is a model, then at least one of `factorname` or `coef.indices` must be non-`NULL`. The value of `coef.indices`, if non-`NULL`, determines which rows and columns of the model's variance-covariance matrix to use. If `coef.indices` contains a zero, an extra row and column are included at the indicated position, to represent the zero variances and covariances associated with a reference level. If `coef.indices` is `NULL`, then `factorname` should be the name of a factor effect in the model, and is used in order to extract the necessary variance-covariance estimates.

Quasi-variances were first implemented in R with qvcalc. This implementation draws heavily from that.

## Value

A L by L matrix whose i-j element is the logarithm of the variance of the ith coefficient minus the jth coefficient, for all values of i and j. The diagonal elements are abitrary and are set to zero.

The matrix has an attribute that corresponds to the prior weight matrix; it is accessed by `uninormal` and replaces the usual `weights` argument. of `vglm`. This weight matrix has ones on the off-diagonals and some small positive number on the diagonals.

## Warning

Negative quasi-variances may occur (one of them and only one), though they are rare in practice. If so then numerical problems may occur. See `qvcalc()` for more information.

## Note

This is an adaptation of `qvcalc()` in qvcalc. It should work for all `vglm` models with one linear predictor, i.e., M = 1. For M > 1 the factor should appear only in one of the linear predictors.

It is important to set `maxit` to be larger than usual for `rcim` since convergence is slow. Upon successful convergence the ith row effect and the ith column effect should be equal. A simple computation involving the fitted and predicted values allows the quasi-variances to be extracted (see example below).

A function to plot comparison intervals has not been written here.

## Author(s)

T. W. Yee, based heavily on `qvcalc()` in qvcalc written by David Firth.

## References

Firth, D. (2003). Overcoming the reference category problem in the presentation of statistical models. Sociological Methodology 33, 1–18.

Firth, D. and de Menezes, R. X. (2004). Quasi-variances. Biometrika 91, 65–80.

Yee, T. W. and Hadi, A. F. (2014). Row-column interaction models, with an R implementation. Computational Statistics, 29, 1427–1445.

`rcim`, `vglm`, `qvar`, `uninormal`, `explink`, `qvcalc()` in qvcalc, `ships`.
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59``` ```# Example 1 data("ships", package = "MASS") Shipmodel <- vglm(incidents ~ type + year + period, poissonff, offset = log(service), # trace = TRUE, model = TRUE, data = ships, subset = (service > 0)) # Easiest form of input fit1 <- rcim(Qvar(Shipmodel, "type"), uninormal("explink"), maxit = 99) qvar(fit1) # Easy method to get the quasi-variances qvar(fit1, se = TRUE) # Easy method to get the quasi-standard errors (quasiVar <- exp(diag(fitted(fit1))) / 2) # Version 1 (quasiVar <- diag(predict(fit1)[, c(TRUE, FALSE)]) / 2) # Version 2 (quasiSE <- sqrt(quasiVar)) # Another form of input fit2 <- rcim(Qvar(Shipmodel, coef.ind = c(0, 2:5), reference.name = "typeA"), uninormal("explink"), maxit = 99) ## Not run: qvplot(fit2, col = "green", lwd = 3, scol = "blue", slwd = 2, las = 1) # The variance-covariance matrix is another form of input (not recommended) fit3 <- rcim(Qvar(cbind(0, rbind(0, vcov(Shipmodel)[2:5, 2:5])), labels = c("typeA", "typeB", "typeC", "typeD", "typeE"), estimates = c(typeA = 0, coef(Shipmodel)[2:5])), uninormal("explink"), maxit = 99) (QuasiVar <- exp(diag(fitted(fit3))) / 2) # Version 1 (QuasiVar <- diag(predict(fit3)[, c(TRUE, FALSE)]) / 2) # Version 2 (QuasiSE <- sqrt(quasiVar)) ## Not run: qvplot(fit3) # Example 2: a model with M > 1 linear predictors ## Not run: require("VGAMdata") xs.nz.f <- subset(xs.nz, sex == "F") xs.nz.f <- subset(xs.nz.f, !is.na(babies) & !is.na(age) & !is.na(ethnicity)) xs.nz.f <- subset(xs.nz.f, ethnicity != "Other") clist <- list("sm.bs(age, df = 4)" = rbind(1, 0), "sm.bs(age, df = 3)" = rbind(0, 1), "ethnicity" = diag(2), "(Intercept)" = diag(2)) fit1 <- vglm(babies ~ sm.bs(age, df = 4) + sm.bs(age, df = 3) + ethnicity, zipoissonff(zero = NULL), xs.nz.f, constraints = clist, trace = TRUE) Fit1 <- rcim(Qvar(fit1, "ethnicity", which.linpred = 1), uninormal("explink", imethod = 1), maxit = 99, trace = TRUE) Fit2 <- rcim(Qvar(fit1, "ethnicity", which.linpred = 2), uninormal("explink", imethod = 1), maxit = 99, trace = TRUE) ## End(Not run) ## Not run: par(mfrow = c(1, 2)) qvplot(Fit1, scol = "blue", pch = 16, main = expression(eta), slwd = 1.5, las = 1, length.arrows = 0.07) qvplot(Fit2, scol = "blue", pch = 16, main = expression(eta), slwd = 1.5, las = 1, length.arrows = 0.07) ## End(Not run) ```