A reduced-rank vector generalized linear model (RR-VGLM) is fitted. RR-VGLMs are VGLMs but some of the constraint matrices are estimated. In this documentation, M is the number of linear predictors.
1 2 3 4 5 6 7
rrvglm(formula, family = stop("argument 'family' needs to be assigned"), data = list(), weights = NULL, subset = NULL, na.action = na.fail, etastart = NULL, mustart = NULL, coefstart = NULL, control = rrvglm.control(...), offset = NULL, method = "rrvglm.fit", model = FALSE, x.arg = TRUE, y.arg = TRUE, contrasts = NULL, constraints = NULL, extra = NULL, qr.arg = FALSE, smart = TRUE, ...)
an optional data frame containing the variables in the model.
By default the variables are taken from
a list of parameters for controlling the fitting process.
the method to be used in fitting the model.
The default (and presently only) method
logical values indicating whether
the model matrix and response vector/matrix used in the fitting
process should be assigned in the
further arguments passed into
The central formula is given by
eta = B_1^T x_1 + A nu
where x1 is a vector (usually just a 1 for an intercept), x2 is another vector of explanatory variables, and nu = C^T x_2 is an R-vector of latent variables. Here, eta is a vector of linear predictors, e.g., the mth element is eta_m = log(E[Y_m]) for the mth Poisson response. The matrices B_1, A and C are estimated from the data, i.e., contain the regression coefficients. For ecologists, the central formula represents a constrained linear ordination (CLO) since it is linear in the latent variables. It means that the response is a monotonically increasing or decreasing function of the latent variables.
For identifiability it is common to enforce corner constraints on A: by default, the top R by R submatrix is fixed to be the order-R identity matrix and the remainder of A is estimated.
The underlying algorithm of RR-VGLMs is iteratively reweighted least squares (IRLS) with an optimizing algorithm applied within each IRLS iteration (e.g., alternating algorithm).
In theory, any VGAM family function that works for
vgam should work
The function that actually does the work is
vglm.fit with some extra code.
An object of class
"rrvglm", which has the the same
slots as a
"vglm" object. The only difference is
that the some of the constraint matrices are estimates
rather than known. But VGAM stores the models the
same internally. The slots of
"vglm" objects are
The arguments of
rrvglm are in general the same as
vglm but with some extras in
The smart prediction (
is packed with the VGAM library.
In an example below, a rank-1 stereotype
model of Anderson (1984) is fitted to some car data.
The reduced-rank regression is performed, adjusting for
two covariates. Setting a trivial constraint matrix
for the latent variable variables in x2 avoids
a warning message when it is overwritten by a (common)
estimated constraint matrix. It shows that German cars
tend to be more expensive than American cars, given a
car of fixed weight and width.
fit <- rrvglm(..., data = mydata) then
summary(fit) requires corner constraints and no
missing values in
mydata. Often the estimated
variance-covariance matrix of the parameters is not
positive-definite; if this occurs, try refitting the
model with a different value for
For constrained quadratic ordination (CQO) see
cqo for more details about QRR-VGLMs.
With multiple binary responses, one must use
binomialff(multiple.responses = TRUE) to indicate
that the response is a matrix with one response per column.
Otherwise, it is interpreted as a single binary response
Thomas W. Yee
Yee, T. W. and Hastie, T. J. (2003). Reduced-rank vector generalized linear models. Statistical Modelling, 3, 15–41.
Yee, T. W. (2004). A new technique for maximum-likelihood canonical Gaussian ordination. Ecological Monographs, 74, 685–701.
Anderson, J. A. (1984). Regression and ordered categorical variables. Journal of the Royal Statistical Society, Series B, Methodological, 46, 1–30.
Yee, T. W. (2014). Reduced-rank vector generalized linear models with two linear predictors. Computational Statistics and Data Analysis, 71, 889–902.
Special family functions include
(see Yee (2014) and COZIGAM).
Methods functions include
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59
## Not run: # Example 1: RR negative binomial with Var(Y) = mu + delta1 * mu^delta2 nn <- 1000 # Number of observations delta1 <- 3.0 # Specify this delta2 <- 1.5 # Specify this; should be greater than unity a21 <- 2 - delta2 mydata <- data.frame(x2 = runif(nn), x3 = runif(nn)) mydata <- transform(mydata, mu = exp(2 + 3 * x2 + 0 * x3)) mydata <- transform(mydata, y2 = rnbinom(nn, mu = mu, size = (1/delta1)*mu^a21)) plot(y2 ~ x2, data = mydata, pch = "+", col = 'blue', las = 1, main = paste("Var(Y) = mu + ", delta1, " * mu^", delta2, sep = "")) rrnb2 <- rrvglm(y2 ~ x2 + x3, negbinomial(zero = NULL), data = mydata, trace = TRUE) a21.hat <- (Coef(rrnb2)@A)["loglink(size)", 1] beta11.hat <- Coef(rrnb2)@B1["(Intercept)", "loglink(mu)"] beta21.hat <- Coef(rrnb2)@B1["(Intercept)", "loglink(size)"] (delta1.hat <- exp(a21.hat * beta11.hat - beta21.hat)) (delta2.hat <- 2 - a21.hat) # exp(a21.hat * predict(rrnb2)[1,1] - predict(rrnb2)[1,2]) # delta1.hat summary(rrnb2) # Obtain a 95 percent confidence interval for delta2: se.a21.hat <- sqrt(vcov(rrnb2)["I(latvar.mat)", "I(latvar.mat)"]) ci.a21 <- a21.hat + c(-1, 1) * 1.96 * se.a21.hat (ci.delta2 <- 2 - rev(ci.a21)) # The 95 percent confidence interval Confint.rrnb(rrnb2) # Quick way to get it # Plot the abundances and fitted values against the latent variable plot(y2 ~ latvar(rrnb2), data = mydata, col = "blue", xlab = "Latent variable", las = 1) ooo <- order(latvar(rrnb2)) lines(fitted(rrnb2)[ooo] ~ latvar(rrnb2)[ooo], col = "orange") # Example 2: stereotype model (reduced-rank multinomial logit model) data(car.all) scar <- subset(car.all, is.element(Country, c("Germany", "USA", "Japan", "Korea"))) fcols <- c(13,14,18:20,22:26,29:31,33,34,36) # These are factors scar[, -fcols] <- scale(scar[, -fcols]) # Standardize all numerical vars ones <- matrix(1, 3, 1) clist <- list("(Intercept)" = diag(3), Width = ones, Weight = ones, Disp. = diag(3), Tank = diag(3), Price = diag(3), Frt.Leg.Room = diag(3)) set.seed(111) fit <- rrvglm(Country ~ Width + Weight + Disp. + Tank + Price + Frt.Leg.Room, multinomial, data = scar, Rank = 2, trace = TRUE, constraints = clist, noRRR = ~ 1 + Width + Weight, Uncor = TRUE, Corner = FALSE, Bestof = 2) fit@misc$deviance # A history of the fits Coef(fit) biplot(fit, chull = TRUE, scores = TRUE, clty = 2, Ccex = 2, ccol = "blue", scol = "orange", Ccol = "darkgreen", Clwd = 2, main = "1=Germany, 2=Japan, 3=Korea, 4=USA") ## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.